{
"cells": [
{
"cell_type": "markdown",
"id": "5202ee2c",
"metadata": {},
"source": [
"# Pandas Exercise"
]
},
{
"cell_type": "markdown",
"id": "4cf0b058",
"metadata": {},
"source": [
"Create a DataFrame by reading the file *../../assets/data/Changi_daily_rainfall.csv*. Perform the following two tasks with this DataFrame."
]
},
{
"cell_type": "markdown",
"id": "fe90719f",
"metadata": {},
"source": [
"## Task 1\n",
"1. Determine whether the dataframe contains `NaN` values. If so, fill the `NaN` values with the previous day's value.\n",
"2. Show the first 5 rows and the last 3 rows. Check the data length.\n",
"3. For each year, calculate the frequency (i.e., number of days) of rainfall events whose magnitude exceeds 0.1 mm/day.\n",
"4. Find the three months with the highest/lowest long-term averaged monthly rainfall. Are these months within your expection (e.g., Singapore's monsoon seasons)? \n",
"\n",
"\n",
"Note: If you want to do this task on your local computer (not recommended), please modify the file path accordingly."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "3ae47cdd",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"0"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# 1\n",
"import pandas as pd\n",
"df = pd.read_csv('../../assets/data/Changi_daily_rainfall.csv', index_col=0, parse_dates=True)\n",
"df.isnull().sum().values[0] # no NaN value"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "ae3078f1",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"the first 5 rows:\n",
" Daily Rainfall Total (mm)\n",
"Date \n",
"1981-01-01 0.0\n",
"1981-01-02 0.0\n",
"1981-01-03 0.0\n",
"1981-01-04 0.0\n",
"1981-01-05 0.0 \n",
"==========\n",
"the tail 3 rows:\n",
" Daily Rainfall Total (mm)\n",
"Date \n",
"2020-12-29 3.2\n",
"2020-12-30 0.0\n",
"2020-12-31 0.0 \n",
"==========\n",
"total length of data: 14610\n"
]
}
],
"source": [
"# 2\n",
"print('the first 5 rows:\\n', df.head(5),'\\n==========')\n",
"print('the tail 3 rows:\\n', df.tail(3),'\\n==========')\n",
"print('total length of data:', df.shape[0])"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "b8fc7b7b",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"
\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Rainy days | \n",
"
\n",
" \n",
" Date | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" 1981-12-31 | \n",
" 142 | \n",
"
\n",
" \n",
" 1982-12-31 | \n",
" 130 | \n",
"
\n",
" \n",
" 1983-12-31 | \n",
" 138 | \n",
"
\n",
" \n",
" 1984-12-31 | \n",
" 191 | \n",
"
\n",
" \n",
" 1985-12-31 | \n",
" 181 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Rainy days\n",
"Date \n",
"1981-12-31 142\n",
"1982-12-31 130\n",
"1983-12-31 138\n",
"1984-12-31 191\n",
"1985-12-31 181"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# 3\n",
"df_days = df[df>0.1].dropna().resample('Y').apply(lambda x:x.shape[0])\n",
"df_days.columns = ['Rainy days'] \n",
"df_days.head(5)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "a2566f86",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Monthly rainfall (mm) | \n",
"
\n",
" \n",
" \n",
" \n",
" 12 | \n",
" 317.7400 | \n",
"
\n",
" \n",
" 11 | \n",
" 250.0900 | \n",
"
\n",
" \n",
" 1 | \n",
" 221.7825 | \n",
"
\n",
" \n",
" 5 | \n",
" 160.2500 | \n",
"
\n",
" \n",
" 3 | \n",
" 158.8750 | \n",
"
\n",
" \n",
" 10 | \n",
" 156.9800 | \n",
"
\n",
" \n",
" 4 | \n",
" 151.6450 | \n",
"
\n",
" \n",
" 7 | \n",
" 146.4150 | \n",
"
\n",
" \n",
" 9 | \n",
" 144.8475 | \n",
"
\n",
" \n",
" 8 | \n",
" 143.3525 | \n",
"
\n",
" \n",
" 6 | \n",
" 131.2050 | \n",
"
\n",
" \n",
" 2 | \n",
" 106.6425 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Monthly rainfall (mm)\n",
"12 317.7400\n",
"11 250.0900\n",
"1 221.7825\n",
"5 160.2500\n",
"3 158.8750\n",
"10 156.9800\n",
"4 151.6450\n",
"7 146.4150\n",
"9 144.8475\n",
"8 143.3525\n",
"6 131.2050\n",
"2 106.6425"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# 4\n",
"df_monthly = df.resample('M').sum()\n",
"df_monthly.index = [ts.month for ts in df_monthly.index]\n",
"df_month_mean = df_monthly.groupby(level=0).mean()\n",
"df_month_mean.columns = ['Monthly rainfall (mm)']\n",
"df_month_mean.sort_values(by=['Monthly rainfall (mm)'],ascending=False) \n",
"# Nov, Dec and Jan have the highest average monthly rainfall.\n",
"# Feb, Jun, Aug have lowest average monthly rainfall."
]
},
{
"cell_type": "markdown",
"id": "69ba12e4",
"metadata": {},
"source": [
"## Task 2\n",
"1. Resample the DataFrame from daily time scale to annual time scale. Calculate the annual total rainfall for each year.\n",
"2. Compare the long-term averaged annual rainfall in two periods: 1981-2000 and 2001-2020. \n",
"3. Find the year with the largest/smallest annual rainfall and print their values.\n",
"4. Calculate the trend of annual rainfall during 1981-2020 using the following formula:"
]
},
{
"cell_type": "markdown",
"id": "71235b80",
"metadata": {},
"source": [
"$$a = \\frac{n\\sum{(xy)}-\\sum{x}\\sum{y}}{n\\sum{x^2}-(\\sum{x})^2}$$ "
]
},
{
"cell_type": "markdown",
"id": "c531d004",
"metadata": {},
"source": [
"where $a$ is the trend of annual rainfall, $x$ is year, $y$ is annual rainfall, and $n$ is the number of years."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "2b40dc9f",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" Yearly Rainfall Total (mm) | \n",
"
\n",
" \n",
" \n",
" \n",
" 1981 | \n",
" 1336.3 | \n",
"
\n",
" \n",
" 1982 | \n",
" 1581.7 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" Yearly Rainfall Total (mm)\n",
"1981 1336.3\n",
"1982 1581.7"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# 1\n",
"df_yearly = df.resample('Y').sum()\n",
"df_yearly.columns = ['Yearly Rainfall Total (mm)']\n",
"df_yearly.index = [ts.year for ts in df_yearly.index]\n",
"df_yearly.head(2)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "af867f28",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Averaged annual rainfall during 1981-2000: 2071.55 mm\n",
" Averaged annual rainfall during 1981-2000: 2108.10 mm\n",
" Averaged annual rainfall increase during\n",
" 2001-2020 relative to 1981-2000: 36.54 mm\n"
]
}
],
"source": [
"# 2\n",
"rainfall_1981_2000 = df_yearly.loc[1981:2000].mean()\n",
"rainfall_2001_2020 = df_yearly.loc[2001:].mean()\n",
"\n",
"print('Averaged annual rainfall during 1981-2000: %.2f mm\\n'%rainfall_1981_2000.values[0], \n",
" 'Averaged annual rainfall during 1981-2000: %.2f mm\\n'%rainfall_2001_2020.values[0], \n",
" 'Averaged annual rainfall increase during\\n 2001-2020 relative to 1981-2000: %.2f mm'%\n",
" (rainfall_2001_2020-rainfall_1981_2000).values[0])"
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "43ac54d1",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The year with smallest annual rainfall: 1997\n",
"The year with largest annual rainfall: 2007\n"
]
}
],
"source": [
"# 3\n",
"max_rainfall_year = df_yearly[df_yearly['Yearly Rainfall Total (mm)']==df_yearly.max().values[0]]\n",
"min_rainfall_year = df_yearly[df_yearly['Yearly Rainfall Total (mm)']==df_yearly.min().values[0]]\n",
"print('The year with smallest annual rainfall:', min_rainfall_year.index[0])\n",
"print('The year with largest annual rainfall:', max_rainfall_year.index[0])"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "d0ed2ade",
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Trend: -1.37 mm/year\n"
]
}
],
"source": [
"# 4\n",
"x = df_yearly.index.values\n",
"y = df_yearly['Yearly Rainfall Total (mm)'].values\n",
"n = len(x)\n",
"a = (n*sum(x*y) - sum(x)*sum(y))/(n*sum(x**2)-sum(x)**2)\n",
"print('Trend: %.2f mm/year'%a)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.7"
}
},
"nbformat": 4,
"nbformat_minor": 5
}