HW#1: Extreme Rainfall Deficit in Singapore
Objectives
This homework will help you better understand how to:
Fit Generalized Extreme Value (GEV) distribution
Estimate the return level of extreme rainfall deficit
Happy coding!
Submission Guide
Deadline: Sunday 11:59 pm, 2nd November 2025 (Note: Late submissions will not be accepted).
Please upload your solutions to Canvas in a Jupyter Notebook format with the name “Homework1_StudentID.ipynb”. Make sure to write down your student ID and full name in the cell below.
For any questions, feel free to contact Prof. Xiaogang HE (hexg@nus.edu.sg), Kewei ZHANG (kewei_zhang@u.nus.edu) or Yifan LU (yifan_lu@u.nus.edu).
## Fill in your student ID and full name below.
# Student ID:
# Full name:
Data: You will need the historical (1981-2020) daily total rainfall at Singapore’s Changi station for this homework. You can create a Pandas DataFrame by reading the file “../../assets/data/Changi_daily_rainfall.csv”.
Q1: Calculate daily rainfall statistics (10 marks)
Calculate the following statistics for daily rainfall during DJF (December-January-February): (i) mean, (ii) variance, (iii) skewness, and (iv) kurtosis.
Hint:
You can filter the daily rainfall time series for DJF using Pandas’ boolean filtering method. Details on filtering values can be found in the Pandas tutorial.
DJF spans two calendar years. Make sure you only include complete DJF seasons. For the period 1981 to 2020, this results in 39 complete DJF seasons, from DJF 1981-1982 to DJF 2019-2020.
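The filtering and season bookkeeping described in the hints can be sketched as follows. This is only an illustration on synthetic random values, not the Changi record: the series name `rain_mm`, the gamma-distributed values, and the convention of labelling a season by the year of its December are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the daily record (random values, NOT real data).
dates = pd.date_range("1981-01-01", "2020-12-31", freq="D")
rng = np.random.default_rng(0)
rain = pd.Series(rng.gamma(shape=0.4, scale=15.0, size=len(dates)),
                 index=dates, name="rain_mm")

# Boolean filtering: keep only December, January and February.
djf = rain[rain.index.month.isin([12, 1, 2])]

# Label each day by the year in which its season starts
# (Dec 1981 -> 1981, Jan 1982 -> 1981, Feb 1982 -> 1981).
season = pd.Series(
    np.where(djf.index.month == 12, djf.index.year, djf.index.year - 1),
    index=djf.index, name="season")

# Drop the incomplete seasons: Jan-Feb 1981 (label 1980) and Dec 2020 (label 2020).
complete = (season >= 1981) & (season <= 2019)
djf, season = djf[complete], season[complete]
n_seasons = season.nunique()

# Pandas defaults: var() uses ddof=1, skew() and kurt() are bias-corrected,
# and kurt() reports excess kurtosis (normal distribution -> 0).
stats = {"mean": djf.mean(), "variance": djf.var(),
         "skewness": djf.skew(), "kurtosis": djf.kurt()}
print(n_seasons, stats)
```

If you need plain (non-excess) kurtosis or population variance, state which definition you report, since library defaults differ.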
# Your solutions go here.
# Use the + icon in the toolbar to add a cell.
Q2: Preprocess daily rainfall data (20 marks)
Find the seasonal maximum rainfall deficit for DJF, based on the 15-day centered moving average rainfall deficit.
To do this:
Compute the 15-day centered moving average of daily rainfall. (10 marks)
Calculate the daily rainfall deficit as the mean rainfall from Q1 minus the 15-day moving average rainfall (deficit = mean - moving average). This deficit series will also be used in Q6. (5 marks)
For each DJF season, take the maximum deficit (one value per season). This yields 39 seasonal maxima that are used for Q3–Q5 (block-maxima approach). (5 marks)
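The three steps above can be sketched on synthetic data as follows. The random rainfall values, the variable names, and the choice to compute the moving average on the full daily record before selecting DJF days (so that days at the season edges still have a full 15-day window) are assumptions of this example, not requirements stated in the assignment.

```python
import numpy as np
import pandas as pd

# Synthetic daily rainfall (random values, NOT real data).
dates = pd.date_range("1981-01-01", "2020-12-31", freq="D")
rng = np.random.default_rng(1)
rain = pd.Series(rng.gamma(0.4, 15.0, size=len(dates)), index=dates)

# Step 1: 15-day centered moving average (7 days either side of each day).
ma15 = rain.rolling(window=15, center=True).mean()

# Select days in complete DJF seasons, labelled by the year of their December.
season = pd.Series(
    np.where(rain.index.month == 12, rain.index.year, rain.index.year - 1),
    index=rain.index)
is_djf = rain.index.month.isin([12, 1, 2])
keep = is_djf & (season >= 1981).to_numpy() & (season <= 2019).to_numpy()

# Step 2: deficit = DJF mean daily rainfall minus the moving average.
djf_mean = rain[keep].mean()
deficit = djf_mean - ma15[keep]

# Step 3: block maxima, one maximum deficit per DJF season.
seasonal_max = deficit.groupby(season[keep]).max()
print(seasonal_max.head())
```

With this sign convention a large positive deficit marks a dry spell, so the seasonal maximum picks out the driest 15-day window of each season.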
# Your solutions go here.
# Use the + icon in the toolbar to add a cell.
Q3: Fit the GEV distribution (20 marks)
Fit a GEV distribution to the time series of seasonal maximum rainfall deficits. To do this, estimate the GEV parameters using (i) Maximum Likelihood and (ii) L-Moments, respectively. (Details on fitting a GEV distribution can be found in the SciPy tutorial.)
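As a rough sketch of the two estimation routes, the example below fits a synthetic sample by Maximum Likelihood with `scipy.stats.genextreme.fit` and by L-moments with a hand-written helper. The helper `gev_lmom_fit` is hypothetical (it is not a SciPy function) and uses Hosking's approximation for the shape parameter; the course tutorial may use a different L-moments routine. Note that SciPy's shape parameter `c` has the opposite sign to the ξ used in many textbooks.

```python
from math import gamma, log

import numpy as np
from scipy.stats import genextreme


def gev_lmom_fit(x):
    """Hypothetical helper: GEV L-moment estimates as (c, loc, scale), SciPy sign convention."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    j = np.arange(n)  # 0-based rank of each ordered value
    # Unbiased probability-weighted moments b0, b1, b2.
    b0 = x.mean()
    b1 = np.sum(j / (n - 1) * x) / n
    b2 = np.sum(j * (j - 1) / ((n - 1) * (n - 2)) * x) / n
    # First three L-moments and the L-skewness t3.
    l1, l2, l3 = b0, 2 * b1 - b0, 6 * b2 - 6 * b1 + b0
    t3 = l3 / l2
    # Hosking's approximation for the shape (breaks down if k is exactly 0).
    z = 2.0 / (3.0 + t3) - log(2) / log(3)
    k = 7.8590 * z + 2.9554 * z**2
    scale = l2 * k / ((1 - 2.0 ** (-k)) * gamma(1 + k))
    loc = l1 - scale * (1 - gamma(1 + k)) / k
    return k, loc, scale


# Synthetic block maxima (random values, NOT the homework data).
sample = genextreme.rvs(0.1, loc=10, scale=3, size=39, random_state=2)

c_mle, loc_mle, scale_mle = genextreme.fit(sample)  # (i) Maximum Likelihood
c_lm, loc_lm, scale_lm = gev_lmom_fit(sample)       # (ii) L-moments
print("MLE      :", c_mle, loc_mle, scale_mle)
print("L-moments:", c_lm, loc_lm, scale_lm)
```

With only 39 values the two methods can give noticeably different shape estimates, which is worth commenting on in your answer.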
# Your solutions go here.
# Use the + icon in the toolbar to add a cell.
Q4: Estimate the return level of the extreme events (20 marks)
Using the GEV parameters estimated with L-Moments in Q3, estimate the rainfall deficit for events with return periods of 10 years, 50 years, 100 years, and 1000 years.
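A minimal sketch of the return-level calculation, assuming one block maximum per year so that the T-year event has non-exceedance probability 1 - 1/T. The parameter values below are placeholders chosen for illustration, not estimates from the Changi data.

```python
import numpy as np
from scipy.stats import genextreme

# Placeholder GEV parameters in SciPy's convention (NOT fitted values).
c, loc, scale = 0.1, 10.0, 3.0

return_periods = np.array([10, 50, 100, 1000])

# Return level = quantile at non-exceedance probability 1 - 1/T.
levels = genextreme.ppf(1 - 1 / return_periods, c, loc=loc, scale=scale)

# Same quantity from the closed-form GEV quantile function (valid for c != 0).
closed_form = loc + scale / c * (1 - (-np.log(1 - 1 / return_periods)) ** c)

for T, x in zip(return_periods, levels):
    print(f"T = {T:>4d} years: return level = {x:.2f}")
```

Keep in mind that the 1000-year level extrapolates far beyond a 39-season record, so its uncertainty is large.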
# Your solutions go here.
# Use the + icon in the toolbar to add a cell.
Q5: Test the goodness-of-fit (20 marks)
In this task, you will compare how different distributions fit the same dataset and interpret the results using statistical analyses.
Repeat the distribution fitting as in Q3, but this time using a normal distribution and the Maximum Likelihood method. (5 marks)
Use the Kolmogorov-Smirnov (KS) test to evaluate the goodness-of-fit for both the normal distribution and the GEV distribution you obtained in Q3. (Details on the KS test can be found in the SciPy tutorial.) (5 marks)
Based on the KS test results, discuss how well each distribution (Normal and GEV) fits the data. (10 marks)
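The fit-then-test workflow can be sketched with `scipy.stats.kstest`, here on a synthetic sample and with Maximum Likelihood fits for both distributions (an assumption of this example; you may test your L-moments GEV fit instead). One caveat to keep in mind when interpreting the output: the standard KS p-value assumes the distribution parameters were not estimated from the same sample, so p-values obtained this way tend to be optimistic.

```python
import numpy as np
from scipy.stats import genextreme, kstest, norm

# Synthetic seasonal maxima (random values, NOT the homework data).
sample = genextreme.rvs(0.1, loc=10, scale=3, size=39, random_state=3)

# Maximum Likelihood fits of both candidate distributions.
mu, sigma = norm.fit(sample)
c, loc, scale = genextreme.fit(sample)

# One-sample KS tests against each fitted distribution.
ks_norm = kstest(sample, "norm", args=(mu, sigma))
ks_gev = kstest(sample, "genextreme", args=(c, loc, scale))

print(f"Normal: D = {ks_norm.statistic:.3f}, p = {ks_norm.pvalue:.3f}")
print(f"GEV   : D = {ks_gev.statistic:.3f}, p = {ks_gev.pvalue:.3f}")
```

A smaller statistic D means the fitted CDF stays closer to the empirical CDF; a large p-value means the test does not reject the distribution, which is weaker than confirming it.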
Bonus (5 marks):
Plot the CDF (Cumulative Distribution Function) to visually compare the fitted normal distribution, the GEV distribution, and the empirical distribution derived from the data. Compare the behavior of the two distributions at different return periods. Are the KS statistic results consistent with your observations from the CDF plot?
Hint: You can reuse the empirical distribution estimation and CDF plotting code from the SciPy tutorial.
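For the bonus comparison, the quantities to plot can be prepared as in the sketch below. It uses a synthetic sample and the Weibull plotting position i/(n+1) for the empirical CDF; that plotting position is an assumption of this example, and the tutorial code may use a different one. The drawing itself is left out: the arrays can be passed to any plotting library.

```python
import numpy as np
from scipy.stats import genextreme, norm

# Synthetic seasonal maxima (random values, NOT the homework data).
sample = genextreme.rvs(0.1, loc=10, scale=3, size=39, random_state=4)

# Empirical CDF with the Weibull plotting position i / (n + 1).
x = np.sort(sample)
n = len(x)
p_emp = np.arange(1, n + 1) / (n + 1)

# Fitted CDFs evaluated at the sorted observations.
mu, sigma = norm.fit(sample)
c, loc, scale = genextreme.fit(sample)
cdf_norm = norm.cdf(x, mu, sigma)
cdf_gev = genextreme.cdf(x, c, loc=loc, scale=scale)

# Empirical return period attached to each observation.
T_emp = 1 / (1 - p_emp)

print("largest gap, normal:", np.max(np.abs(cdf_norm - p_emp)))
print("largest gap, GEV   :", np.max(np.abs(cdf_gev - p_emp)))
```

Plotting `x` against `p_emp`, `cdf_norm` and `cdf_gev` on the same axes makes differences in the upper tail (long return periods) easy to see.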
# Your solutions go here.
# Use the + icon in the toolbar to add a cell.
Q6: Simple peaks over threshold & histogram (10 marks)
Using the same 15-day centered moving-average deficit obtained in Q2:
Compute the 95th percentile of the pooled DJF moving-average deficit and set this as the threshold \(u\). Report the numeric value of \(u\). (5 marks)
Plot a histogram of the seasonal exceedance counts \(k_s\) across the 39 DJF seasons, with a 1-2 line caption describing the distribution. Day \(i\) counts as an exceedance if its deficit \(x_i\) is larger than the threshold (\(x_i-u>0\)). (5 marks)
Bonus (5 marks):
On the same graph, also show a line plot of the seasonal mean exceedance magnitude \(y_s = \frac{1}{k_s}\sum_i \max\{x_i-u,0\}\), where the sum runs over the days \(i\) in season \(s\).
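The threshold, the seasonal counts \(k_s\) and the seasonal mean exceedance \(y_s\) can be computed along the lines of the sketch below. The deficit series here is synthetic (normally distributed random values standing in for the Q2 output), and leaving \(y_s\) undefined (NaN) for seasons without any exceedance is a choice made for this example.

```python
import numpy as np
import pandas as pd

# Synthetic DJF deficit series (random values, NOT the Q2 result).
dates = pd.date_range("1981-12-01", "2020-02-29", freq="D")
dates = dates[dates.month.isin([12, 1, 2])]
rng = np.random.default_rng(5)
deficit = pd.Series(rng.normal(0.0, 3.0, size=len(dates)), index=dates)
season = pd.Series(
    np.where(dates.month == 12, dates.year, dates.year - 1), index=dates)

# Threshold u: 95th percentile of the pooled DJF deficits.
u = deficit.quantile(0.95)

# k_s: number of days per season with x_i - u > 0.
k_s = (deficit > u).groupby(season).sum()

# y_s: mean exceedance magnitude per season; NaN where k_s == 0.
excess = (deficit - u).clip(lower=0)
y_s = excess.groupby(season).sum() / k_s.where(k_s > 0)

# Histogram data: how many seasons had 0, 1, 2, ... exceedance days.
seasons_per_count = np.bincount(k_s.to_numpy())
print(f"u = {u:.2f}")
print(seasons_per_count)
```

`seasons_per_count` holds the bar heights for the histogram, while `y_s` indexed by season gives the points for the bonus line plot.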
# Your solutions go here.
# Use the + icon in the toolbar to add a cell.