Monthly Precipitation in De Bilt

This entry is part 8 of 10 in the series NumPy Weather

Overexposure to dihydrogen monoxide is one of the top 10 causes of death worldwide. Let’s take a look at the De Bilt precipitation data in 0.1 mm from the KNMI. They are using the silly convention again of -1 representing low values. I am again going to set those values to 0.

Imports

We will import the NumPy (line 1) module, masked arrays NumPy module (line 2) and Matplotlib (line 3).

1
2
3
4
5
6
import numpy as np
import numpy.ma as ma
import matplotlib.pyplot as plt
import sys
from datetime import datetime as dt
import calendar as cal

Loading the Data

We will load (line 3) the dates converted to months (line 2), rain amounts and rain duration in hours into NumPy arrays. Again missing values needed to be converted (line 1) into NaNs (not a number). We then create masked arrays for the NumPy arrays with missing values (lines 9 and 12).

1
2
3
4
5
6
7
8
9
10
11
12
to_float = lambda x: float(x.strip() or np.nan)
to_month = lambda x: dt.strptime(x, "%Y%m%d").month
months, duration, rain = np.loadtxt(sys.argv[1], delimiter=',', usecols=(1, 21, 22), unpack=True, converters={1: to_month, 21: to_float, 22: to_float})
 
# Remove -1 values
rain[rain == -1] = 0
 
# Measurements are in .1 mm 
rain = .1 * ma.masked_invalid(rain)
 
# Measurements are in .1 hours 
duration = .1 * ma.masked_invalid(duration)

Statistics

We can calculate some trivial statistics such as:

  • Minimum (line 2)
  • Maximum (line 2)
  • Mean (line 3)
  • Standard Deviation (line 3)
  • Correlation with precipitation duration (line 6)

The last part is a bit tricky, because we need to match valid values. The values for a certain date of both precipitation and precipitation duration have to be valid. Luckily this is pretty easy by defining a boolean condition (line 5) for the masks of the arrays. For some reason the character for AND is not accepted here. It’s because of WordPress or some plugin. I am just telling you that in case you are copy pasting the code. I think I will put the code for the NumPy Weather examples in my GitHub repository as soon as possible.

1
2
3
4
5
6
print "# Rain values", len(rain.compressed())
print "Min Rain hours ", rain.min(), "Max Rain hours", rain.max()
print "Average", rain.mean(), "Std. Dev", rain.std()
 
mask = ~duration.mask & ~rain.mask
print "Correlation with duration", np.corrcoef(duration[mask], rain[mask])[0][1]

Which prints the following values below.

# Rain values 39139
Min Rain hours  0.0 Max Rain hours 62.3
Average 2.17747770766 Std. Dev 4.33715191714
Correlation with duration 0.779006349536

The correlation of precipitation amount with rain duration is not very strong, but still it is the strongest correlation we have seen in this series so far. I am convinced that both variables have been measured independently unlike sunshine duration which is derived from global radiation.

Averaging

We can average like we did yesterday. Nothing new here.

1
2
3
4
5
6
monthly_rain = []
month_range = np.arange(int(months.min()), int(months.max()))
 
for month in month_range:
   indices = np.where(month == months)
   monthly_rain.append(rain[indices].mean())

Plotting

We will  plot a histogram (line 3) of precipitation amount and monthly precipitation bar chart (line 7).

1
2
3
4
5
6
7
8
9
10
plt.subplot(211)
plt.title("Precipitation Histogram")
plt.hist(rain.compressed(), 200)
 
ax = plt.subplot(212)
plt.title("Monthly Precipitation")
plt.bar(month_range, monthly_rain)
ax.set_xticklabels(cal.month_abbr[::2])
ax.set_ylabel('mm')
plt.show()

The following plots below are produced. Nothing shocking. As you can see low precipitation values are much more likely.

 photo precipitation_zps7978fcbb.png

The complete code is given below. By the way it has come to my attention that some people have trouble finding the code for my NumPy books. I assure you, you should be able to download it at the Packt Publishing support website. Somewhere in the front matter of the books there should be a URL for the support files. If you still need support or want to report errata, please use the corresponding e-mail addresses given in the books.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
import numpy as np
import numpy.ma as ma
import matplotlib.pyplot as plt
import sys
from datetime import datetime as dt
import calendar as cal
 
to_float = lambda x: float(x.strip() or np.nan)
to_month = lambda x: dt.strptime(x, "%Y%m%d").month
months, duration, rain = np.loadtxt(sys.argv[1], delimiter=',', usecols=(1, 21, 22), unpack=True, converters={1: to_month, 21: to_float, 22: to_float})
 
# Remove -1 values
rain[rain == -1] = 0
 
# Measurements are in .1 mm 
rain = .1 * ma.masked_invalid(rain)
 
# Measurements are in .1 hours 
duration = .1 * ma.masked_invalid(duration)
 
print "# Rain values", len(rain.compressed())
print "Min Rain hours ", rain.min(), "Max Rain hours", rain.max()
print "Average", rain.mean(), "Std. Dev", rain.std()
 
mask = ~duration.mask & ~rain.mask
print "Correlation with duration", np.corrcoef(duration[mask], rain[mask])[0][1]
 
monthly_rain = []
month_range = np.arange(int(months.min()), int(months.max()))
 
for month in month_range:
   indices = np.where(month == months)
   monthly_rain.append(rain[indices].mean())
 
plt.subplot(211)
plt.title("Precipitation Histogram")
plt.hist(rain.compressed(), 200)
 
ax = plt.subplot(212)
plt.title("Monthly Precipitation")
plt.bar(month_range, monthly_rain)
ax.set_xticklabels(cal.month_abbr[::2])
ax.set_ylabel('mm')
plt.show()

Posts for April 8, 2013

http://storify.com/inningPalmer/posts-for-april-8-2013

Series NavigationPrecipitation and Sunshine DurationAtmospheric pressure in De Bilt
By the author of NumPy Beginner's Guide, NumPy Cookbook and Instant Pygame. If you enjoyed this post, please consider leaving a comment or subscribing to the RSS feed to have future articles delivered to your feed reader.
Share
This entry was posted in programming and tagged , . Bookmark the permalink.