Precipitation and Sunshine Duration

This entry is part 7 of 10 in the series NumPy Weather

Dihydrogen Monoxide is a dangerous liquid, therefore exposure to it has to be kept to a minimum at all times. The KNMI De Bilt data file has a column containing precipitation duration values in 0.1 hours. Presumably this was timed with stopwatches or similar devices. The sunshine duration also given in 0.1 hours is derived from global radiation values. Notice the use of the word “global” and not “solar”. Hence there are other sources of radiation taken into account here, but unfortunately I can’t tell you the details. Today we will plot a histogram of precipitation duration values. However, we will omit the days that no dihydrogen monoxide fell, because there are so many dry days that it skews the full overall picture. We will also display the monthly average precipitation and sunshine durations.

Imports

We need to import the same modules we imported the previous time – NumPy (line 1), the NumPy masked arrays (line 2) module and Matplotlib (line 3).

1
2
3
4
5
6
import numpy as np
import numpy.ma as ma
import matplotlib.pyplot as plt
import sys
from datetime import datetime as dt
import calendar as cal

Loading the Data

We will load the  dates converted into months (line 2), sunshine and precipitation duration into NumPy arrays. Again we convert missing values (line 1) to NaN (not a number).

1
2
3
to_float = lambda x: float(x.strip() or np.nan)
to_month = lambda x: dt.strptime(x, "%Y%m%d").month
months, sun_hours, rain_hours = np.loadtxt(sys.argv[1], delimiter=',', usecols=(1, 18, 21), unpack=True, converters={1: to_month, 18: to_float, 21: to_float})

Statistics

Before calculating basic statistics for the precipitation duration, we will create masked arrays for the sunshine and rain duration. There is a minor detail to take care of. Low values of sunshine duration are written down as -1 for some reason. I decided to convert those values to 0. It might have been better to completely ignore them.

1
2
3
4
5
6
7
8
9
10
11
12
# Measurements are in .1 hours 
rain_hours = .1 * ma.masked_invalid(rain_hours)
 
#Get rid of -1 values
print "# -1 values Before", len(sun_hours[sun_hours == -1])
sun_hours[sun_hours == -1] = 0
print "# -1 values After", len(sun_hours[sun_hours == -1])
sun_hours = .1 * ma.masked_invalid(sun_hours)
 
print "# Rain hours values", len(rain_hours.compressed())
print "Min Rain hours ", rain_hours.min(), "Max Rain hours", rain_hours.max()
print "Average", rain_hours.mean(), "Std. Dev", rain_hours.std()

Which prints the following output below:

# -1 values Before 832
# -1 values After 0
# Rain hours values 30373
Min Rain hours  0.0 Max Rain hours 24.0
Average 1.65149639482 Std. Dev 2.78643269679

As expected the rain duration can be between 0 and 24 hours or a full day.

Averaging

We can average the sunshine and precipitation duration values quite easily over the months. First, we create a numerical range for the months (line 3). Second, we find the array indices corresponding to each month (line 6). Then, we use the indices to select duration values (lines 7 and 8).

1
2
3
4
5
6
7
8
monthly_rain = []
monthly_sun = []
month_range = np.arange(int(months.min()), int(months.max()))
 
for month in month_range:
   indices = np.where(month == months)
   monthly_rain.append(rain_hours[indices].mean())
   monthly_sun.append(sun_hours[indices].mean())

Plotting

The number of dry days is quite high, so we will leave them out in the precipitation duration histogram. We will plot bar charts of the average monthly rain (line 8) and sunshine durations (line 9).

1
2
3
4
5
6
7
8
9
10
11
12
13
plt.subplot(211)
plt.title("Precipitation Duration Histogram")
plt.hist(rain_hours[rain_hours > 0].compressed(), 200)
 
width = 0.42
ax = plt.subplot(212)
plt.title("Monthly Precipitation Duration")
plt.bar(month_range, monthly_rain, width, label='Rain Hours')
plt.bar(month_range + width, monthly_sun, width, color='red', label='Sun Hours')
plt.legend()
ax.set_xticklabels(cal.month_abbr[::2])
ax.set_ylabel('Hours')
plt.show()

This gives us the exciting plots shown below.

 photo rainSunDuration_zps6cd6a5b3.png

Pretty brilliant, eh? It seems that sunshine and precipitation duration are inversely correlated. So there must be an inverse correlation with temperatures based on the previous evidence in this series. We leave that as an exercise for the reader to check that. Obviously the rain duration is limited between 0 and 24 hours, with lower values being much more likely. We can see clearly that in the summer months, the sun shines longer and it rains less (duration-wise). Similar conclusions can be drawn for other seasons. The full code for this tutorial is given below.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
import numpy as np
import numpy.ma as ma
import matplotlib.pyplot as plt
import sys
from datetime import datetime as dt
import calendar as cal
 
to_float = lambda x: float(x.strip() or np.nan)
to_month = lambda x: dt.strptime(x, "%Y%m%d").month
months, sun_hours, rain_hours = np.loadtxt(sys.argv[1], delimiter=',', usecols=(1, 18, 21), unpack=True, converters={1: to_month, 18: to_float, 21: to_float})
 
# Measurements are in .1 hours 
rain_hours = .1 * ma.masked_invalid(rain_hours)
 
#Get rid of -1 values
print "# -1 values Before", len(sun_hours[sun_hours == -1])
sun_hours[sun_hours == -1] = 0
print "# -1 values After", len(sun_hours[sun_hours == -1])
sun_hours = .1 * ma.masked_invalid(sun_hours)
 
print "# Rain hours values", len(rain_hours.compressed())
print "Min Rain hours ", rain_hours.min(), "Max Rain hours", rain_hours.max()
print "Average", rain_hours.mean(), "Std. Dev", rain_hours.std()
 
monthly_rain = []
monthly_sun = []
month_range = np.arange(int(months.min()), int(months.max()))
 
for month in month_range:
   indices = np.where(month == months)
   monthly_rain.append(rain_hours[indices].mean())
   monthly_sun.append(sun_hours[indices].mean())
 
plt.subplot(211)
plt.title("Precipitation Duration Histogram")
plt.hist(rain_hours[rain_hours > 0].compressed(), 200)
 
width = 0.42
ax = plt.subplot(212)
plt.title("Monthly Precipitation Duration")
plt.bar(month_range, monthly_rain, width, label='Rain Hours')
plt.bar(month_range + width, monthly_sun, width, color='red', label='Sun Hours')
plt.legend()
ax.set_xticklabels(cal.month_abbr[::2])
ax.set_ylabel('Hours')
plt.show()

Resources for April 7, 2013

http://storify.com/inningPalmer/resources-for-april-7-2013

 

Series NavigationWind SpeedMonthly Precipitation in De Bilt
By the author of NumPy Beginner's Guide, NumPy Cookbook and Instant Pygame. If you enjoyed this post, please consider leaving a comment or subscribing to the RSS feed to have future articles delivered to your feed reader.
Share
This entry was posted in programming and tagged , . Bookmark the permalink.