Atmospheric Humidity in De Bilt

This entry is part 10 of 10 in the series NumPy Weather

Relative atmospheric humidity is the partial pressure of dihydrogen monoxide (water) vapor, expressed as a percentage of the maximum (saturation) vapor pressure at the same temperature. Dihydrogen monoxide vapor is invisible and therefore extra dangerous. During the summer months, high humidity makes it harder to get rid of excess heat by sweating. Humidity is also related to rain, dew and fog. The KNMI De Bilt data file provides the daily average, minimum and maximum relative humidity in percent. We will draw a histogram of the daily average relative humidity and a chart of monthly values.
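
To make the definition concrete, here is a minimal sketch that computes relative humidity from a vapor pressure and a temperature. The saturation pressure uses Bolton's (1980) approximation, which is an assumption made for illustration only; the KNMI file already contains relative humidity, so nothing like this is needed for the rest of the post. The function names are hypothetical.

```python
import math


def saturation_vapor_pressure_hpa(temp_c):
    """Approximate saturation vapor pressure over water in hPa.

    Assumption: Bolton's (1980) approximation, temperature in Celsius.
    """
    return 6.112 * math.exp(17.67 * temp_c / (temp_c + 243.5))


def relative_humidity(vapor_pressure_hpa, temp_c):
    """Relative humidity in percent: actual over saturation vapor pressure."""
    return 100.0 * vapor_pressure_hpa / saturation_vapor_pressure_hpa(temp_c)


# At 0 degrees Celsius the approximation gives 6.112 hPa at saturation,
# so half of that pressure corresponds to 50 percent relative humidity.
print(relative_humidity(3.056, 0.0))
```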

Imports

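We will import NumPy, its masked array module numpy.ma and Matplotlib's pyplot. The listing also imports sys, datetime and calendar, which are used for the command-line argument, date parsing and month names.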

import numpy as np
import numpy.ma as ma
import matplotlib.pyplot as plt
import sys
from datetime import datetime as dt
import calendar as cal

Loading the Data

We will load the dates, converted to months by to_month, together with the daily average, maximum and minimum relative humidity into NumPy arrays with np.loadtxt. Again, missing values need to be converted into NaNs (not a number), which to_float takes care of.

to_float = lambda x: float(x.strip() or np.nan)  # empty field -> NaN
to_month = lambda x: dt.strptime(x, "%Y%m%d").month  # YYYYMMDD -> month number
months, avg_h, max_h, min_h = np.loadtxt(
    sys.argv[1], delimiter=',', usecols=(1, 35, 36, 38), unpack=True,
    converters={1: to_month, 35: to_float, 36: to_float, 38: to_float})
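
The converter mechanism is easier to follow on a tiny in-memory example. The sketch below is not the KNMI file: the three sample rows and their column layout are made up, and the helper _text is a hypothetical addition that only exists because some NumPy versions hand bytes instead of strings to converters.

```python
from datetime import datetime
from io import StringIO

import numpy as np


def _text(field):
    """Hypothetical helper: accept both bytes and str fields."""
    return field.decode() if isinstance(field, bytes) else field


def to_float(field):
    """Convert a field to float, mapping blank fields to NaN."""
    field = _text(field).strip()
    return float(field) if field else np.nan


def to_month(field):
    """Convert a YYYYMMDD date field to its month number."""
    return datetime.strptime(_text(field).strip(), "%Y%m%d").month


# Made-up rows: station, date, humidity (the second reading is missing).
sample = StringIO(
    "260,20130110,   85\n"
    "260,20130215,     \n"
    "260,20130320,   78\n"
)

months, humidity = np.loadtxt(
    sample, delimiter=",", usecols=(1, 2), unpack=True,
    converters={1: to_month, 2: to_float})

print(months)    # month numbers as floats
print(humidity)  # the blank field shows up as nan
```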

Statistics

Values are missing from the relative humidity columns, so we create masked arrays from the NumPy arrays, masking out the NaNs. The snippet below prints some simple statistics.

# print with %-formatting runs under both Python 2 and Python 3
max_h = ma.masked_invalid(max_h)
print("Maximum Humidity %s" % max_h.max())

avg_h = ma.masked_invalid(avg_h)
print("Average Humidity %s Std Dev %s" % (avg_h.mean(), avg_h.std()))

min_h = ma.masked_invalid(min_h)
print("Minimum Humidity %s" % min_h.min())

The statistics printed are as follows:

Maximum Humidity 111.0
Average Humidity 81.6147091109 Std Dev 10.3747295063
Minimum Humidity 8.0

The maximum relative humidity is above 100 percent, which is kind of odd.
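
A small sketch shows what the masking buys us and how one might count the odd readings. The sample values are hypothetical, not taken from the KNMI file.

```python
import numpy as np
import numpy.ma as ma

# Hypothetical daily maximum humidity values; NaN marks a missing reading.
sample_max_h = np.array([95.0, np.nan, 111.0, 80.0])

# A plain NumPy reduction propagates the NaN ...
plain_max = sample_max_h.max()

# ... while the masked array skips the invalid entry.
masked_max_h = ma.masked_invalid(sample_max_h)
masked_max = masked_max_h.max()

# compressed() returns only the valid values as a regular array,
# which makes it easy to count readings above 100 percent.
above_100 = int(np.count_nonzero(masked_max_h.compressed() > 100))

print(plain_max, masked_max, above_100)
```

Running the same count on the real max_h array would tell whether the value of 111 is a single outlier or part of a pattern.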

Monthly Aggregates

I compute monthly averages, minimums and maximums with the code below.

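monthly_humidity = []
maxes = []
mins = []
# np.arange excludes its stop value, so add 1 to include the last month
month_range = np.arange(int(months.min()), int(months.max()) + 1)

for month in month_range:
   # positions of all days that fall in this month, across all years
   indices = np.where(month == months)
   monthly_humidity.append(avg_h[indices].mean())
   maxes.append(max_h[indices].max())
   mins.append(min_h[indices].min())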
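
The grouping idea can be tried on a handful of made-up values. This is only a sketch under assumptions: monthly_stats is a hypothetical helper, not part of the post's listing, and it uses a boolean mask instead of np.where. Note that np.arange excludes its stop value, so the stop has to be the last month plus one for December to be included.

```python
import numpy as np
import numpy.ma as ma


def monthly_stats(months, values):
    """Return {month: (mean, minimum, maximum)} for months with valid data."""
    stats = {}
    first, last = int(months.min()), int(months.max())
    # np.arange excludes its stop value, hence last + 1.
    for month in np.arange(first, last + 1):
        in_month = values[months == month]
        if in_month.count() == 0:
            continue  # no valid readings for this month
        stats[int(month)] = (
            float(in_month.mean()),
            float(in_month.min()),
            float(in_month.max()),
        )
    return stats


# Hypothetical data: month numbers and humidity readings with one gap.
sample_months = np.array([1, 1, 2, 12, 12])
sample_values = ma.masked_invalid(np.array([80.0, 90.0, np.nan, 70.0, 100.0]))

print(monthly_stats(sample_months, sample_values))
```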

Plotting

We will draw a histogram of the daily average relative humidity with plt.hist. In addition, we will plot the monthly aggregate values prepared in the previous section.

plt.subplot(211)
plt.title("Humidity Histogram")
plt.hist(avg_h.compressed(), 200)

ax = plt.subplot(212)
plt.title("Monthly Humidity")
plt.plot(month_range, monthly_humidity, 'bo', label="Average")
plt.plot(month_range, maxes, 'r^', label="Maximum Values")
plt.plot(month_range, mins, 'g>', label="Minimum Values")
# put one tick on every plotted month so the labels cannot drift
ax.set_xticks(month_range)
ax.set_xticklabels([cal.month_abbr[m] for m in month_range])
plt.legend(prop={'size':'x-small'}, loc='best')
ax.set_ylabel('%')
plt.show()

We get the plots below as a result.

[Figure: humidity histogram (top) and monthly humidity chart (bottom)]
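
plt.hist bins its input the same way np.histogram does, so the binning can be checked without drawing anything. The sketch below rests on an assumption the post does not verify: that the daily averages are reported in whole percents. If so, 200 bins spread over fewer than 100 distinct values leave gaps in the histogram, and one bin per percent may read better. The sample data is made up.

```python
import numpy as np
import numpy.ma as ma

# Hypothetical daily averages in whole percents, with one missing value.
sample_avg_h = ma.masked_invalid(
    np.array([81.0, 81.0, 90.0, np.nan, 75.0, 100.0]))
valid = sample_avg_h.compressed()

# One bin per whole percent: edges at -0.5, 0.5, ..., 100.5.
edges = np.arange(0, 102) - 0.5
counts, _ = np.histogram(valid, bins=edges)

# counts[k] is the number of days with an average humidity of k percent.
print(counts[81], counts.sum())
```

The same edges array can be passed to plt.hist as its bins argument.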

Something strange is going on with the maximum values: they seem to be above 100 percent. Maybe I misunderstood the definition of relative humidity. However, the daily average relative humidity values seem to lie between 0 and 100 percent, as expected. The complete code listing for today is given below.

import numpy as np
import numpy.ma as ma
import matplotlib.pyplot as plt
import sys
from datetime import datetime as dt
import calendar as cal


to_float = lambda x: float(x.strip() or np.nan)  # empty field -> NaN
to_month = lambda x: dt.strptime(x, "%Y%m%d").month  # YYYYMMDD -> month number
months, avg_h, max_h, min_h = np.loadtxt(
    sys.argv[1], delimiter=',', usecols=(1, 35, 36, 38), unpack=True,
    converters={1: to_month, 35: to_float, 36: to_float, 38: to_float})

# print with %-formatting runs under both Python 2 and Python 3
max_h = ma.masked_invalid(max_h)
print("Maximum Humidity %s" % max_h.max())

avg_h = ma.masked_invalid(avg_h)
print("Average Humidity %s Std Dev %s" % (avg_h.mean(), avg_h.std()))

min_h = ma.masked_invalid(min_h)
print("Minimum Humidity %s" % min_h.min())

monthly_humidity = []
maxes = []
mins = []
# np.arange excludes its stop value, so add 1 to include the last month
month_range = np.arange(int(months.min()), int(months.max()) + 1)

for month in month_range:
   # positions of all days that fall in this month, across all years
   indices = np.where(month == months)
   monthly_humidity.append(avg_h[indices].mean())
   maxes.append(max_h[indices].max())
   mins.append(min_h[indices].min())

plt.subplot(211)
plt.title("Humidity Histogram")
plt.hist(avg_h.compressed(), 200)

ax = plt.subplot(212)
plt.title("Monthly Humidity")
plt.plot(month_range, monthly_humidity, 'bo', label="Average")
plt.plot(month_range, maxes, 'r^', label="Maximum Values")
plt.plot(month_range, mins, 'g>', label="Minimum Values")
# put one tick on every plotted month so the labels cannot drift
ax.set_xticks(month_range)
ax.set_xticklabels([cal.month_abbr[m] for m in month_range])
plt.legend(prop={'size':'x-small'}, loc='best')
ax.set_ylabel('%')
plt.show()

Books

If you need more background information on NumPy, please check out my NumPy books.


By the author of NumPy Beginner's Guide, NumPy Cookbook and Instant Pygame.