The daily temperature range, or diurnal temperature variation as meteorologists call it, is not such a big deal on Earth. In deserts, and on other planets in general, the variation is greater. On Mars the variation can be life-threatening. That’s one of the reasons why there aren’t that many Martians. We will have a look at the daily temperature range in the data we downloaded yesterday.

**Yesterday**

A recap of yesterday’s action in case you missed it: we downloaded a data file from the Royal Netherlands Meteorological Institute (KNMI) and did some very basic statistics on the minimum and maximum temperature data. By the way, the record wasn’t broken. This Easter isn’t colder than the one in 1964, but we came close. I told you it doesn’t feel that cold.

**Imports**

We will analyze the weather data and plot the temperatures. For that we need to import NumPy (*line 1*), the NumPy masked array module (*line 3*) and Matplotlib (*lines 4 and 5*). The sys module gives us the file name from the command line, and datetime parses the dates:

```python
import numpy as np
import sys
import numpy.ma as ma
import matplotlib.dates as md
import matplotlib.pyplot as plt
from datetime import datetime as dt
```

**Loading**

We will load a bit more data than yesterday (*line 4*): the dates of the measurements, in the format YYYYMMDD, and the average daily temperature. The dates require special conversion (*line 2*). First the date strings are converted to dates and then to numbers. The temperatures in the file are in tenths of a degree Celsius, so the full script at the end multiplies them by 0.1.

```python
to_float = lambda x: float(x.strip() or np.nan)
to_date = lambda x: dt.strptime(x, "%Y%m%d").toordinal()
dates, avg_temp, min_temp, max_temp = np.loadtxt(sys.argv[1], delimiter=',',
    usecols=(1, 11, 12, 14), unpack=True, converters={1: to_date, 12: to_float, 14: to_float})
```
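To see what the two converters do to a single field, here is a minimal sketch that runs them on a few made-up strings. The sample values are invented for illustration, and `float("nan")` stands in for `np.nan` so that the snippet needs only the standard library.

```python
from datetime import date, datetime as dt

# Same converters as above; float("nan") stands in for np.nan (an assumption
# made so that this sketch needs only the standard library).
to_float = lambda x: float(x.strip() or float("nan"))
to_date = lambda x: dt.strptime(x, "%Y%m%d").toordinal()

# Made-up field values, not taken from the KNMI file.
reading = to_float("  -23")     # a normal reading, in tenths of a degree
missing = to_float("     ")     # an empty field becomes nan
ordinal = to_date("20130331")   # day count with 0001-01-01 as day 1

print(reading)
print(missing)
print(ordinal)

# The ordinal converts back to the original calendar date.
print(date.fromordinal(ordinal))
```

The round trip through `date.fromordinal` is a quick way to check that a date string was parsed the way you expected.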

**Freezing**

Let’s calculate the percentage of days on which the minimum and maximum temperatures were below zero degrees Celsius (freezing).

```python
print "% days min below 0", 100 * len(min_temp[min_temp < 0])/float(len(min_temp))
print "% days max below 0", 100 * len(max_temp[max_temp < 0])/float(len(max_temp))
```

I am doing this just to satisfy my curiosity. The chance of the maximum daily temperature being below zero is about three percent, or roughly ten days per year. The minimum daily temperature is more likely to be below zero: about 18 percent of days, which comes to roughly two months a year. Not consecutive months, obviously.

```
% days min below 0 18.1944579959
% days max below 0 2.81978729632
```
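The "ten days" and "two months" figures are just the percentages applied to the length of a year. A small sketch of that arithmetic, assuming a 365-day year (leap days ignored); the helper name `days_per_year` is made up for this example:

```python
# Percentages copied from the output above.
pct_min_below_zero = 18.1944579959
pct_max_below_zero = 2.81978729632

DAYS_PER_YEAR = 365  # leap days ignored; an approximation


def days_per_year(percentage):
    """Convert a percentage of days into an expected number of days per year."""
    return percentage / 100.0 * DAYS_PER_YEAR


days_min_below = days_per_year(pct_min_below_zero)
days_max_below = days_per_year(pct_max_below_zero)

print("Days per year with minimum below zero: %.1f" % days_min_below)
print("Days per year with maximum below zero: %.1f" % days_max_below)
```

That works out to a bit over 66 days for the minimum and a bit over 10 days for the maximum.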

**Daily Temperature Ranges**

Unfortunately we still have the problem of the missing values. One way to deal with this is to use masked arrays. Just give the masked array a mask created with the isnan function (*line 5*). We will calculate the minimum and maximum of the daily temperature ranges with nanmin and nanmax, which ignore NaN values, and the averages and standard deviations of the temperatures and the ranges with masked arrays.

```python
ranges = max_temp - min_temp
print "Minimum daily range", np.nanmin(ranges)
print "Maximum daily range", np.nanmax(ranges)

masked_ranges = ma.array(ranges, mask = np.isnan(ranges))
print "Average daily range", masked_ranges.mean()
print "Standard deviation", masked_ranges.std()

masked_mins = ma.array(min_temp, mask = np.isnan(min_temp))
print "Average minimum temperature", masked_mins.mean(), "Standard deviation", masked_mins.std()

masked_maxs = ma.array(max_temp, mask = np.isnan(max_temp))
print "Average maximum temperature", masked_maxs.mean(), "Standard deviation", masked_maxs.std()
```

Apparently the average daily range is about 8 degrees, while the average minimum is around 5.4 degrees and the average maximum is around 13.6 degrees. Fascinating, isn’t it?

```
Minimum daily range 0.6
Maximum daily range 22.2
Average daily range 8.20358580315
Standard deviation 3.72983839106
Average minimum temperature 5.39096231248 Standard deviation 5.85061308004
Average maximum temperature 13.5945481156 Standard deviation 7.40767291657
```
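If masked arrays are new to you, the effect is easiest to see on a tiny array. The sketch below uses four made-up temperatures (not KNMI data) with one missing value, and compares the masked result with `np.nanmean` and `np.nanstd`, which are an alternative way to skip NaN values.

```python
import numpy as np
import numpy.ma as ma

# Made-up temperatures with one missing value (not KNMI data).
temps = np.array([1.0, np.nan, 3.0, 5.0])

# A plain mean is poisoned by the NaN.
plain_mean = temps.mean()
print(plain_mean)

# Masking the NaN entries makes mean() and std() skip them.
masked = ma.array(temps, mask=np.isnan(temps))
print(masked.mean())    # average of 1, 3 and 5
print(masked.std())
print(masked.count())   # number of unmasked values

# The nan-aware functions give the same numbers without a mask.
print(np.nanmean(temps))
print(np.nanstd(temps))
```

Which of the two approaches you pick is mostly a matter of taste; the mask has the advantage that you build it once and every later calculation respects it.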

**Plotting Temperatures**

Plotting the temperatures is a bit complicated because we have so much data. Matplotlib doesn’t have anything out of the box for decades of daily data, but it can put years on the x-axis. A Matplotlib YearLocator (*line 1*) finds the years, and a DateFormatter takes care of formatting the year labels (*line 3*). I made the labels on the x-axis extra small (*line 18*), because otherwise you wouldn’t be able to read them. I also hid every other label (*line 25*). I gave the temperatures different colors (*lines 21-23*), as indicated in the legend (*line 24*).

```python
locator = md.YearLocator()

date_formatter = md.DateFormatter("%Y")

fig = plt.figure()
ax = fig.add_subplot(211)
ax.set_title("Minimum, Average and Maximum Temperatures")
fig.autofmt_xdate()

def hide_labels(ax):
    xticks = ax.xaxis.get_major_ticks()
    for xtick in xticks[::2]:
        xtick.label1.set_visible(False)


ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(date_formatter)
ax.tick_params(which='major', labelsize='x-small')
ax.autoscale_view()

plt.plot(dates, min_temp, 'b-', label="Minimum")
plt.plot(dates, avg_temp, 'g-', label="Average")
plt.plot(dates, max_temp, 'r-', label="Maximum")
plt.legend(prop={'size':'x-small'})
hide_labels(ax)
```
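The slice `xticks[::2]` in hide_labels is what decides which labels disappear. Here is a small sketch of that slicing on a plain list; the year labels are hypothetical stand-ins for the real tick labels.

```python
# Hypothetical year labels standing in for the tick labels on the x-axis.
labels = [str(year) for year in range(2000, 2008)]

# [::2] selects every second item, starting with the first one;
# these are the labels that hide_labels switches off.
hidden = labels[::2]

# The items at the odd positions stay visible.
visible = labels[1::2]

print("hidden:  %s" % hidden)
print("visible: %s" % visible)
```

Changing the step, for example to `[::5]`, would hide fewer labels; inverting the selection to show only every fifth label takes a little more work.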

**Daily Ranges Histogram**

Creating a histogram of the daily ranges is pretty straightforward. Just use the hist function (*line 3*). We drop the masked NaN (not a number) values with the compressed method of the masked array.

```python
ax = fig.add_subplot(212)
ax.set_title("Daily Temperature Ranges")
ax.hist(masked_ranges.compressed(), 200, normed=True)
```
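To check what compressed and the normalization do without opening a plot window, the sketch below uses `np.histogram` in place of `ax.hist` (a substitution made for this example) on a few made-up ranges. In newer NumPy and Matplotlib releases the keyword for normalization is `density` rather than `normed`.

```python
import numpy as np
import numpy.ma as ma

# Made-up daily ranges with one missing value (not KNMI data).
ranges = np.array([2.0, 4.0, np.nan, 6.0, 8.0, 8.0])
masked_ranges = ma.array(ranges, mask=np.isnan(ranges))

# compressed() returns a plain ndarray holding only the unmasked values.
valid = masked_ranges.compressed()
print(valid)

# density=True scales the bar heights so that the total area of the
# histogram is 1, which is what normed=True does in the script.
heights, edges = np.histogram(valid, bins=3, density=True)
area = np.sum(heights * np.diff(edges))
print(heights)
print(area)
```

Because the heights are densities, they depend on the bin width; it is the area, not the sum of the heights, that equals one.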

I got the plot below.

The entire script is shown below.

```python
import numpy as np
import sys
import numpy.ma as ma
import matplotlib.dates as md
import matplotlib.pyplot as plt
from datetime import datetime as dt

to_float = lambda x: float(x.strip() or np.nan)
to_date = lambda x: dt.strptime(x, "%Y%m%d").toordinal()
dates, avg_temp, min_temp, max_temp = np.loadtxt(sys.argv[1], delimiter=',',
    usecols=(1, 11, 12, 14), unpack=True,
    converters={1: to_date, 12: to_float, 14: to_float})

# Measurements are in .1 degrees Celsius
avg_temp = .1 * avg_temp
min_temp = .1 * min_temp
max_temp = .1 * max_temp

#Freezing %
print "% days min below 0", 100 * len(min_temp[min_temp < 0])/float(len(min_temp))
print "% days max below 0", 100 * len(max_temp[max_temp < 0])/float(len(max_temp))
print

#Daily ranges
ranges = max_temp - min_temp
print "Minimum daily range", np.nanmin(ranges)
print "Maximum daily range", np.nanmax(ranges)

masked_ranges = ma.array(ranges, mask = np.isnan(ranges))
print "Average daily range", masked_ranges.mean()
print "Standard deviation", masked_ranges.std()

masked_mins = ma.array(min_temp, mask = np.isnan(min_temp))
print "Average minimum temperature", masked_mins.mean(), "Standard deviation", masked_mins.std()

masked_maxs = ma.array(max_temp, mask = np.isnan(max_temp))
print "Average maximum temperature", masked_maxs.mean(), "Standard deviation", masked_maxs.std()

# Tick every year on January 1st
locator = md.YearLocator()
date_formatter = md.DateFormatter("%Y")

fig = plt.figure()
ax = fig.add_subplot(211)
ax.set_title("Minimum, Average and Maximum Temperatures")
fig.autofmt_xdate()

def hide_labels(ax):
    xticks = ax.xaxis.get_major_ticks()
    for xtick in xticks[::2]:
        xtick.label1.set_visible(False)

ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(date_formatter)
ax.tick_params(which='major', labelsize='x-small')
ax.autoscale_view()

plt.plot(dates, min_temp, 'b-', label="Minimum")
plt.plot(dates, avg_temp, 'g-', label="Average")
plt.plot(dates, max_temp, 'r-', label="Maximum")
plt.legend(prop={'size':'x-small'})
hide_labels(ax)

ax = fig.add_subplot(212)
ax.set_title("Daily Temperature Ranges")
ax.hist(masked_ranges.compressed(), 200, normed=True)

plt.show()
```

If you need more information, check out my books.
