The Sun of course is a very important factor when it comes to temperature. Unfortunately, the De Bilt data set from the KNMI is missing a lot of data concerning sun radiation. The data is given in Joule per square centimeter. There are also other variables in the file, which are derived from the Sun radiation, such as the sunshine duration in hours.

**Today**

We are going to analyze the radiation data a bit, draw a histogram and compare it with the daily average temperatures. To compare we will calculate the correlation coefficient between radiation and temperature and plot yearly relative changes in average temperature and radiation. Originally I wanted to have a scatter plot, but that didn’t look right with thousands of data points, so I decided to compress the data as it were. Later I realized that radiation was present from around 1960 on, so it might have been better to plot the correlations coefficient for each year. This is left as an exercise for the reader. If you want to share your code, please put a link in the comments below.

**Imports**

We need to import NumPy (*line 1*), the NumPy masked array module (*line 2*) and Matplotlib (*line 3*).

1 2 3 4 5 | import numpy as np import numpy.ma as ma import matplotlib.pyplot as plt import sys from datetime import datetime as dt |

** Loading the Data**

We will load the dates and convert them to years (*line 2*), as well as the average temperature and radiation. The latter misses a lot of values so we will convert (*line 1*) the missing values to NaN (not a number) and then create a masked array out of the radiation data.

1 2 3 4 5 6 7 8 9 | to_float = lambda x: float(x.strip() or np.nan) to_year = lambda x: dt.strptime(x, "%Y%m%d").year years, avg_temp, Q = np.loadtxt(sys.argv[1], delimiter=',', usecols=(1, 11, 20), unpack=True, converters={1: to_year, 20: to_float}) ma # Measurements are in .1 degrees Celcius avg_temp = .1 * avg_temp Q = ma.masked_invalid(Q) |

**Simple Statistics**

We will have a look at the minimum, maximum, mean and standard deviation of the radiation. Additionally we will print the correlation coefficient of temperature and radiation. To compute the coefficient we need to match (*line 5*) the data properly by avoiding the NaN values. Also we have to get one of the off-diagonal values (*line 6*) of the correlation matrix that NumPy returns.

1 2 3 4 5 6 | print "# temperature values", len(avg_temp), "# radiation values", len(Q.compressed()) print "Radiation Min", Q.min(), "Radiation Max", Q.max() print "Radiation Average", Q.compressed().mean(), "Std Dev", Q.std() match_temp = avg_temp[np.logical_not(np.isnan(Q))] print "Correlation Coefficient", np.corrcoef(match_temp, Q.compressed())[0][1] |

The script prints the following:

# temperature values 40996 # radiation values 20361 Radiation Min 7.0 Radiation Max 3081.0 Radiation Average 957.156082707 Std Dev 740.68047373 Correlation Coefficient 0.62767320286 |

As you can see the correlation is not that strong.

**Yearly Averages and Relative Changes**

We did the yearly averaging yesterday already. Today we add radiation (*line 8*) to be averaged yearly. I will skip the details today. Please refer to yesterday’s post if necessary. Another thing that we want to do today is calculate the relative change of the variables we are interested in as percentages. The diff function (*line 11*) gives us the first order difference between neighboring array values.

1 2 3 4 5 6 7 8 9 10 11 | avg_temps = [] avg_qs = [] year_range = range(int(years[0]), int(years[-1]) - 1) for year in year_range: indices = np.where(years == year) avg_temps.append(avg_temp[indices].mean()) avg_qs.append(Q[indices].mean()) def percents(a): return 100 * np.diff(a)/a[:-1] |

** Plotting Radiation Histogram and Yearly Changes**

We will plot the radiation histogram and relative changes in yearly average temperature and radiation with Matplotlib.

1 2 3 4 5 6 7 8 9 10 | plt.subplot(211) plt.title("Global Radiation Histogram") plt.hist(Q.compressed(), 200) plt.subplot(212) plt.title("Changes in Average Yearly Temperature & Radiation") plt.plot(year_range[1:], percents(avg_temps), label='% Change Temperature') plt.plot(year_range[1:], percents(avg_qs), label='% Change Radiation') plt.legend(prop={'size':'x-small'}) plt.show() |

We get the plots below.

The radiation histogram looks interesting. It reminds me of something, but I am not sure what. The code for this script is shown below.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 | import numpy as np import numpy.ma as ma import matplotlib.pyplot as plt import sys from datetime import datetime as dt to_float = lambda x: float(x.strip() or np.nan) to_year = lambda x: dt.strptime(x, "%Y%m%d").year years, avg_temp, Q = np.loadtxt(sys.argv[1], delimiter=',', usecols=(1, 11, 20), unpack=True, converters={1: to_year, 20: to_float}) ma # Measurements are in .1 degrees Celcius avg_temp = .1 * avg_temp Q = ma.masked_invalid(Q) print "# temperature values", len(avg_temp), "# radiation values", len(Q.compressed()) print "Radiation Min", Q.min(), "Radiation Max", Q.max() print "Radiation Average", Q.compressed().mean(), "Std Dev", Q.std() match_temp = avg_temp[np.logical_not(np.isnan(Q))] print "Correlation Coefficient", np.corrcoef(match_temp, Q.compressed())[0][1] avg_temps = [] avg_qs = [] year_range = range(int(years[0]), int(years[-1]) - 1) for year in year_range: indices = np.where(years == year) avg_temps.append(avg_temp[indices].mean()) avg_qs.append(Q[indices].mean()) def percents(a): return 100 * np.diff(a)/a[:-1] plt.subplot(211) plt.title("Global Radiation Histogram") plt.hist(Q.compressed(), 200) plt.subplot(212) plt.title("Changes in Average Yearly Temperature & Radiation") plt.plot(year_range[1:], percents(avg_temps), label='% Change Temperature') plt.plot(year_range[1:], percents(avg_qs), label='% Change Radiation') plt.legend(prop={'size':'x-small'}) plt.show() |

**Storytime**

One day when I was still in high school, I was waiting at the tram stop for the tram to arrive. A guy started talking to me about his “expanding Earth” theory. I don’t remember the details any more, but the main point was that the temperature change on Earth is due to the Earth expanding. The guy smelled of booze, his clothes were dirty and he obviously didn’t shave that day or that year even. But his theory might have been correct. In later years I encountered other people who wanted to share their “Theory Of Everything”. I have a theory too, it involves a particle called the “Ivan Idris boson”. If I meet you at a tram stop, I will tell you about it.

Latest on April 4, 2013

http://storify.com/inningPalmer/latest-on-april-4-2013