Modeling Temperature with the SciPy leastsq function

This entry is part 18 of 18 in the series NumPy Weather

So now we have two ideas. Either the temperature today depends on the temperature yesterday and the day before yesterday, in which case we assume some kind of linear combination of the two. Or the temperature depends on the day of the year (between 1 and 366), and a quadratic polynomial seemed the best fit for that idea. We can combine the two ideas, but the question is how: we could use either a multiplicative model or an additive model.
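Written out, the two ways of combining the ideas might look roughly like this. The symbols are shorthand introduced here for illustration: $T_t$ is the temperature on day $t$, $d_t$ is the day of the year, $l_1$ and $l_2$ are the lag coefficients, and $c_0, c_1, c_2$ are the polynomial coefficients. The multiplicative form shown is only one possible reading of "multiplicative".

```latex
\begin{align}
\text{additive:} \quad & T_t = l_2 T_{t-2} + l_1 T_{t-1} + c_2 d_t^2 + c_1 d_t + c_0 \\
\text{multiplicative:} \quad & T_t = \left(l_2 T_{t-2} + l_1 T_{t-1}\right)\left(c_2 d_t^2 + c_1 d_t + c_0\right)
\end{align}
```

The additive form is linear in all five coefficients, which is part of what makes it the simpler choice.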

Let’s choose the additive model, since it seems simpler. This means we assume that temperature is the sum of an autoregressive component and a cyclical component, which is easy to write down as one equation. We will use the SciPy leastsq function to minimize the sum of the squared errors of this equation. Here is the code to achieve this:

import sys
from datetime import datetime as dt

import matplotlib.pyplot as plt
import numpy as np
from scipy.optimize import leastsq

def to_dayofyear(x):
   # Convert a YYYYMMDD date field to the day of the year (1 to 366).
   if isinstance(x, bytes):  # some NumPy versions pass bytes to converters
      x = x.decode()
   return dt.strptime(x, "%Y%m%d").timetuple().tm_yday

days, temp = np.loadtxt(sys.argv[1], delimiter=',', usecols=(1, 11), unpack=True, converters={1: to_dayofyear})
temp = .1 * temp  # rescale the raw values (the file presumably stores tenths of a degree)
cutoff = int(0.9 * len(temp))  # slice indices must be integers

def error(p, d, t, lag2, lag1):
   l2, l1, d2, d1, d0 = p

   # Residual = observed minus predicted. The parentheses keep every
   # model term on the subtracted side.
   return t - (l2 * lag2 + l1 * lag1 + d2 * d ** 2 + d1 * d + d0)

p0 = [-0.08293789,  1.06517683, -4.91072584e-04,   1.92682505e-01,  -3.97182941e+00]
params = leastsq(error, p0, args=(days[2:cutoff], temp[2:cutoff], temp[:cutoff - 2], temp[1:cutoff - 1]))[0]
print(params)
delta = np.abs(error(params, days[cutoff+1:], temp[cutoff+1:], temp[cutoff-1:-2], temp[cutoff:-1]))

plt.hist(delta, bins=10, density=True)
plt.show()

  1. The error function computes the residual of our model: the observed temperature minus the model's prediction.
  2. p0 gives an initial guess for all the parameters in our equation.
  3. The leastsq call fits the parameters on the data below the cutoff point (the first 90%).
  4. delta is the absolute error of the fitted model applied above the cutoff point.
  5. plt.hist plots a normalized histogram of that error.
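To see the fitting step in isolation, here is a minimal sketch that runs leastsq on made-up data with known coefficients. Everything in it (the random inputs, the "true" parameters, the name residuals) is an assumption for illustration and has nothing to do with the weather file. Because the additive model is linear in its parameters, the sketch also solves the same problem with np.linalg.lstsq as a sanity check.

```python
import numpy as np
from scipy.optimize import leastsq


def residuals(p, d, t, lag2, lag1):
    """Observed value minus the additive model's prediction."""
    l2, l1, d2, d1, d0 = p
    return t - (l2 * lag2 + l1 * lag1 + d2 * d ** 2 + d1 * d + d0)


# Synthetic inputs (made up for illustration, not weather data).
rng = np.random.default_rng(0)
n = 500
d = rng.uniform(0.0, 1.0, n)       # stand-in for a rescaled day of year
lag2 = rng.normal(10.0, 5.0, n)    # stand-in for the value two days back
lag1 = rng.normal(10.0, 5.0, n)    # stand-in for the value one day back
true_p = np.array([-0.1, 1.0, -2.0, 3.0, 0.5])
t = (true_p[0] * lag2 + true_p[1] * lag1
     + true_p[2] * d ** 2 + true_p[3] * d + true_p[4])
t = t + rng.normal(0.0, 0.1, n)    # add a little noise

# Without full_output, leastsq returns the solution and an integer status flag.
fitted, ier = leastsq(residuals, np.zeros(5), args=(d, t, lag2, lag1))

# The model is linear in its parameters, so an ordinary linear
# least-squares solve should land on the same coefficients.
A = np.column_stack([lag2, lag1, d ** 2, d, np.ones(n)])
linear, *_ = np.linalg.lstsq(A, t, rcond=None)

print(fitted)
print(linear)
```

If the two printed vectors agree, the residual function and the argument order passed through args are at least consistent with each other.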

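The final parameters I got are printed below. Compared with the initial guess, every parameter except the first has shrunk in absolute size, and those same four have also changed sign. I don’t know whether the change in size is coincidental, but as far as I know the order of the parameters shouldn’t matter. The sign change is less mysterious: it is what you would expect if those four terms were added to the residual instead of being subtracted from the temperature along with the first term, so the operator precedence in the error function's return expression is worth double-checking before reading anything into the signs.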

[ -1.52297691e-01  -9.89195783e-01   8.20879954e-05  -3.16870659e-02
   6.06397834e-01]
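One thing worth ruling out when fitted coefficients come back with unexpected signs is operator precedence in the residual. An expression like t - a * x + b * y + c parses as (t - a * x) + b * y + c, so only the first term is actually subtracted. The sketch below, on made-up data with hypothetical function names, fits the same linear model with both spellings.

```python
import numpy as np
from scipy.optimize import leastsq


def residual_grouped(p, x, y, t):
    a, b, c = p
    return t - (a * x + b * y + c)    # the whole prediction is subtracted


def residual_ungrouped(p, x, y, t):
    a, b, c = p
    return t - a * x + b * y + c      # parses as (t - a * x) + b * y + c


# Synthetic data with known coefficients 2, 3 and -1 (illustration only).
rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = rng.normal(size=200)
t = 2.0 * x + 3.0 * y - 1.0 + rng.normal(0.0, 0.05, 200)

p_grouped = leastsq(residual_grouped, np.zeros(3), args=(x, y, t))[0]
p_ungrouped = leastsq(residual_ungrouped, np.zeros(3), args=(x, y, t))[0]

print(p_grouped)
print(p_ungrouped)
```

Both fits reach the same minimum, but in the ungrouped version every coefficient after the first comes back with its sign flipped.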

The accuracy of the model doesn’t seem to be better than the simple autoregressive model with lag 2.
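A comparison like this can be put on a number rather than judged from a histogram. The sketch below computes the mean absolute error on the held-out part for an AR(2) model and for the combined model. It uses a synthetic series and a hypothetical helper, holdout_mae, and it fits with np.linalg.lstsq instead of leastsq since both models are linear in their parameters. The AR(2) model here includes an intercept, which is an assumption. The numbers it prints say nothing about the real weather data.

```python
import numpy as np


def holdout_mae(features, target, cutoff):
    """Fit by linear least squares on rows before cutoff,
    then return the mean absolute error on the rows after it."""
    coef, *_ = np.linalg.lstsq(features[:cutoff], target[:cutoff], rcond=None)
    return np.mean(np.abs(target[cutoff:] - features[cutoff:] @ coef))


# Synthetic daily series (illustrative only): a yearly cycle plus noise.
rng = np.random.default_rng(2)
n = 3 * 365
day = np.arange(n) % 365 + 1
temp = 10.0 + 8.0 * np.sin(2 * np.pi * (day - 110) / 365) + rng.normal(0.0, 2.0, n)

# Align each target value with its two lags and its day of year.
target = temp[2:]
lag2, lag1 = temp[:-2], temp[1:-1]
d = day[2:].astype(float)
ones = np.ones_like(target)
cutoff = int(0.9 * len(target))

ar2 = np.column_stack([lag2, lag1, ones])
combined = np.column_stack([lag2, lag1, d ** 2, d, ones])

mae_ar2 = holdout_mae(ar2, target, cutoff)
mae_combined = holdout_mae(combined, target, cutoff)
print(mae_ar2, mae_combined)
```

Swapping the synthetic arrays for the real days and temp arrays would give the same comparison for the model in this post.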
