*Numpy Strategies 0.1.3*

Unlucky Sprint 13. Scary, especially if you suffer from Triskaidekaphobia. The task for this sprint:

- Determine the steady state vector of our Markov chain model.

This task was part of the story:

- I want to know how to bet in the long term.

A very trivial task especially with NumPy. Far into the distant future or in theory infinity, the state of our Markov chain system will not change anymore. This is also called a steady state. The stochasic matrix A we calculated last time applied to the steady state, will yield the same state x. Using mathematical notation:

Ax = x

Another way to look at this is as the eigenvector for eigenvalue 1.

**1. Smoothing**

Last time we counted the number of occurrences for each state transition in our model. This gave us a bit skewed results, because in our sample a Flat or no change situation never occurred. In real life we could have a day that the close price does not change, although unlikely for liquid stock markets. One way to deal with 0 occurrences is to apply additive smoothing. The idea is to add a certain constant to the number of occurrences we find, getting rid of zeroes. Of course, we will have to normalize the probabilites afterwards. Two lines have to change due to the smoothing process. First, change this line

N = len(start_indices) |

to

N = len(start_indices) + k * NDIM |

where k is an integer smoothing constant and NDIM is the number of states, in our case 3. Second, update the line

SM[i][j] = occurrences/float(N) |

to

SM[i][j] = (occurrences + k)/float(N) |

The stochastic matrix for k=1 becomes

[[ 0.47826087 0.00869565 0.51304348] [ 0.33333333 0.33333333 0.33333333] [ 0.41134752 0.0070922 0.58156028]] |

**2. Eigenvalues and eigenvectors**

To get the eigenvalues and eigenvectors we will need the numpy linalg module and the eig function.

eig_out = numpy.linalg.eig(SM) print eig_out |

The eig function returns an array containing the eigenvalues and an array containing the eigenvectors.

(array([ 1. , 0.06739662, 0.32575786]), array([[ 0.57735027, 0.7670805 , 0.0082198 ], [ 0.57735027, -0.19564812, -0.99986103], [ 0.57735027, -0.61099044, 0.01450345]])) |

**3. Selecting the eigenvector for eigenvalue 1**

Currently we are only interested in the eigenvector for eigenvalue 1. In reality the eigenvalue might not be exactly 1, so we should build in a margin for error. We can find the index for eigenvalue between 0.9 and 1.1 as follows

idx_vec = numpy.where(numpy.abs(eig_out[0] - 1) < 0.1) print "Index eigenvalue 1", idx_vec |

We get index 0 as expected

Index eigenvalue 1 (array([0]),) |

Select the steady state vector

x = eig_out[1][:,idx_vec].flatten() print "Steady state vector", x |

The steady state vector

Steady state vector [ 0.57735027 0.57735027 0.57735027] |

**4. Check**

The check is simple – just multiply the stochastic matrix and the steady state vector we got.

print "Check", numpy.dot(SM, x) |

Seems like the result is correct. I will leave it up to you to decide whether it is meaningful or not. Task resolved.

Check [ 0.57735027 0.57735027 0.57735027] |

Here is the code in its entirety with imports and all.

#!/usr/bin/env python from matplotlib.finance import quotes_historical_yahoo from datetime import date import numpy import sys today = date.today() start = (today.year - 1, today.month, today.day) quotes = quotes_historical_yahoo(sys.argv[1], start, today) close = [q[4] for q in quotes] states = numpy.sign(numpy.diff(close)) NDIM = 3 SM = numpy.zeros((NDIM, NDIM)) signs = [-1, 0, 1] k = int(sys.argv[2]) for i in xrange(len(signs)): #we start the transition from the state with the specified sign start_indices = numpy.where(states[:-1] == signs[i])[0] N = len(start_indices) + k * NDIM # skip since there are no transitions possible if N == 0: continue #find the values of states at the end positions end_values = states[start_indices + 1] for j in xrange(len(signs)): # number of occurrences of this transition occurrences = len(end_values[end_values == signs[j]]) SM[i][j] = (occurrences + k)/float(N) print SM eig_out = numpy.linalg.eig(SM) print eig_out idx_vec = numpy.where(numpy.abs(eig_out[0] - 1) < 0.1) print "Index eigenvalue 1", idx_vec x = eig_out[1][:,idx_vec].flatten() print "Steady state vector", x print "Check", numpy.dot(SM, x) |

**Have a go**

Let’s put the Product Owner hat on. Possible improvement stories I can think of:

- Fine tune the smoothing constant.
- Introduce more states.
- Use different parameters, for instance, option price or volume.
- Try out different stocks, stock indices or a combination of stocks.
- Extend to different time periods instead of days.
- Make some unit tests.

I am looking forward to hearing about your improvements.

If you liked this post and are interested in NumPy check out NumPy Beginner’s Guide by yours truly. I would like to thank Mike Driscoll for his book review on the Mouse vs Python blog and on Amazon.com I will be back next week with your Christmas present, which will remain a surprise for now.

The correct steady state distribution is:

[ .439877551254698, 0.011573143688891, .548547409068765]

Which you can verify by taking A^k for some large k. Your steady state vector doesn’t even make sense [0.57, 0.57, 0.57]. How can the sum of the probability be greater then unity? In addition, to get a steady-state vector with equal probabilities you’ll need a strong symmetry in your underlying markov matrix (which is also not the case here).

Travis,

You are probably correct. Did you use this code? What did you change?

Thanks,

Ivan