Coursera Machine Learning Weeks 1 and 2

I know about machines. I know about learning. I know programming, calculus and linear algebra. I have read the “Machine Learning” book by Tom Mitchell; here is my book review along with reviews of two other books. Still, I was interested in the Coursera Machine Learning course, and I don’t regret signing up at all.

Coursera offers online courses given by the best university lecturers. This particular course is by Andrew Ng from Stanford University. The video lectures for weeks 1 and 2 are organized in five sections:

1. Introduction

I skipped this section, because introductions usually don’t have much interesting to offer.

2. Linear regression with one variable

We start out with a simple linear model of house prices. The squared-error cost function and gradient descent are covered. Gradient descent is easy to visualize in three dimensions. The goal is to find a local minimum of the cost function, and we get there by repeatedly taking baby steps downhill, in the direction of the negative gradient. The size of the baby steps is governed by a parameter called the learning rate.
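As a sketch of the idea in NumPy (the course uses Octave, so this is just my translation; the house data below is made up so that the true relationship is price = 100 + 50 · size):

```python
import numpy as np

# Made-up data: house sizes (thousands of sqft) and prices (thousands of $).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([150.0, 200.0, 250.0, 300.0])  # exactly 100 + 50 * x

theta0, theta1 = 0.0, 0.0  # intercept and slope, start anywhere
alpha = 0.1                # the learning rate
m = len(x)

for _ in range(1000):
    pred = theta0 + theta1 * x
    # Partial derivatives of the squared-error cost w.r.t. each parameter.
    grad0 = (pred - y).sum() / m
    grad1 = ((pred - y) * x).sum() / m
    # One baby step downhill.
    theta0 -= alpha * grad0
    theta1 -= alpha * grad1

print(round(theta0, 1), round(theta1, 1))  # converges to ~100.0 and ~50.0
```

With this learning rate the parameters settle on the true intercept and slope; crank alpha up past a critical value and the loop diverges instead, which is exactly the trade-off the lecture describes.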

3. Linear Algebra Review

I skipped this section as well. We covered Linear Algebra in the very first semester when I studied Physics. Linear Algebra is not that hard, in my very humble opinion.

4. Linear regression with multiple variables

Multiple variables means multiple features in the model. We discuss gradient descent again, extended to multiple features. Heuristic tips are given on how to keep feature values within an acceptable range (feature scaling) and how to determine the optimal learning rate. The choice of learning rate is a bit tricky. If the learning rate is too small, you will have to take a lot of baby steps, so the algorithm will be slow. If the learning rate is too large, you will be taking giant steps, overshooting the minimum, and the algorithm may fail to converge at all. Polynomial regression is also mentioned, with the note that we will get advice on whether and how to apply it in future lectures (can’t wait).
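Feature scaling boils down to a couple of lines in NumPy. A sketch with made-up house data (square footage next to bedroom counts, so the raw columns differ by three orders of magnitude):

```python
import numpy as np

# Made-up features: house size in sqft and number of bedrooms.
X = np.array([[2104.0, 3.0],
              [1600.0, 3.0],
              [2400.0, 4.0],
              [1416.0, 2.0]])

# Mean normalization: subtract each column's mean and divide by its
# standard deviation, so every feature ends up on a comparable scale.
mu = X.mean(axis=0)
sigma = X.std(axis=0)
X_scaled = (X - mu) / sigma

print(X_scaled.mean(axis=0))  # ~[0, 0]
print(X_scaled.std(axis=0))   # [1. 1.]
```

Remember to save `mu` and `sigma`: any new example you predict on has to be scaled with the same statistics as the training set.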

At the end the normal equation is covered. The normal equation computes the model parameters in closed form, so there is no learning rate to choose and no iterating. It requires transposing the feature matrix, matrix multiplication and (pseudo) inverting a matrix. The pseudoinverse comes in because not all matrices can be inverted; the pseudoinverse is defined even for singular matrices.

Since matrix inversion is expensive, gradient descent is preferred over the normal equation for large feature matrices. For small matrices, however, you should go with the normal equation, because you don’t need to determine a learning rate. And it’s just plain easier.
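A sketch of the normal equation in NumPy, on made-up data with two features plus a column of ones for the intercept (using `pinv` rather than a plain inverse, per the point above):

```python
import numpy as np

# Made-up training data: intercept column, size (thousands of sqft), bedrooms.
X = np.array([[1.0, 2.104, 3.0],
              [1.0, 1.600, 3.0],
              [1.0, 2.400, 4.0],
              [1.0, 1.416, 2.0]])
y = np.array([400.0, 330.0, 369.0, 232.0])  # prices in thousands of $

# Normal equation: theta = pinv(X' X) X' y.  The pseudoinverse still works
# when X' X is singular (e.g. redundant features), where a plain inverse fails.
theta = np.linalg.pinv(X.T @ X) @ X.T @ y

print(X @ theta)  # predictions for the training examples
```

No learning rate, no loop: one line of linear algebra gives the least-squares parameters directly.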

5. Octave Tutorial

The professor tries to “sell” Octave (an open source Matlab-like package) to us in the last section. I don’t agree with his recommendation. IPython in Pylab mode with NumPy, SciPy and Matplotlib works better, and you get the benefit of the Python libraries and ecosystem. A lot of the functions demonstrated in the Octave tutorial have a counterpart in NumPy with the same name that does exactly the same thing: ones, log, exp, randn, eye and so on. The lecturer didn’t mention Excel during his sales pitch, I believe. I guess it’s not that accepted in academia any more.
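For example, the functions listed above behave the same way in NumPy, allowing for the tuple-style shape arguments:

```python
import numpy as np

a = np.ones((2, 3))        # ones(2, 3) in Octave
b = np.eye(3)              # eye(3): 3x3 identity matrix
c = np.random.randn(2, 2)  # randn(2, 2): standard normal samples
d = np.log(np.exp(1.0))    # log(exp(1)) == 1

print(a.shape, b.shape, c.shape, d)
```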

I was thinking of trying to predict stock ratings with machine learning. The features would be the average and standard deviations of:

  • The close price returns.
  • Volume.
  • Daily relative spread of high and low.
  • Daily relative spread of open and close.

This gives us eight features in total. According to the lecture we will then need at least nine training examples (one more than the number of features, to fit the intercept as well). And with a feature matrix this small, the normal equation is the recommended approach.
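A sketch of how those features could be computed with NumPy (the OHLCV numbers and the exact spread definitions below are my own assumptions, not from the lecture):

```python
import numpy as np

# Made-up daily OHLCV data for one stock.
open_  = np.array([10.0, 10.2, 10.1, 10.4, 10.3])
high   = np.array([10.3, 10.4, 10.5, 10.6, 10.5])
low    = np.array([ 9.9, 10.0, 10.0, 10.2, 10.1])
close  = np.array([10.2, 10.1, 10.4, 10.3, 10.4])
volume = np.array([1.0e6, 1.2e6, 0.9e6, 1.1e6, 1.0e6])

returns   = np.diff(close) / close[:-1]  # close price returns
hl_spread = (high - low) / close         # daily relative high-low spread
oc_spread = (close - open_) / open_      # daily relative open-close spread

# Average and standard deviation of each series: 4 series x 2 stats = 8 features.
series = [returns, volume, hl_spread, oc_spread]
features = np.array([stat for s in series for stat in (s.mean(), s.std())])

print(features.shape)  # (8,)
```

One such eight-element row per stock would then stack into the feature matrix that goes into the normal equation.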

Associated Press hacked, Apple earnings report and Mars One

Yesterday was a huge day for traders and investors. First, during the trading day, the Associated Press Twitter account tweeted about the White House being attacked. The tweet claimed that Obama was hurt in the incident. Lots of people actually believed this message and the news spread quickly. As a result the stock market experienced yet another flash crash: in a matter of minutes equities lost a big percentage of their value. Later it was discovered that it was all a hoax perpetrated by a group of hackers, and stocks quickly recovered. So now the hope is that Twitter will finally do something, for example offer two-factor authentication.

After the market closed, Apple reported earnings. After-hours trading was suspended for about fifty minutes. When trading reopened, Apple stock gained about five percent, but after a while this gain disappeared. Apple announced an increase of its dividend and stock buyback; no new products were announced. However, Tim Cook, the current CEO of Apple, promised exciting new products later in the year and in 2014. Fingers crossed.

Mars One, the Dutch non-profit organization, announced that it is accepting applications for its project of sending people to Mars. Applicants need to be older than 18 and will be extensively tested, both physically and psychologically. It seems that they want to hold a televised popularity contest first. Currently about two hundred contestants have uploaded a promotional video to the Mars One website, which is required for the first round. I am thinking of doing that too, just for fun. However, I need some kind of confirmation that people will support me. I guess it’s the perfect opportunity to promote my books 🙂

By the author of NumPy Beginner's Guide, NumPy Cookbook and Instant Pygame. If you enjoyed this post, please consider leaving a comment or subscribing to the RSS feed to have future articles delivered to your feed reader.