Week 6 is all about model selection and system analysis. There are two major problems one can encounter: underfitting and overfitting. These are also called high bias (underfit) and high variance (overfit). Overfitting can be reduced with regularization or by adding more data. Underfitting, however, cannot be fixed with more training examples; instead, we can add more features to the model. For instance, a straight line would underfit house prices, but a quadratic polynomial might be a better fit.
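To make the straight line versus quadratic point concrete, here is a minimal sketch using NumPy. The "house price" data is synthetic and made up for illustration (the sizes, coefficients and noise level are all assumptions), so only the comparison matters, not the numbers.

```python
import numpy as np

# Synthetic data: price grows slightly faster than linearly with size.
# All constants here are invented for the illustration.
rng = np.random.default_rng(0)
size = np.linspace(50, 250, 30)
price = 20 + 0.5 * size + 0.004 * size**2 + rng.normal(0, 10, size.shape)

def training_mse(degree):
    """Fit a polynomial of the given degree and return its training error."""
    coeffs = np.polyfit(size, price, degree)
    residuals = np.polyval(coeffs, size) - price
    return float(np.mean(residuals**2))

line_mse = training_mse(1)   # straight line: too simple for this data
quad_mse = training_mse(2)   # quadratic: has the extra feature size^2

print(f"line: {line_mse:.1f}  quadratic: {quad_mse:.1f}")
```

A high training error for the line is the symptom of underfitting. Training error alone cannot tell us when a higher degree starts to overfit, though; for that we need the cross-validation set described in the lectures.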
As an example we go through the steps needed to build an e-mail spam classifier. The recommended approach is to:
- start with a quick and dirty proof-of-concept prototype
- plot learning curves
- analyze errors found in the cross-validation set and try to come up with a fix
The video lectures for the advice section:
- Deciding What to Try Next
- Evaluating a Hypothesis
- Model Selection and Train/Validation/Test Sets. We are advised to split the data into 60 percent training, 20 percent cross-validation and 20 percent test sets. The cross-validation set is used to tune the regularization constant or to choose the degree of the polynomial in a model, while the test set is held back to estimate the error of the final model.
- Diagnosing Bias vs. Variance
- Regularization and Bias/Variance. The regularization constant is found by trying 0, then 0.01, 0.02, 0.04 and so on, doubling each time up to 10.24, and keeping the value with the lowest cross-validation error.
- Learning Curves. Another way to analyze a model is with so-called learning curves, where we plot the cross-validation and training errors against the number of training examples.
- Deciding What to Do Next Revisited
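The 60/20/20 split, the search over regularization constants and the learning curves from the lectures above can be sketched together in a few lines of NumPy. This is only an illustration under my own assumptions: the data is synthetic, the model is a degree 5 polynomial fitted with regularized least squares, and all function names are made up.

```python
import numpy as np

def lambda_grid():
    """0, then 0.01 doubled repeatedly: 0.01, 0.02, 0.04, ..., 10.24."""
    return [0.0] + [0.01 * 2**k for k in range(11)]

def split_60_20_20(X, y, rng):
    """Shuffle, then cut into training, cross-validation and test sets."""
    idx = rng.permutation(len(y))
    n_train, n_cv = int(0.6 * len(y)), int(0.2 * len(y))
    tr, cv, te = np.split(idx, [n_train, n_train + n_cv])
    return (X[tr], y[tr]), (X[cv], y[cv]), (X[te], y[te])

def fit_ridge(X, y, lam):
    """Regularized least squares; the intercept (column 0) is not penalized."""
    penalty = lam * np.eye(X.shape[1])
    penalty[0, 0] = 0.0
    # lstsq instead of solve, so tiny training subsets do not raise an error.
    return np.linalg.lstsq(X.T @ X + penalty, X.T @ y, rcond=None)[0]

def error(X, y, theta):
    """Half the mean squared error, without the regularization term."""
    r = X @ theta - y
    return float(r @ r) / (2 * len(y))

def learning_curve(train, cv, lam, sizes):
    """Training and cross-validation error for growing training subsets."""
    X_tr, y_tr = train
    rows = []
    for m in sizes:
        theta = fit_ridge(X_tr[:m], y_tr[:m], lam)
        rows.append((m, error(X_tr[:m], y_tr[:m], theta), error(*cv, theta)))
    return rows

# Synthetic data (an assumption for the sketch): noisy sine, degree 5 features.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, 100)
y = np.sin(3 * x) + 0.2 * rng.standard_normal(100)
X = np.vander(x, 6, increasing=True)  # columns 1, x, x^2, ..., x^5

train, cv, test = split_60_20_20(X, y, rng)

# Fit on the training set only; pick lambda by cross-validation error.
cv_errors = {lam: error(*cv, fit_ridge(*train, lam)) for lam in lambda_grid()}
best_lam = min(cv_errors, key=cv_errors.get)

# The test set is touched once, at the very end.
test_error = error(*test, fit_ridge(*train, best_lam))
curve = learning_curve(train, cv, best_lam, range(10, 61, 10))

print("best lambda:", best_lam, "test error:", round(test_error, 4))
for m, train_err, cv_err in curve:
    print(m, round(train_err, 4), round(cv_err, 4))
```

Reading the curve: if both errors flatten out at a high value, the model has high bias and more data will not help; a large gap between a low training error and a high cross-validation error points to high variance.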
The video lectures for the system analysis section:
- Prioritizing What to Work On
- Error Analysis. We are advised to define a simple, single-number metric to score models.
- Error Metrics for Skewed Classes. Skewed classes are classes with a very uneven probability distribution, for instance when one class has a more than 90 percent chance of occurring. It’s easy to cheat in such a scenario by always putting items in the most probable class, which is why plain accuracy is a misleading score here.
- Trading Off Precision and Recall
- Data For Machine Learning
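The skewed class problem is easy to demonstrate in plain Python. The labels below are invented for the sketch (5 positives out of 100), and the F1 score, the harmonic mean of precision and recall, is my choice of single-number metric for the example.

```python
def scores(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    accuracy = (tp + tn) / len(pairs)
    # Convention for this sketch: an undefined ratio counts as 0.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# Made-up skewed data: 5 positives, 95 negatives.
y_true = [1] * 5 + [0] * 95

# The "cheat": always predict the most probable class.
always_negative = [0] * 100

# A classifier that finds 4 of the 5 positives and raises 2 false alarms.
some_effort = [1, 1, 1, 1, 0] + [1, 1] + [0] * 93

cheat = scores(y_true, always_negative)
honest = scores(y_true, some_effort)
print(cheat)
print(honest)
```

The cheat reaches 95 percent accuracy while finding no positives at all, so its recall and F1 are zero. The second classifier is only two points better on accuracy, but precision and recall show that it is actually doing something useful.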
Paraphrasing a comment found on Slashdot: Damn kids, they have it so easy with their MOOCs. I had to go to school, meet people, drink alcohol and eat bad food. Because college students can’t cook. It’s a well-known fact that students can’t cook even something as simple as rice. They either drown it in water until they get something soft and sticky that clogs up your digestive system, or they overcook it until the rice becomes so hard that it feels like you are eating gravel. Hence the drinking. Going to college is a complex optimization problem, where you have to weigh the following factors (simplified version):
- Meeting and interacting with people
- Going to classes
- Disgusting food
MOOCs have reduced this dreadful experience to watching video lectures. And I am glad. As the Singularity approaches, more and more jobs will be automated and people will not even have to go to work. The robots will do all the work, and at most they will need to be monitored from time to time with an iPhone app or something. Damn kids!