Machine Learning - The shallow end of the algorithm pool, linear regression

The Machine Learning Guide podcast by Tyler Renelle and the Machine Learning course on Coursera taught by Andrew Ng allow those new to machine learning to quickly grasp concepts once thought reserved for academics in white lab coats. A very nice way to ease into learning is linear regression, which lets you predict a value based on the features of a data set.

Imagine having a list of recently sold homes loaded into a spreadsheet. The matrix would have a column for the number of floors in each home, its square footage, its age, and its selling price. Using this data and the linear regression algorithm, one can "teach" a computer to predict prices.

Below is an overly simplified model of the process and the concepts involved, illustrated with the example scenario.

Predicting

Essentially, a coefficient is chosen at random for each attribute (number of floors, square footage, age of home). Each attribute value is multiplied by its coefficient, and the resulting products are added together. The computer uses this sum as its predicted price for the home.
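As a minimal sketch of the prediction step, the weighted sum could be written in Python as follows. The example home, the `predict` function name, and the range used for the random starting coefficients are all assumptions made for illustration. Many formulations also add a constant (intercept) term, which is left out here to match the description above.

```python
import random

# Hypothetical example home: number of floors, square footage, age in years.
features = [2, 1800, 15]

def predict(coefficients, features):
    """Multiply each attribute by its coefficient and add up the products."""
    return sum(c * x for c, x in zip(coefficients, features))

# Start from randomly chosen coefficients, one per attribute.
random.seed(0)  # fixed seed only so the sketch is repeatable
coefficients = [random.uniform(-1, 1) for _ in features]

predicted_price = predict(coefficients, features)
print(predicted_price)
```

With random coefficients the first prediction will be far from any real price. That is expected; the later steps are what improve it.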

Measuring

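The program then compares the predicted value to the actual value (the price of the home in this example). The difference is said to be the "Cost" or "Error Rate" for the chosen coefficients. In practice the differences are typically squared and averaged over all of the homes in the data set, so that one number summarizes how far off the coefficients are.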
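One common way to turn those differences into a single number is the mean squared error. The sketch below assumes that choice; the two homes, their prices, and the trial coefficients are made up for illustration.

```python
def predict(coefficients, features):
    """Weighted sum of the attributes."""
    return sum(c * x for c, x in zip(coefficients, features))

def cost(coefficients, homes, prices):
    """Mean of the squared differences between predicted and actual prices."""
    errors = [predict(coefficients, h) - p for h, p in zip(homes, prices)]
    return sum(e * e for e in errors) / len(errors)

# Hypothetical data: [floors, square feet, age in years] and selling price.
homes = [[1, 1000, 30], [2, 1800, 15]]
prices = [150000, 250000]

# Trial coefficients: price is guessed as 100 per square foot, nothing else.
print(cost([0, 100, 0], homes, prices))
```

Squaring keeps positive and negative misses from cancelling each other out and penalizes large misses more heavily than small ones.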

Adjusting

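The program then uses an algorithm (typically gradient descent) to take the error of its prediction into account and adjust the coefficient values slightly. The learning rate determines how aggressively adjustments are made. This is made possible by the magic of calculus: the derivative of the cost with respect to each coefficient tells the program which direction to nudge that coefficient.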
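A single adjustment could look like the sketch below, assuming gradient descent on a mean squared error cost. The function name `gradient_step` and the sample data are illustrative. The attributes are expressed in small units (square footage in thousands, age in decades, price in hundreds of thousands) because large raw values would require a much smaller learning rate.

```python
def predict(coefficients, features):
    """Weighted sum of the attributes."""
    return sum(c * x for c, x in zip(coefficients, features))

def gradient_step(coefficients, homes, prices, learning_rate):
    """Return coefficients nudged against the gradient of the mean squared error."""
    m = len(homes)
    errors = [predict(coefficients, h) - p for h, p in zip(homes, prices)]
    new_coefficients = []
    for j, c in enumerate(coefficients):
        # Partial derivative of the mean squared error with respect to coefficient j.
        gradient = (2 / m) * sum(e * h[j] for e, h in zip(errors, homes))
        new_coefficients.append(c - learning_rate * gradient)
    return new_coefficients

# Hypothetical, rescaled data: [floors, sq ft / 1000, age / 10], price / 100000.
homes = [[1, 1.0, 3.0], [2, 1.8, 1.5]]
prices = [1.5, 2.5]

coefficients = [0.0, 0.0, 0.0]
coefficients = gradient_step(coefficients, homes, prices, learning_rate=0.01)
print(coefficients)
```

A larger learning rate takes bigger steps and can overshoot; a smaller one is safer but needs more repetitions.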

Lather, rinse, repeat

The process is repeated until the change in the measured error from one iteration to the next is so small as to be insignificant. At this point the coefficient values have become your model for predicting new home prices, within a degree of error indicated by the "Cost" value.
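Putting the predict, measure, and adjust steps into a loop might look like the following sketch. It assumes gradient descent on a mean squared error cost. The function names, the tolerance, the iteration cap, and the data are all hypothetical; the prices are generated from made-up "true" coefficients so the result can be sanity-checked.

```python
def predict(coefficients, features):
    """Weighted sum of the attributes."""
    return sum(c * x for c, x in zip(coefficients, features))

def cost(coefficients, homes, prices):
    """Mean squared difference between predicted and actual prices."""
    errors = [predict(coefficients, h) - p for h, p in zip(homes, prices)]
    return sum(e * e for e in errors) / len(errors)

def gradient_step(coefficients, homes, prices, learning_rate):
    """One small adjustment of every coefficient against the cost gradient."""
    m = len(homes)
    errors = [predict(coefficients, h) - p for h, p in zip(homes, prices)]
    return [
        c - learning_rate * (2 / m) * sum(e * h[j] for e, h in zip(errors, homes))
        for j, c in enumerate(coefficients)
    ]

def train(homes, prices, learning_rate=0.01, tolerance=1e-9, max_iterations=100_000):
    """Repeat predict, measure, adjust until the cost stops changing noticeably."""
    # Zeros rather than random values, only so the sketch is repeatable.
    coefficients = [0.0] * len(homes[0])
    previous = cost(coefficients, homes, prices)
    current = previous
    for _ in range(max_iterations):
        coefficients = gradient_step(coefficients, homes, prices, learning_rate)
        current = cost(coefficients, homes, prices)
        if abs(previous - current) < tolerance:
            break
        previous = current
    return coefficients, current

# Hypothetical, rescaled data: [floors, sq ft / 1000, age / 10].
homes = [[1, 1.0, 3.0], [2, 1.8, 1.5], [1, 1.2, 2.0], [2, 2.4, 0.5]]
true_coefficients = [0.3, 1.0, -0.1]  # made up, used only to generate prices
prices = [predict(true_coefficients, h) for h in homes]

model, final_cost = train(homes, prices)
new_home = [2, 2.0, 1.0]
print(model, final_cost, predict(model, new_home))
```

The loop stops on whichever comes first: a cost change below the tolerance or the iteration cap. The cap is a safety net in case the learning rate is too large for the cost to settle.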

Abstracted to any data set, with formulas:
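One conventional way to write the three steps is sketched below. The symbols are assumptions chosen for illustration: n attributes x_j with coefficients \theta_j, m examples indexed by i, actual values y, predictions \hat{y}, and learning rate \alpha. Some courses scale the cost by 1/(2m) instead of 1/m, which only rescales the gradient by a constant factor.

```latex
% Predicting: weighted sum of the attributes
\hat{y} = \sum_{j=1}^{n} \theta_j x_j

% Measuring: mean squared error over m examples
J(\theta) = \frac{1}{m} \sum_{i=1}^{m} \left( \hat{y}^{(i)} - y^{(i)} \right)^2

% Adjusting: move each coefficient against the gradient of the cost
\theta_j := \theta_j - \alpha \frac{\partial J}{\partial \theta_j}
         = \theta_j - \alpha \, \frac{2}{m} \sum_{i=1}^{m} \left( \hat{y}^{(i)} - y^{(i)} \right) x_j^{(i)}
```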

Obviously there is a lot more detail than this short example illustrates, but it should serve as a good summary to help guide you through the details. For more in-depth explanations, as well as instruction on how to actually write a program that does this, I suggest the following:

