Polynomial Models Explained (for the non-data scientist)

David Longstreet

Published Aug 3, 2015

Whenever you talk to a data scientist there will be a point in the conversation when the data scientist will probably say, "polynomial to the nth degree."

The definitions you hear are similar to the definition Data from Star Trek: The Next Generation would provide. Imagine Data saying quickly and without emotion, "The degree of a polynomial is the highest degree of its terms when the polynomial is expressed in its canonical form consisting of a linear combination of monomials." I did not make up this definition. This is the real definition provided on Wikipedia.

Purpose
The purpose of a "polynomial" is to solve problems. Developing a model using a polynomial (some equation) is an attempt to draw a line through actual observations. It is akin to connecting the dots to see if some pattern pops out. Today's programming languages allow complex polynomials to be developed using only a few lines of code. Just because complex polynomials to the nth degree can be derived does not mean they are effective and useful.

Middle School Math
Most everyone has forgotten everything they ever learned about polynomials in middle school math. Yes, you studied polynomials (and some data science) in the 8th grade. You may even have thought or said, "I will never use this in real life."

1st degree polynomial is just a straight line also known as a linear equation. It is called linear because it is a straight line. The rate of change is the slope of the line and is constant.

2nd degree polynomial is a parabola. It only has one peak or valley. You used to factor these bad boys in middle school. The rate of change varies depending on x. The rate of change or the slope is not constant along a parabola and it can be negative or positive.

3rd degree polynomial can have up to two turning points: one peak and one valley. The 3rd degree is also a euphemism for inflicting physical or mental pain. The rate of change is not constant. The slope can switch between positive and negative up to two times.
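For readers who like to see the numbers, here is a minimal sketch of the "rate of change" idea for the first three degrees. The three example polynomials and the helper names (`evaluate`, `slope_of`) are my own choices for illustration, not anything from a particular library.

```python
def evaluate(coeffs, x):
    """Evaluate a polynomial at x. Coefficients run from the highest
    power down to the constant term (Horner's rule)."""
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result

def slope_of(coeffs):
    """Return the coefficients of the derivative, i.e. the slope."""
    degree = len(coeffs) - 1
    return [c * (degree - i) for i, c in enumerate(coeffs[:-1])]

# Illustrative polynomials, chosen only because the numbers come out clean.
examples = {
    "1st degree: 2x + 1": [2.0, 1.0],
    "2nd degree: x^2 - 4": [1.0, 0.0, -4.0],
    "3rd degree: x^3 - 3x": [1.0, 0.0, -3.0, 0.0],
}

sample_points = [-2, -1, 0, 1, 2]
slopes = {
    name: [evaluate(slope_of(coeffs), x) for x in sample_points]
    for name, coeffs in examples.items()
}

for name, values in slopes.items():
    print(name, values)
# 1st degree: the slope is 2 everywhere (constant).
# 2nd degree: -4, -2, 0, 2, 4 (changes sign once, at the valley).
# 3rd degree: 9, 0, -3, 0, 9 (changes sign twice, at the peak and the valley).
```

The straight line has the same slope at every point, the parabola's slope crosses zero once, and the cubic's slope crosses zero twice.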

Borg Queen, "You think in such three-dimensional terms. How small you've become."

Tweaking the Math or are more degrees better?
Why stop at three? Why not a fourth or fifth or sixth degree polynomial? Isn't a 10th degree polynomial more impressive than a 3rd degree polynomial or a pitiful 1st degree polynomial (a straight line)?

Degrees in a polynomial are like giving directions to a person. The more degrees, the more turns. The more turns, the more likely a person is going to get lost. The same is true with a polynomial. With more degrees, the slope can switch between positive and negative more often, so the graph has more ups and downs. The graph can take a sudden upward spike and then come back to normal.
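The "more degrees, more turns" idea can be checked with a rough sketch. A polynomial of degree n can turn at most n - 1 times. The code below builds polynomials that actually hit that maximum (by giving them evenly spaced roots, which is my own choice for illustration) and counts the turns by walking along the curve. The function names are hypothetical.

```python
def poly_from_roots(roots, x):
    """Value at x of the polynomial (x - r1)(x - r2)...; its degree
    equals the number of roots."""
    value = 1.0
    for r in roots:
        value *= (x - r)
    return value

def count_turns(roots, lo, hi, steps=2000):
    """Count how many times the curve switches between going up and
    going down on [lo, hi], using a fine grid of sample points."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    ys = [poly_from_roots(roots, x) for x in xs]
    turns = 0
    previous_direction = 0
    for a, b in zip(ys, ys[1:]):
        direction = (b > a) - (b < a)  # +1 going up, -1 going down, 0 flat
        if direction != 0:
            if previous_direction != 0 and direction != previous_direction:
                turns += 1
            previous_direction = direction
    return turns

turns_by_degree = {}
for degree in (1, 2, 3, 5, 10):
    roots = list(range(degree))  # roots at 0, 1, ..., degree - 1
    turns_by_degree[degree] = count_turns(roots, -0.5, degree - 0.5)
    print(degree, "degrees ->", turns_by_degree[degree], "turns")
```

Each extra degree buys one more possible turn, which is exactly what lets a high degree polynomial chase individual data points.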

Nth degree (or playing with matches)
By adding additional degrees to a polynomial you can get silly looking lines that curve and bend to fit each individual data point (see graphs below). The goal of hitting each data point is to minimize error and perhaps even eliminate error. This is also known as overfitting a line.

In the following two graphs, the blue dots are actual observations. In the first graph the green line is a simple 2nd degree polynomial and the red dashed line is an impressive 10th degree polynomial. Both of these polynomials were "fit" to the observations. The errors of the green line are the distances between the blue dots and the green line. The error for the 10th degree polynomial is zero because it is forced through all the data points. There is no distance between the red dashed line and the blue dots. It should be evident the 10th degree polynomial is erratic and is not valid beyond the data points. It takes a nose dive after the last data point.

In the second graph the green line is a pitiful one degree polynomial (a simple minded straight line) and the red dashed line is a 10th degree polynomial. Again, the error is the distances between the blue dots and the green line. It could be argued the straight line is a better model than the curvy red dashed line. This is true even though the amount of error for the straight line is greater than the curvy dashed line. The straight line appears to be a better predictor of the future than the 10th degree polynomial (red dashed line).

In both of these cases the lower degree polynomial is going to be a better predictor than the 10th degree polynomial. In the notes section below, there is an example of a 30th degree polynomial compared to a straight line.
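The comparison above can be reproduced in a small sketch. The data here are made up (a straight-line trend with a little up-and-down noise) and are not the observations from the graphs. With 11 points, a polynomial of degree 10 can always be forced through every point, so I use plain interpolation to stand in for the "10th degree fit" and an ordinary least-squares line for the simple model. The function names are hypothetical.

```python
# Made-up observations for illustration only: the trend y = x with
# alternating noise of +1 and -1. These are not the article's data.
xs = [float(i) for i in range(11)]
ys = [x + (1.0 if i % 2 == 0 else -1.0) for i, x in enumerate(xs)]

def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def interpolate(xs, ys, x):
    """Value at x of the one polynomial of degree <= len(xs) - 1 that
    passes through every point (Lagrange form)."""
    total = 0.0
    for j, (xj, yj) in enumerate(zip(xs, ys)):
        weight = 1.0
        for m, xm in enumerate(xs):
            if m != j:
                weight *= (x - xm) / (xj - xm)
        total += yj * weight
    return total

def sum_squared_error(predict):
    """Total squared distance between the observations and a model."""
    return sum((y - predict(x)) ** 2 for x, y in zip(xs, ys))

intercept, slope = fit_line(xs, ys)

def line(x):
    return intercept + slope * x

def wiggly(x):
    return interpolate(xs, ys, x)

line_error = sum_squared_error(line)
wiggly_error = sum_squared_error(wiggly)

print("error of straight line:     ", round(line_error, 2))   # about 10.91
print("error of 10th degree curve: ", round(wiggly_error, 2)) # 0.0
print("straight line at x = 11:    ", round(line(11.0), 2))   # about 11.09
print("10th degree curve at x = 11:", round(wiggly(11.0), 2)) # about 2058
```

On this toy data the 10th degree curve wins on error (zero versus about 11) and loses badly on prediction: one step past the last observation it predicts roughly 2058, while the straight line predicts about 11.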

Weaknesses
One of the biggest weaknesses is that the models break down outside of the data range under investigation. This is true of any polynomial regardless of the number of degrees. Keep in mind, the goal is not to eliminate all error, but to build models for understanding and predicting.

I created this beautiful model (see graph below) a few weeks ago of home runs. The point of the model is to help understand steroid usage in Major League Baseball (MLB). A problem with my model and any polynomial to the nth degree is it does not work well beyond the data in question. In the graph below, I have extended the model beyond the year 2015. The headline of my analysis could be "2067 the year no home runs are hit in Major League Baseball." We all know this is nonsense. The model is useful in helping understand steroid usage in baseball for the period of time where there is data.
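To see how a headline like that falls out of a polynomial, here is a toy sketch. The coefficients are invented for illustration and are not the home run model described above. A downward-opening parabola always crosses zero eventually if you extend it far enough.

```python
# Hypothetical coefficients, invented for illustration only.
# This is NOT the author's home run model.
PEAK_YEAR = 2000      # assumed year where the curve tops out
PEAK_VALUE = 5000.0   # assumed modeled home runs at the peak
CURVATURE = 2.0       # assumed rate at which the curve falls away

def modeled_home_runs(year):
    """A 2nd degree polynomial: a parabola that opens downward."""
    return PEAK_VALUE - CURVATURE * (year - PEAK_YEAR) ** 2

def first_year_with_no_home_runs(start_year):
    """Walk forward one year at a time until the model hits zero."""
    year = start_year
    while modeled_home_runs(year) > 0:
        year += 1
    return year

print(modeled_home_runs(2015))             # 4550.0, plausible inside the data range
print(first_year_with_no_home_runs(2015))  # 2050, the nonsense headline
print(modeled_home_runs(2060))             # -2200.0, negative home runs
```

Inside the range where there is data the curve can be a reasonable description. Extended far enough, the same equation predicts zero and then negative home runs.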

Too much of a good thing
To paraphrase Ockham's razor, the simplest models tend to be the best models. Models should be built with the fewest degrees possible. The more degrees, the more complex and the more difficult it is going to be for business units to understand and use any model. Business units, sales teams, or advertisers don't care about the model. What they care about is how the model can help them improve the business or solve a problem.

Models should be built to solve problems.

Just because complex polynomials to the nth degree can be built does not mean they are useful. There is a tendency to gravitate toward complexity and falsely conclude that the more complex a model, the better the model. Nothing could be further from the truth. To paraphrase Goldilocks and the Three Bears, some models are too complex, some models are too simple, and some models are just right.

Notes:

There is a Star Trek: The Next Generation episode called "The Nth Degree."

I thought about using this Star Trek quote instead of the Borg Queen quote. Spock about Khan, "He is intelligent, but not experienced. His pattern indicates two-dimensional thinking."

Below is a 30th degree polynomial with zero error (red dashed line). On the right side of the graph, the 30th degree polynomial becomes very erratic: it jumps up, then takes a nose dive. Obviously the 30th degree polynomial is not going to be a very good predictor even though it has zero error.

Kurt Cagle 10y

Nice piece, and as true when dealing with partial differential equations. Singularities don't necessarily mean that your data has a black hole in it, only that piecewise linear models aren't necessarily good predictors in that regime.

Chris Kennedy 10y

Polynomials should generally be avoided in my opinion. Better to go with a GAM if you want a solid linear model: http://multithreaded.stitchfix.com/blog/2015/07/30/gam/

C.S. Ganti 10y

Thank you so much. After all is it not the degree of Polynomial but the fit itself is of much lesser degree that gives you a reasonable fit -- and we should be all happy with. While Cliche of late Prof. G.E.P. Box, of Box-Jenkins" All Models are basically wrong -- some are just more useful " Piece-wise linear approx "is it not what happens," when you are choosing higher degree -- locally with in those values of X you would fit it best . but not to the next one linear segment. Welcome corrections.

Douglas Hess, Ph.D. 10y

Very helpful. Even if the use of ST:TNG quotes dates you. :)

Sean O'Toole 10y

David, back when I was writing risk management and pricing models for commodity options, I found that our best performing models incorporated linear volatility skews. The constant curve fitting inherent with nth degree polynomial models, while attractive to risk managers, was totally inappropriate for trading.
