Indicators of Multicollinearity

Vijay Patha

Published Dec 30, 2016

As always, the immediate audience for this article is myself. I don't think there is any shortage of information on this topic. However, I believe reinventing the wheel is still a good learning experience and improves understanding for the time invested.

Goal: Effects and Identification of Multicollinearity.

Strong correlation among independent variables is called multicollinearity.

Strong correlations among the predictor variables can cause problems in multiple regression analysis because they make it difficult to identify the unique relationship between each predictor variable and the dependent variable.

For example -

House prices in a resort area, when each variable is considered on its own, have negative coefficients (-0.53 and -0.63) with the variables Miles to Resort and Miles to Base. However, these two variables have a strong correlation of 0.948 with each other, which makes it difficult to see each independent variable's relative importance in explaining the variance in the dependent variable.

The regression models for Miles to Resort and Miles to Base, each fitted without the other, are both highly significant with p < 0.001.

However, a combined regression (combining the influence of both variables) does not give us an R-Square of 0.60; rather, it gives us an R-Square of 0.43.
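As a rough check on the combined regression, the two-predictor case can be worked out from the three correlations alone. The sketch below assumes that the -0.53 and -0.63 quoted above are the pairwise correlations of price with Miles to Resort and Miles to Base; the article does not confirm this, and the function name is made up for illustration.

```python
def two_predictor_fit(r_y1, r_y2, r_12):
    """Standardized coefficients and R-Square for y regressed on x1 and x2.

    Uses the textbook closed form for exactly two predictors, written in
    terms of the pairwise correlations only.
    """
    tolerance = 1.0 - r_12 ** 2            # share of x1 not explained by x2
    beta_1 = (r_y1 - r_y2 * r_12) / tolerance
    beta_2 = (r_y2 - r_y1 * r_12) / tolerance
    r_square = beta_1 * r_y1 + beta_2 * r_y2
    return beta_1, beta_2, r_square


# ASSUMPTION: -0.53 and -0.63 are correlations with price (not confirmed
# by the article); 0.948 is the correlation between the two predictors.
beta_resort, beta_base, r_square = two_predictor_fit(-0.53, -0.63, 0.948)

print("standardized coefficient, Miles to Resort:", round(beta_resort, 2))
print("standardized coefficient, Miles to Base:  ", round(beta_base, 2))
print("combined R-Square:                        ", round(r_square, 2))
```

Under that assumption, the standardized coefficient for Miles to Resort comes out positive and the combined R-Square comes out near 0.44, close to the reported 0.43 given that the inputs are rounded to two or three decimals.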

In the combined regression, the coefficient of Miles to Resort becomes positive. How can there be a positive slope in this case, indicating that price increases with distance from the resort? In addition, this variable becomes statistically insignificant. Its VIF (variance inflation factor) is 9.875; as a rule of thumb, a VIF greater than 5 often indicates a collinearity problem.
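For a model with only two predictors, the VIF depends only on the correlation between them: VIF = 1 / (1 - r^2). A minimal sketch, using the 0.948 correlation between Miles to Resort and Miles to Base quoted earlier; the function name and the threshold constant are illustrative, not taken from JMP.

```python
def vif_two_predictors(r_12):
    """VIF of either predictor in a two-predictor model: 1 / (1 - r^2)."""
    return 1.0 / (1.0 - r_12 ** 2)


VIF_RULE_OF_THUMB = 5.0  # the rule-of-thumb cutoff mentioned in the text

vif = vif_two_predictors(0.948)
print("VIF:", round(vif, 3))
print("exceeds rule of thumb:", vif > VIF_RULE_OF_THUMB)
```

This gives a VIF of about 9.87, in line with the 9.875 reported; the small gap is consistent with the correlation being rounded to three decimals.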

Hence, when creating a multiple regression model, identify multicollinearity first; eliminating one of the correlated variables allows the other to remain statistically significant.

Data Source and Analytic Tool: JMP

Femi John Elegbe, CFA 9y

I'm guessing Miles to Base have more explanatory power 🤔🤔🤔
