Indicators of Multicollinearity
Lean StartUp

As always, the immediate audience for this article is myself. I don't think there is any shortage of information on this topic. However, I believe reinventing the wheel is still a good learning experience and improves understanding for the time invested.

Goal: Effects and Identification of Multicollinearity.

Strong correlation among independent variables is called multicollinearity.

Strong correlation among the predictor variables can cause problems in multiple regression analysis because it makes it difficult to identify the unique relationship between each predictor variable and the dependent variable.

For example:

House prices in a resort area, when each variable is considered without regard to the other, understandably have negative coefficients (-0.53 and -0.63) with the variables Miles to Resort and Miles to Base. However, these two variables have a strong correlation of 0.948 with each other. This makes it difficult to see each independent variable's relative importance in explaining the variance in the dependent variable.



The regression models for Miles to Resort and Miles to Base, each fitted without regard to the other, are highly significant with p < 0.001.

However, a combined regression (combining the influence of both variables) does not give us an R-Square of 0.60; instead it gives an R-Square of 0.43.

In the combined regression, the coefficient of Miles to Resort becomes positive. How can there be a positive slope in this case, indicating an increase in price with increased distance from the resort? In addition, this variable becomes statistically insignificant. Its VIF (variance inflation factor) is 9.875; as a rule of thumb, a VIF greater than 5 often indicates a collinearity problem.
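The VIF figure can be sanity-checked by hand. For a model with exactly two predictors, the VIF reduces to 1 / (1 - r^2), where r is the correlation between them, so r = 0.948 gives roughly 9.87, in line with the 9.875 reported by JMP (the small gap is presumably rounding in r). The sketch below, in Python with NumPy rather than JMP, shows both that shortcut and the general definition (regress each predictor on the others and use that R-Square). The data is synthetic and the variable names are made up for illustration; it is not the article's dataset.

```python
import numpy as np

def vif(X):
    """VIF for each column of X (n_samples x n_features).

    Each predictor is regressed on the remaining predictors (plus an
    intercept); VIF_j = 1 / (1 - R_j^2).
    """
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = []
    for j in range(k):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])  # intercept + other predictors
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

def vif_two_predictors(r):
    """Shortcut that holds only when the model has exactly two predictors."""
    return 1.0 / (1.0 - r ** 2)

# Check against the correlation quoted in the article.
print(round(vif_two_predictors(0.948), 2))  # 9.87

# Synthetic stand-in data (assumption: NOT the article's JMP data).
rng = np.random.default_rng(0)
miles_to_base = rng.uniform(0, 30, 200)
miles_to_resort = miles_to_base + rng.normal(0, 3, 200)  # nearly the same distance
X = np.column_stack([miles_to_resort, miles_to_base])

r = np.corrcoef(miles_to_resort, miles_to_base)[0, 1]
vifs = vif(X)
print(r, vifs)
```

With only two predictors both VIFs are identical, which is why the general function and the shortcut agree here; with three or more predictors only the general function applies.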

Hence, when creating a multiple regression model, identify multicollinearity first; eliminating one of the correlated variables allows the other to remain statistically significant.
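To see why dropping one of the correlated variables helps, here is a rough sketch in Python with NumPy (not the JMP analysis itself). It fits an ordinary least squares model with and without the redundant predictor and compares the standard error of the remaining coefficient. Everything in it is an assumption made for illustration: the data is simulated, the price is generated from the base distance only, and `ols_with_se` is a hypothetical helper, not a library function.

```python
import numpy as np

def ols_with_se(X, y):
    """Least-squares fit of y on X with an intercept.

    Returns (coefficients, standard errors); index 0 is the intercept.
    """
    X = np.asarray(X, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    dof = len(y) - A.shape[1]                  # residual degrees of freedom
    sigma2 = (resid @ resid) / dof             # residual variance estimate
    cov = sigma2 * np.linalg.inv(A.T @ A)      # coefficient covariance matrix
    return coef, np.sqrt(np.diag(cov))

# Simulated data (assumption: not the article's dataset).
rng = np.random.default_rng(42)
n = 200
base = rng.uniform(0, 30, n)                   # hypothetical "Miles to Base"
resort = base + rng.normal(0, 3, n)            # hypothetical "Miles to Resort"
price = 500 - 8 * base + rng.normal(0, 40, n)  # price driven by base only

coef_both, se_both = ols_with_se(np.column_stack([base, resort]), price)
coef_base, se_base_only = ols_with_se(base, price)

print("both predictors: coef", coef_both[1:], "se", se_both[1:])
print("base only:       coef", coef_base[1:], "se", se_base_only[1:])
```

In the combined fit the standard error on the base coefficient is inflated by roughly the square root of the VIF, which is what makes individual coefficients unstable, able to flip sign, and prone to losing significance.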

Data Source and Analytic Tool: JMP

I'm guessing Miles to Base have more explanatory power 🤔🤔🤔
