Multiple Linear Regression
morioh.com

Hello everyone. Today we look at a machine learning algorithm and, as the heading suggests, that algorithm is multiple linear regression. Linear regression models the relationship between a dependent variable and one or more independent variables.

  1. It is a supervised learning method and an extension of simple linear regression.
  2. Two or more independent variables (x1, x2, …, xn) are used to predict or explain the variance in Y, the dependent variable.
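
In symbols, the two points above can be written as one equation. The names b_0 (intercept) and b_1, …, b_n (one coefficient per independent variable) are chosen here for illustration and do not come from the original post:

```latex
Y = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n
```

Simple linear regression is the special case n = 1.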

We work with a dataset named hiring.csv. This file contains hiring statistics for a firm: each candidate's experience, written test score, and personal interview score. From these 3 factors we build a model that predicts the salary of a candidate. We then use the model to predict salaries for the following candidates:

  • Candidate 1: 2 yr experience, 9 test score, 6 interview score
  • Candidate 2: 12 yr experience, 10 test score, 10 interview score
  • Candidate 3: 15 yr experience, 9 test score, 9 interview score

First we import the libraries we need: numpy, pandas, and sklearn.

  • Numpy - used to perform mathematical operations on arrays, such as trigonometric, statistical, and algebraic routines.
  • Pandas - used for data analysis, with highly optimized performance.
  • Sklearn - provides the LinearRegression model and built-in functions such as train_test_split, which divides a data set into two parts: one for training and one for testing.

import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
import numpy as np
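
As a quick illustration of what train_test_split does, here is a minimal sketch on made-up toy arrays (not the hiring data). The 80/20 split and the random_state value are assumptions picked for the example:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Made-up toy data: 10 rows, 2 feature columns, one target value per row.
X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Hold out 20% of the rows for testing; random_state makes the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

print(X_train.shape, X_test.shape)
```

With 10 rows and test_size=0.2, 8 rows go to training and 2 to testing.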

  • We read the CSV file with the function pd.read_csv().
  • We print the data type of each column in the data frame.
  • We print the keys (column labels) of the data frame.
  • Then we print df itself.
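
The steps above might look like the sketch below. Because hiring.csv itself is not reproduced here, the code reads a small made-up stand-in from a string. The rows are illustrative, and the column names "interview_score(out of 10)" and "salary($)" are assumptions (only "experience" and "test_score(out of 10)" are named in this article):

```python
import io
import pandas as pd

# Stand-in for hiring.csv: made-up rows, NOT the real file contents.
# The last two column names are assumptions for illustration.
csv_text = """experience,test_score(out of 10),interview_score(out of 10),salary($)
,7,8,41000
four,5,6,52000
one,9,9,47000
six,,7,61000
nine,8,8,75000
"""

# With the real file this would be: df = pd.read_csv("hiring.csv")
df = pd.read_csv(io.StringIO(csv_text))

print(df.dtypes)  # data type of each column
print(df.keys())  # column labels
print(df)         # the whole data frame
```

Empty fields in the CSV are read as NaN, which is why the later steps need to fill them in.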
  • Here we define a mapping from number words to numeric values and store it in cleanup_nums.
  • Then we pass it to the function df.replace() and print df.
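
A minimal sketch of this conversion, assuming the words appear in the "experience" column and using made-up sample values. The contents of cleanup_nums below are illustrative, and the pd.to_numeric call is an extra step added here to guarantee a numeric dtype:

```python
import numpy as np
import pandas as pd

# Made-up sample; "experience" holds number words (assumed for illustration).
df = pd.DataFrame({
    "experience": [np.nan, "four", "one", "six", "nine"],
    "test_score(out of 10)": [7.0, 5.0, 9.0, np.nan, 8.0],
})

# Nested dict: column name -> {old value: new value}.
cleanup_nums = {
    "experience": {"one": 1, "four": 4, "six": 6, "nine": 9}
}
df = df.replace(cleanup_nums)

# Make sure the column ends up with a numeric dtype (NaN stays NaN).
df["experience"] = pd.to_numeric(df["experience"])

print(df)
```

The nested dictionary restricts the replacement to one column, so the same word in another column would be left alone.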
  • Now we replace the null (NaN) values in df.
  • In the "experience" column we fill NaN with 0.
  • In the "test_score(out of 10)" column we fill NaN with the mean of that column.
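
The two fills above can be sketched with fillna(). The sample values are made up for illustration:

```python
import numpy as np
import pandas as pd

# Made-up sample with one missing value in each column.
df = pd.DataFrame({
    "experience": [np.nan, 4, 1, 6, 9],
    "test_score(out of 10)": [7.0, 5.0, 9.0, np.nan, 8.0],
})

# Missing experience is treated as zero years.
df["experience"] = df["experience"].fillna(0)

# Missing test scores get the mean of the scores that are present.
score_col = "test_score(out of 10)"
df[score_col] = df[score_col].fillna(df[score_col].mean())

print(df)
```

Here the mean of 7, 5, 9, and 8 is 7.25, so that is the value written into the missing test score.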
  • We create a LinearRegression object and store it in a variable named model.
  • We fit it by passing the arguments x and y to model.fit().
  • We compute y_pred with the formula y = m1*x1 + m2*x2 + m3*x3 + c, the three-variable form of y = mx + c, where m1, m2, m3 are the fitted coefficients and c is the intercept.
  • We also compute it with the built-in function model.predict().
  • To find the salaries of Candidates 1, 2, and 3, we define a function named hiring that takes experience, test_score, and interview_score.
  • We pass these arguments into the y_pred calculation and print the result.
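
The fitting and prediction steps above can be sketched as follows. The training rows are made up and already cleaned, the shortened column names are assumptions, and this hiring function is one possible reading of the description rather than the author's exact code:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up, already-cleaned rows standing in for hiring.csv.
df = pd.DataFrame({
    "experience": [0, 4, 1, 6, 9, 12],
    "test_score": [7, 5, 9, 8, 8, 6],
    "interview_score": [8, 6, 9, 7, 8, 9],
    "salary": [41000, 52000, 47000, 61000, 75000, 83000],
})

feature_cols = ["experience", "test_score", "interview_score"]
x = df[feature_cols]
y = df["salary"]

model = LinearRegression()
model.fit(x, y)

def hiring(experience, test_score, interview_score):
    """Predict a salary by the explicit formula and by model.predict."""
    features = np.array([experience, test_score, interview_score])
    # y = m1*x1 + m2*x2 + m3*x3 + c
    by_formula = float(np.dot(model.coef_, features) + model.intercept_)
    # Same computation via scikit-learn; a DataFrame keeps the feature names.
    row = pd.DataFrame([features], columns=feature_cols)
    by_predict = float(model.predict(row)[0])
    return by_formula, by_predict

by_formula, by_predict = hiring(2, 9, 6)
print(by_formula, by_predict)
```

Both numbers agree up to floating-point rounding, because model.predict applies the same coefficients (model.coef_) and intercept (model.intercept_). The printed salary reflects the made-up rows, not the real data set.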

Now we pass in the values for Candidate 1: 2 yr experience, 9 test score, 6 interview score

Candidate 2: 12 yr experience, 10 test score, 10 interview score

Candidate 3: 15 yr experience, 9 test score, 9 interview score
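
As an alternative to calling the function once per candidate, all three candidates can be scored in a single model.predict call. This is a sketch on made-up training rows with assumed column names, so the printed salaries are illustrative and are not the results from the original post:

```python
import pandas as pd
from sklearn.linear_model import LinearRegression

feature_cols = ["experience", "test_score", "interview_score"]

# Made-up, already-cleaned training rows standing in for hiring.csv.
train = pd.DataFrame({
    "experience": [0, 4, 1, 6, 9, 12],
    "test_score": [7, 5, 9, 8, 8, 6],
    "interview_score": [8, 6, 9, 7, 8, 9],
    "salary": [41000, 52000, 47000, 61000, 75000, 83000],
})
model = LinearRegression().fit(train[feature_cols], train["salary"])

# The three candidates from the article, one row each.
candidates = pd.DataFrame(
    [[2, 9, 6], [12, 10, 10], [15, 9, 9]],
    columns=feature_cols,
    index=["Candidate 1", "Candidate 2", "Candidate 3"],
)
candidates["predicted_salary"] = model.predict(candidates[feature_cols])

print(candidates)
```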

