Day 8: MLOps Training
1) Feature Selection in MLOps
We have studied the Filter Method of feature selection, which uses Correlation and Variance Threshold to detect the relevant features. Now comes the EMBEDDED METHOD.
Embedded Method: If we use the coefficient concept for feature selection, then it is part of the Embedded Method. Coefficients are the constants that decide the inclination (slope) of the prediction line in Linear Regression ("c" in y = b + cx). The Embedded Method is needed because Correlation is inappropriate in many cases.
This method provides higher accuracy relative to corr(). However, it is slower than corr() because the coefficients can be obtained only after training a model on the dataset.
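To make the point concrete, here is a minimal sketch (with made-up synthetic data, not from the course) showing that the coefficients only exist after fitting, and that their magnitudes indicate which features matter:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (illustrative only): y depends strongly on the first
# feature, barely on the second, and not at all on the third.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = 5.0 * X[:, 0] + 0.01 * X[:, 1] + rng.normal(scale=0.1, size=200)

model = LinearRegression().fit(X, y)  # coefficients exist only after fitting
print(model.coef_)  # large magnitude -> influential feature
```

A feature whose coefficient is close to zero (here the third one) contributes little to the prediction and is a candidate for removal.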
Rather than training the dataset on a full algorithm just to get the coefficients, we can use a relatively faster embedded method called the Lasso method / L1 Regularization method.
The Lasso method trains the dataset over a feature-coefficient prediction algorithm.
>>> from sklearn.linear_model import Lasso                 # importing the Lasso model
>>> from sklearn.feature_selection import SelectFromModel  # helps to select features from the Lasso model
>>> select = SelectFromModel(Lasso())                      # creating a model for feature selection
>>> select.fit(x_train, y_train)
>>> select.get_support()                                   # gives a boolean array saying which feature is a predictor and which is not
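The steps above can be run end to end. Here is a self-contained sketch (the synthetic data and the alpha value are assumptions for illustration; the snippet above assumes a pre-split x_train/y_train instead):

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.feature_selection import SelectFromModel

# Synthetic data (illustrative only): only the first two of five features
# actually drive the target.
rng = np.random.default_rng(42)
X = rng.normal(size=(300, 5))
y = 4.0 * X[:, 0] - 3.0 * X[:, 1] + rng.normal(scale=0.1, size=300)

select = SelectFromModel(Lasso(alpha=0.1))
select.fit(X, y)

mask = select.get_support()       # boolean array, one entry per feature
X_selected = select.transform(X)  # keeps only the selected columns
print(mask, X_selected.shape)
```

The L1 penalty shrinks the coefficients of the irrelevant features to exactly zero, so `get_support()` flags only the genuinely predictive columns, and `transform()` drops the rest.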
2) Feature Engineering
Feature Engineering is a subset of Data Science, just like Machine Learning, but it is not a part of Machine Learning. It is performed before creating ML models, i.e., before applying ML to the data. In other words, Feature Engineering is pre-processing performed on the data, transforming it into a form that makes our ML model more effective and gives us better insights into the data. One Feature Engineering technique is Encoding.
Encoding is the transformation of string values into integers within a particular feature/variable. It is required because if a feature is to be used for prediction in Machine Learning, it should contain integer (numeric) values. This is also known as Variable Encoding/Label Encoding. One such technique is One-Hot Encoding.
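As a quick sketch of plain label encoding (the city names are made-up example values), scikit-learn's LabelEncoder maps each distinct string to an integer code:

```python
from sklearn.preprocessing import LabelEncoder

cities = ["Delhi", "Mumbai", "Delhi", "Chennai"]  # hypothetical string feature
le = LabelEncoder()
codes = le.fit_transform(cities)  # each distinct string gets an integer code

print(list(le.classes_))  # the categories, sorted alphabetically
print(list(codes))        # the integer code assigned to each row
```

Note that LabelEncoder assigns codes in alphabetical order of the categories, which can impose an artificial ordering on the values; this is one reason One-Hot Encoding is often preferred for categorical features.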
One-Hot Encoding: When we have categorical variables in our data (like gender, semester number, etc.), we use One-Hot Encoding. It is the process of converting the categorical variable's values into separate variables. (For example: if the Gender variable has Male and Female values, one-hot encoding turns Male and Female into separate variables and the Gender column is then removed.)
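The Gender example above can be sketched with pandas (the tiny DataFrame is made up for illustration):

```python
import pandas as pd

df = pd.DataFrame({"Gender": ["Male", "Female", "Female", "Male"]})

# One column per category replaces the original Gender column
encoded = pd.get_dummies(df["Gender"])
print(encoded)
```

Each row now has a 1/True in exactly one of the `Female`/`Male` columns, which is what makes the two columns perfectly dependent on each other, the issue discussed next.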
One-Hot Encoding presents one issue in its process, called the Dummy Variable Trap.
Dummy Variable Trap: An issue arises in one-hot encoding where, in the case of 2 variables, the x1 variable becomes dependent on the x2 variable and x2 in turn depends on x1. This is known as Multi-Collinearity, where x1 and x2 are duplicate variables. Because of this, during computation our model gets confused or spends high computation power in this cycle of correlation, and different results can be received every time.
To remove the issue of the Dummy Variable Trap, we must remove the multi-collinearity, and for that we need to drop one of the redundant or duplicate variables.
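Dropping one of the redundant columns can be done directly during encoding. A minimal sketch (same made-up Gender data as above) using pandas' `drop_first` option:

```python
import pandas as pd

df = pd.DataFrame({"Gender": ["Male", "Female", "Female", "Male"]})

# drop_first=True drops one of the dummy columns, breaking the perfect
# dependence between them (i.e., avoiding the Dummy Variable Trap)
encoded = pd.get_dummies(df["Gender"], drop_first=True)
print(encoded.columns.tolist())  # only one column remains
```

With one column dropped, no information is lost: a row that is not Male is necessarily Female, so a single column is enough and the multi-collinearity disappears.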