Support Vector Regression

Support vector regression (SVR) is a type of support vector machine that supports both linear and non-linear regression. Graphically, the goal is to fit as many instances as possible between the margin lines while limiting margin violations. The margin of tolerance in this setting is denoted by ε (epsilon).


Support vector regression (SVR) is a supervised learning method that models the relationship between input features and a continuous target variable.

In regression problems, we generally try to find a line that best fits the data provided. In its simplest form, the equation of that line is y = mx + c.

In the case of regression using a support vector machine, we do something similar but with a slight change. Here we define a small error tolerance e, where error = prediction − actual.

The value of e determines the width of the error tube (also called the ε-insensitive tube) and hence the number of support vectors; a smaller value indicates a lower tolerance for error.

Thus, we try to find the line’s best fit in such a way that:

(mx+c)-y ≤ e and y-(mx+c) ≤ e

Also, we do not care about errors as long as they are less than e. So in this case, only the data points that fall outside the e-wide error region contribute to the final cost calculation.

For example, in stock trading we may want to minimize trading losses, but not care about any loss smaller than a certain value (e).

Hence, the support vector regression model depends only on a subset of the training data points, as the cost function of the model ignores any training data close to the model prediction when the error is less than e.
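The ε-insensitive cost described above can be sketched in a few lines of Python (the epsilon value and data here are purely illustrative):

```python
import numpy as np

def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Errors smaller than epsilon cost nothing; larger errors
    are penalized only by the amount they exceed epsilon."""
    residual = np.abs(y_true - y_pred)
    return np.maximum(0.0, residual - epsilon)

# Points inside the tube contribute zero to the cost;
# only the middle point (error 0.5) exceeds the tube.
y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.05, 2.5, 3.0])
print(epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1))
```

This is exactly why the model depends only on a subset of the training points: samples whose loss is zero drop out of the cost entirely.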

Below are the cases where a support vector regression is advantageous over other regression algorithms:

  1. SVR is memory efficient, meaning it takes relatively few computational resources to train the model. This is because representing the solution by means of a small subset of training points (the support vectors) gives enormous computational advantages.
  2. There are non-linear or complex relationships between features and labels. Support vector regression handles these because kernels let us map a non-linear relationship into a higher-dimensional space where it becomes linear.
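As an illustration of both points, here is a minimal sketch using scikit-learn's SVR with an RBF kernel (the data and hyperparameter values are chosen purely for demonstration):

```python
import numpy as np
from sklearn.svm import SVR

# Non-linear target: y = sin(x) with a little noise
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, 80)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.1, 80)

# The RBF kernel captures the non-linear relationship;
# epsilon sets the width of the error tube
model = SVR(kernel="rbf", C=10.0, epsilon=0.1)
model.fit(X, y)

# Only a subset of the training points become support vectors
print(len(model.support_), "support vectors out of", len(X))
```

Increasing `epsilon` widens the tube, so fewer points fall outside it and the model keeps fewer support vectors.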

How to Build a Support Vector Regression Model:

  • Collect a training set T = {X, Y}.
  • Choose a kernel, its parameters, and regularization if needed (for example, a Gaussian kernel with noise regularization).
  • Form the correlation (kernel) matrix from the training inputs:

Train your machine, exactly or approximately, to obtain the contraction coefficients by running the main part of the algorithm.


Use these coefficients to create an estimator for new inputs.

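The three steps above can be sketched with NumPy, assuming a Gaussian kernel and a small noise term λ on the matrix diagonal for regularization (the data and the values of σ and λ are illustrative):

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    # k(x, x') = exp(-(x - x')^2 / (2 sigma^2))
    return np.exp(-((a - b) ** 2) / (2 * sigma ** 2))

# Training set T = {X, Y}
X = np.array([0.0, 1.0, 2.0, 3.0])
Y = np.sin(X)

# Step 1: form the correlation (kernel) matrix,
# with noise regularization on the diagonal
lam = 1e-3
K = gaussian_kernel(X[:, None], X[None, :]) + lam * np.eye(len(X))

# Step 2: solve for the contraction coefficients alpha (K alpha = Y)
alpha = np.linalg.solve(K, Y)

# Step 3: use the coefficients to build the estimator
def estimate(x_new):
    return gaussian_kernel(x_new, X) @ alpha

print(estimate(1.5))  # estimate at a new input
```

At the training points themselves the estimator reproduces Y up to the small regularization term, which is the sanity check to run first.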

Implementation of SVR

For reference, visit my GitHub link.

Algorithm

Suppose we have a vector w that is always normal to the hyperplane (perpendicular to the line in two dimensions). We can determine how far away a sample is from our decision boundary by projecting the sample's position vector onto w. As a quick refresher, the dot product of two vectors is proportional to the projection of the first vector onto the second.

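This projection can be checked numerically; a minimal sketch with an assumed normal vector w and sample u:

```python
import numpy as np

# An assumed normal vector to the decision boundary
w = np.array([3.0, 4.0])

# Position vector of a sample
u = np.array([2.0, 1.0])

# The dot product is proportional to the projection of u onto w;
# dividing by |w| gives the signed projected length
signed_distance = np.dot(w, u) / np.linalg.norm(w)
print(signed_distance)  # (3*2 + 4*1) / 5 = 2.0
```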

If it's a positive sample, we insist that the following decision function (the dot product of w and the position vector of a given sample, plus some constant b) returns a value greater than or equal to 1.

w · x + b ≥ 1

Similarly, if it's a negative sample, we insist that the same decision function returns a value less than or equal to -1.

w · x + b ≤ -1
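Both constraints can be folded into a single sign-based decision rule; a minimal sketch with assumed values for w and b:

```python
import numpy as np

def decision(w, b, x):
    """Classify x by the sign of w . x + b.
    Training enforces w.x + b >= 1 for positive samples
    and w.x + b <= -1 for negative samples."""
    score = np.dot(w, x) + b
    return 1 if score >= 0 else -1

# Assumed, illustrative parameters
w = np.array([1.0, -1.0])
b = 0.5

print(decision(w, b, np.array([2.0, 0.0])))   # score 2.5 -> +1
print(decision(w, b, np.array([-2.0, 1.0])))  # score -2.5 -> -1
```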

Conclusion

In the real world, most problems are not linearly separable. Thus, we make use of something called the kernel trick to separate the data using something other than a straight line. Stay tuned for an upcoming article where we cover this topic.

For any queries related to this article, please ask.

THANK YOU



