Support Vector Regression

Support vector regression (SVR) is a type of support vector machine that supports both linear and non-linear regression. Graphically, the goal is to fit as many instances as possible between the margin lines while limiting margin violations. The margin of tolerance in this setting is denoted by ε (epsilon).


Support vector regression (SVR) is a supervised learning method that models the relationship between input features and a continuous target variable.

In regression problems, we generally try to find a line that best fits the data provided. In its simplest form, the equation of that line is y = mx + c.

In the case of regression using a support vector machine, we do something similar but with a slight change. Here we define a small error tolerance e, where error = prediction − actual.

The value of e determines the width of the error tube (also called the ε-insensitive tube) and hence the number of support vectors; a smaller value indicates a lower tolerance for error.

Thus, we try to find the line’s best fit in such a way that:

(mx+c)-y ≤ e and y-(mx+c) ≤ e

Also, we do not care about errors as long as they are less than e. So in this case, only the data points that fall outside the e-wide error region contribute to the final cost calculation.

For example, in stock trading we may want to minimize trading losses, but not care about any loss smaller than a certain value (e).

Hence, the support vector regression model depends only on a subset of the training data points, as the cost function of the model ignores any training data close to the model prediction when the error is less than e.
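The ε-insensitive cost described above can be sketched in a few lines of Python (the epsilon value and data here are purely illustrative):

```python
import numpy as np

def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Errors smaller than epsilon cost nothing; larger errors
    are penalized only by the amount they exceed epsilon."""
    residual = np.abs(y_true - y_pred)
    return np.maximum(0.0, residual - epsilon)

# Points inside the tube contribute zero to the cost;
# only the middle point (error 0.5) exceeds the tube.
y_true = np.array([1.0, 2.0, 3.0])
y_pred = np.array([1.05, 2.5, 3.0])
print(epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1))
```

This is exactly why the model depends only on a subset of the training points: samples whose loss is zero drop out of the cost entirely.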

Below are the cases where a support vector regression is advantageous over other regression algorithms:

  1. SVR is memory efficient, meaning it takes relatively few computational resources to train the model. This is because representing the solution by means of a small subset of training points (the support vectors) gives enormous computational advantages.
  2. There are non-linear or complex relationships between features and labels. Support vector regression handles these because kernels let us map a non-linear relationship into a higher-dimensional space where it becomes linear.
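As an illustration of both points, here is a minimal sketch using scikit-learn's SVR with an RBF kernel (the data and hyperparameter values are chosen purely for demonstration):

```python
import numpy as np
from sklearn.svm import SVR

# Non-linear target: y = sin(x) with a little noise
rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 5, 80)).reshape(-1, 1)
y = np.sin(X).ravel() + rng.normal(0, 0.1, 80)

# The RBF kernel captures the non-linear relationship;
# epsilon sets the width of the error tube
model = SVR(kernel="rbf", C=10.0, epsilon=0.1)
model.fit(X, y)

# Only a subset of the training points become support vectors
print(len(model.support_), "support vectors out of", len(X))
```

Increasing `epsilon` widens the tube, so fewer points fall outside it and the model keeps fewer support vectors.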

How to Build a Support Vector Regression Model:

  • Collect a training set T = {X, Y}.
  • Choose a kernel, its parameters, and regularization if needed (for example, a Gaussian kernel with noise regularization).
  • Form the correlation (kernel) matrix from the training inputs:

Train your machine, exactly or approximately, to obtain the contraction coefficients by running the main part of the algorithm.


Use these coefficients to create an estimator for new inputs.

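The three steps above can be sketched with NumPy, assuming a Gaussian kernel and a small noise term λ on the matrix diagonal for regularization (the data and the values of σ and λ are illustrative):

```python
import numpy as np

def gaussian_kernel(a, b, sigma=1.0):
    # k(x, x') = exp(-(x - x')^2 / (2 sigma^2))
    return np.exp(-((a - b) ** 2) / (2 * sigma ** 2))

# Training set T = {X, Y}
X = np.array([0.0, 1.0, 2.0, 3.0])
Y = np.sin(X)

# Step 1: form the correlation (kernel) matrix,
# with noise regularization on the diagonal
lam = 1e-3
K = gaussian_kernel(X[:, None], X[None, :]) + lam * np.eye(len(X))

# Step 2: solve for the contraction coefficients alpha (K alpha = Y)
alpha = np.linalg.solve(K, Y)

# Step 3: use the coefficients to build the estimator
def estimate(x_new):
    return gaussian_kernel(x_new, X) @ alpha

print(estimate(1.5))  # estimate at a new input
```

At the training points themselves the estimator reproduces Y up to the small regularization term, which is the sanity check to run first.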

Implementation of SVR

For reference, visit my GitHub link.

Algorithm

Suppose we have a vector w that is always normal to the hyperplane (perpendicular to the line in two dimensions). We can determine how far away a sample is from our decision boundary by projecting the sample's position vector onto w. As a quick refresher, the dot product of two vectors is proportional to the projection of the first vector onto the second.

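This projection can be checked numerically; a minimal sketch with an assumed normal vector w and sample u:

```python
import numpy as np

# An assumed normal vector to the decision boundary
w = np.array([3.0, 4.0])

# Position vector of a sample
u = np.array([2.0, 1.0])

# The dot product is proportional to the projection of u onto w;
# dividing by |w| gives the signed projected length
signed_distance = np.dot(w, u) / np.linalg.norm(w)
print(signed_distance)  # (3*2 + 4*1) / 5 = 2.0
```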

If it's a positive sample, we insist that the following decision function (the dot product of w and the position vector of a given sample, plus some constant b) returns a value greater than or equal to 1.

w · x + b ≥ 1

Similarly, if it's a negative sample, we insist that the same decision function returns a value less than or equal to -1.

w · x + b ≤ -1
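Both constraints can be folded into a single sign-based decision rule; a minimal sketch with assumed values for w and b:

```python
import numpy as np

def decision(w, b, x):
    """Classify x by the sign of w . x + b.
    Training enforces w.x + b >= 1 for positive samples
    and w.x + b <= -1 for negative samples."""
    score = np.dot(w, x) + b
    return 1 if score >= 0 else -1

# Assumed, illustrative parameters
w = np.array([1.0, -1.0])
b = 0.5

print(decision(w, b, np.array([2.0, 0.0])))   # score 2.5 -> +1
print(decision(w, b, np.array([-2.0, 1.0])))  # score -2.5 -> -1
```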

Conclusion

In the real world, most problems are not linearly separable. Thus, we make use of something called the kernel trick to separate the data using something other than a straight line. Stay tuned for an upcoming article where we cover this topic.

For any queries related to this article, please ask.

THANK YOU



