LASSO Regression, or Least Absolute Shrinkage and Selection Operator, is a statistical technique for regularising data models. The lasso regression model is a linear model that applies a shrinkage method.
In this article, we will learn in depth about what Lasso Regression is, Lasso regularisation, the Lasso model, Lasso in machine learning, as well as the reason why Lasso shrinks coefficients to zero.
What is regression?
Regression is a statistical method used in various fields such as finance, investing and analytics. In its simplest form, regression is presented graphically as a straight line whose slope shows the relation between two variables.
Regression links a dependent variable to one or more independent (explanatory) variables. A regression model can demonstrate whether variations in the dependent variable are related to variations in one or more explanatory variables.
Simple linear regression uses the linear equation y = mx + c, where:
- y is the dependent variable,
- x is the independent variable,
- m is the slope of the line,
- c is the intercept (constant term).
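To make the equation concrete, here is a minimal Python sketch (the data points are invented for illustration) that recovers m and c from a handful of observations:

```python
# A minimal sketch: recovering the slope m and intercept c of y = mx + c
# from a few sample points (the data here is made up for illustration).
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])   # independent variable
y = np.array([3.1, 4.9, 7.2, 9.0, 11.1])  # dependent variable, roughly y = 2x + 1

m, c = np.polyfit(x, y, deg=1)  # least-squares fit of a degree-1 polynomial
print(f"slope m ≈ {m:.2f}, intercept c ≈ {c:.2f}")  # close to 2.0 and 1.0
```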
Regression is a helpful statistical tool and can also be used to make future predictions based on past observations. One can understand regression through linear as well as non-linear models.
- Linear regression
- Linear regression uses a straight line to show the relation between two variables.
- Linear regression always uses a linear equation where one variable is dependent and the other variable is explanatory.
- Non-linear regression
- Non-linear regression uses a curve to show the relation between variables.
- Non-linear regression still relates a dependent variable to explanatory variables, but through a curved function, and such models are often more complex to fit and interpret.
Different types of regressions have their own relevance depending upon the need. Some of them are:
- Lasso regression
- Ridge regression
- Polynomial regression
- Stepwise regression
- Elastic net regression
What is lasso regression?
Lasso regression is a type of linear regression. It is a regularisation technique that helps to make better, more accurate predictions. It uses shrinkage, where coefficient estimates are shrunk towards a central point, typically zero.
The main aim of lasso regression in machine learning is to obtain the subset of predictors that minimises prediction error for a quantitative response variable. The lasso achieves this with simple, sparse models that use fewer parameters.
The shrinking process in the lasso model makes it possible to identify the variables most strongly associated with the target. This makes it easier for the user to understand the pattern and make better predictions.
Lasso regression is highly useful for models with a high degree of multicollinearity. It also comes in handy when you want to automate the variable-selection and parameter-elimination steps of the model selection process.
Regularisation techniques
Lasso regularisation and Ridge Regularisation are the two primary forms of regularisation techniques used. The difference between the two is based on the way “penalty” is applied to the coefficients.
Lasso Regularisation
Regularisation is the process through which overfitting can be avoided. By eliminating the issue of overfitting, the accuracy of the model increases. Lasso regularisation is also called the penalised regression method.
Regularisation is enforced by adding a “penalty” term to the best-fit equation derived from the training data. It helps to achieve lower variance on test data. Regularisation also limits the influence of less important variables on the output by shrinking their coefficients.
In lasso regularisation, the penalty is proportional to the sum of the absolute values of the coefficients. This type of regularisation can produce sparse models with few coefficients: some coefficients shrink exactly to zero and are removed from the model.
Lasso regularisation can also be expressed as the mathematical objective that the model minimises:
Residual Sum of Squares + λ × (sum of the absolute values of the coefficients)
Here, λ denotes the amount of shrinkage:
- λ = 0 means no coefficient is penalised; this is equivalent to plain linear regression, where only the residual sum of squares is used to build the predictive model.
- As λ approaches infinity, more and more features are eliminated from the model.
- Bias increases as λ increases.
- Variance increases as λ decreases.
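A toy calculation (with invented numbers) can make the objective concrete: below, the same residual sum of squares is combined with increasing values of λ, so the penalty term grows with the coefficients' absolute values:

```python
# A toy illustration (values invented) of the lasso objective:
# RSS + λ * (sum of absolute coefficient values).
import numpy as np

y_true = np.array([3.0, 5.0, 7.0])
y_pred = np.array([2.8, 5.3, 6.9])
coefs  = np.array([1.5, 0.0, -0.4])  # model coefficients

rss = np.sum((y_true - y_pred) ** 2)          # residual sum of squares
for lam in (0.0, 0.5, 10.0):                  # increasing shrinkage
    cost = rss + lam * np.sum(np.abs(coefs))  # lasso objective
    print(f"λ = {lam:>4}: cost = {cost:.2f}") # λ = 0 reduces to plain RSS
```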
Lasso Regression in Python
In Python, the Lasso class from scikit-learn's linear_model module can be used to perform lasso regression.
The Lasso class accepts a parameter named alpha that denotes the strength of the regularisation term. A higher alpha value imposes a heavier penalty, so fewer features are used in the model. In other words, an alpha of 0.1 removes fewer features from the model than a higher value such as 1.0.
The Lasso class provides a fit() method for fitting the model to training data, along with a predict() method for making predictions on new data.
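Below is a minimal sketch of this workflow on synthetic data, assuming scikit-learn is installed; it fits Lasso with a few alpha values and counts how many coefficients survive:

```python
# A minimal sketch using scikit-learn's Lasso class on synthetic data,
# showing how a larger alpha removes more features (zeroes more coefficients).
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Synthetic regression data: 10 features, only 3 of them informative.
X, y = make_regression(n_samples=200, n_features=10, n_informative=3,
                       noise=5.0, random_state=0)

for alpha in (0.1, 1.0, 10.0):
    model = Lasso(alpha=alpha)
    model.fit(X, y)                     # fit the model to the training data
    kept = np.sum(model.coef_ != 0)    # features with non-zero coefficients
    print(f"alpha = {alpha:>5}: {kept} features kept")

predictions = model.predict(X[:5])     # predict() on new (here: the first 5) rows
```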
Ridge regularisation
Ridge regularisation does not produce sparse models: coefficients are shrunk but never set exactly to zero, so no variables are eliminated. This makes ridge models harder to interpret than lasso models.
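A short sketch (again on synthetic data, assuming scikit-learn) makes the contrast visible: ridge keeps every coefficient non-zero, while lasso zeroes some out:

```python
# A sketch contrasting ridge and lasso on the same data: ridge shrinks
# coefficients but keeps them all non-zero, while lasso zeroes some out.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge

X, y = make_regression(n_samples=200, n_features=8, n_informative=3,
                       noise=5.0, random_state=1)

ridge = Ridge(alpha=1.0).fit(X, y)
lasso = Lasso(alpha=1.0).fit(X, y)

print("ridge non-zero coefficients:", np.sum(ridge.coef_ != 0))  # all 8
print("lasso non-zero coefficients:", np.sum(lasso.coef_ != 0))  # typically fewer
```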
Why does Lasso shrink to zero?
The lasso constraint region is diamond-shaped, with corners on the axes, and the elliptical contours of the residual sum of squares often first touch the constraint region at one of these corners. When that happens, at least one of the coefficients is exactly zero.
So, when λ is large enough, lasso regression shrinks some coefficient estimates all the way to zero. As such, the lasso yields sparse solutions.
The fundamental issue with lasso regression arises when we have correlated variables: it tends to keep only one of them and set the other correlated variables to zero, as the sketch below illustrates. That can cause some information to be lost, which may impair the model's accuracy.
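Here is a small sketch with invented data illustrating that behaviour: two nearly identical features, of which lasso typically retains only one:

```python
# A sketch of the correlated-variables issue (toy data, invented here):
# when two features are nearly identical, lasso tends to keep one and
# drive the other's coefficient to zero.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + rng.normal(scale=0.01, size=200)  # almost perfectly correlated with x1
X = np.column_stack([x1, x2])
y = 3.0 * x1 + rng.normal(scale=0.5, size=200)

model = Lasso(alpha=0.1).fit(X, y)
print(model.coef_)  # one coefficient near 3, the other at (or near) zero
```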
To conclude, lasso is not only an excellent method for predicting outcomes from a set of variables, but it can also help us eliminate redundant variables to obtain a simpler, crisper model.
The linear form of a lasso regression model also allows the user to look at the fitted line and understand how one variable affects the other.