Metrics to Measure Regression Models

Last Updated: 22nd June, 2023

Overview

Regression models are a vital tool for predicting outcomes and identifying important relationships between variables. To assess a regression model, metrics such as mean squared error, root mean squared error, mean absolute error, R-squared, and adjusted R-squared are commonly used.

Introduction

For example, suppose a company is trying to predict its customers' lifetime value.

To do this, it has to build a regression model that can identify the variables that influence customer loyalty and spending. To determine how well the model is performing, the company should use metrics such as mean absolute error, mean squared error, and root mean squared error.

Mean absolute error measures the average absolute difference between the predicted value and the actual value. Root mean squared error is the square root of the average squared difference between the predicted value and the actual value, so it penalizes large errors more heavily.
The company can use these metrics to assess the performance of the model and make changes if necessary. These metrics tell the company how well the model is making predictions and can help it decide whether the model is a good fit for its needs.

Different metrics are used to check whether a model is performing well. We will discuss them one by one in this lesson.

Mean Squared Error (MSE)

The Mean Squared Error (MSE) is a metric used to assess the performance of regression models. It measures the average squared difference between the predicted and actual values. The MSE is calculated by summing the squared differences between the predicted and actual values and dividing the result by the number of observations. The lower the MSE, the better the model is at predicting outcomes. The MSE is a useful metric for evaluating a model because it allows us to compare its performance with that of other models on the same data.

The Mean Squared Error (MSE) is calculated as follows:

MSE = 1/N * Σ(yi - ŷi)²

Where N is the number of observations and yi and ŷi are the actual and predicted values for observation i, respectively.

The MSE is the average of the squared differences between the predicted and actual values: sum the squared differences and divide by the number of observations. Because the errors are squared, the MSE is expressed in squared units of the target variable, and large errors are penalized more heavily than small ones. The lower the MSE, the better the model is at predicting outcomes.
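The MSE formula above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `mean_squared_error` and the sample values are assumptions made for this example.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical values, chosen only to illustrate the arithmetic.
actual = [3.0, 5.0, 2.5, 7.0]
predicted = [2.5, 5.0, 4.0, 8.0]

mse = mean_squared_error(actual, predicted)
# Squared errors are 0.25, 0.0, 2.25 and 1.0, so the MSE is 3.5 / 4 = 0.875.
print(mse)
```

Note that the single largest error (1.5) contributes 2.25 of the 3.5 total, which shows how squaring lets large errors dominate the result.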

Root Mean Squared Error (RMSE)

The Root Mean Squared Error (RMSE) is a metric used to assess the performance of regression models. It is the square root of the Mean Squared Error (MSE) and is often preferred as a more interpretable metric than the MSE because it is expressed in the same units as the target variable.

The RMSE is calculated by summing the squared differences between the predicted and actual values, dividing by the number of observations to obtain the MSE, and then taking the square root of the result. The lower the RMSE, the better the model is at predicting outcomes. The RMSE is a useful metric for evaluating a model because it allows us to compare its performance with that of other models.

The Root Mean Squared Error (RMSE) is calculated as follows:

RMSE = √(MSE)

Where MSE is the Mean Squared Error, calculated by summing the squared differences between the predicted and actual values and dividing by the number of observations.

In short, the RMSE reports the typical size of the prediction error in the same units as the target variable, which makes it easier to interpret than the MSE. The lower the RMSE, the better the model is at predicting outcomes.
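As a rough sketch of the calculation, the RMSE can be computed by averaging the squared errors and then taking the square root. The function name `root_mean_squared_error` and the sample values are assumptions made for this example.

```python
import math


def root_mean_squared_error(actual, predicted):
    """Square root of the mean of the squared errors."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / len(actual)
    return math.sqrt(mse)


# Hypothetical values, chosen only to illustrate the arithmetic.
actual = [3.0, 5.0, 2.5, 7.0]
predicted = [2.5, 5.0, 4.0, 8.0]

rmse = root_mean_squared_error(actual, predicted)
# The MSE for these values is 0.875, so the RMSE is sqrt(0.875), about 0.9354.
print(round(rmse, 4))
```

If the target were measured in dollars, this result would read as a typical error of roughly 0.94 dollars, whereas the MSE would be in squared dollars.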

Mean Absolute Error (MAE)

The Mean Absolute Error (MAE) is another common metric used in regression. It measures the average absolute difference between the predicted and actual values.

The Mean Absolute Error (MAE) is calculated as follows:

MAE = 1/N * Σ|yi - ŷi|

Where N is the number of observations and yi and ŷi are the actual and predicted values for observation i, respectively.

The MAE is the average absolute difference between the predicted and actual values. It is calculated by summing the absolute differences between the predicted and actual values and dividing the result by the number of observations. Because the errors are not squared, the MAE is less sensitive to outliers than the MSE or RMSE. The lower the MAE, the better the model is at predicting outcomes. The MAE is a useful metric for evaluating a model because it allows us to compare its performance with that of other models.
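The MAE formula can be sketched in Python as follows. The function name `mean_absolute_error` and the sample values are assumptions made for this example.

```python
def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    absolute_errors = [abs(y - y_hat) for y, y_hat in zip(actual, predicted)]
    return sum(absolute_errors) / len(absolute_errors)


# Hypothetical values, chosen only to illustrate the arithmetic.
actual = [3.0, 5.0, 2.5, 7.0]
predicted = [2.5, 5.0, 4.0, 8.0]

mae = mean_absolute_error(actual, predicted)
# Absolute errors are 0.5, 0.0, 1.5 and 1.0, so the MAE is 3.0 / 4 = 0.75.
print(mae)
```

Each error counts in proportion to its size here, so one large miss shifts the MAE less than it would shift a metric based on squared errors.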

R-squared

The R-squared (or coefficient of determination) is a measure of how well the regression model fits the data. It represents the proportion of variance in the target variable that can be explained by the predictor variables in the model. It is calculated as follows:

R-squared = 1 - (SSE/SST)

Where SSE is the sum of squared errors (or residuals) and SST is the total sum of squares.

The R-squared is a metric used to evaluate the performance of regression models. It represents the proportion of variance in the target variable that can be explained by the predictor variables in the model. The higher the R-squared, the better the model is at predicting the target variable. The R-squared is a useful metric for assessing the accuracy of a model, as it allows us to compare the model's performance with that of different models.

Properties of R2

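R2 ranges from 0 to 1 for a model fitted by ordinary least squares (OLS) with an intercept and evaluated on its training data (see the note below).
R2 of 0 means that the model explains none of the variance in the dependent variable; it does no better than always predicting the mean.
R2 of 1 means the dependent variable can be predicted from the independent variable without any error.
An R2 of 0.20 means that 20% of the variance in Y is predictable from X; an R2 of 0.40 means that 40% of the variance is predictable.

Note: R2 can be negative, ranging from -∞ to 1, when the predictions do not come from an OLS fit evaluated on its own training data, for example when a model is scored on held-out data. A negative value means the model performs worse than always predicting the mean.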
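The formula R-squared = 1 - (SSE/SST) and the properties above can be checked with a small sketch. The function name `r_squared` and the three sets of predictions are assumptions made for this example; they are not fitted by any model.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SSE / SST."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_actual = sum(actual) / len(actual)
    sse = sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted))
    sst = sum((y - mean_actual) ** 2 for y in actual)
    if sst == 0:
        raise ValueError("R-squared is undefined when the actual values are constant")
    return 1 - sse / sst


# Hypothetical target values with mean 3.0 and SST = 10.0.
actual = [1.0, 2.0, 3.0, 4.0, 5.0]

perfect = r_squared(actual, actual)                       # SSE = 0, so R2 = 1.0
baseline = r_squared(actual, [3.0] * 5)                   # always predict the mean: R2 = 0.0
worse = r_squared(actual, [5.0, 4.0, 3.0, 2.0, 1.0])      # SSE = 40, so R2 = -3.0

print(perfect, baseline, worse)
```

The third case uses predictions that run in the opposite direction to the data, which is why the score falls below zero.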

Adjusted R-squared

The adjusted R-squared is a modified version of the R-squared that takes into account the number of predictor variables in the model. It penalizes models that include unnecessary variables that do not improve the fit of the model. It is calculated as

Adjusted R-squared = 1 - (n-1)/(n-p-1)*(1-R^2),

where n is the sample size and p is the number of predictors in the model. The adjusted R-squared can range from -∞ to 1 and is never greater than the R-squared. Unlike the R-squared, which never decreases when a predictor is added to an OLS model, the adjusted R-squared increases only when a new predictor improves the fit by more than would be expected by chance, and decreases otherwise. A high adjusted R-squared indicates that the model fits the data well without relying on unnecessary predictors.
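To see how the penalty works, the formula can be evaluated for two models. The function name `adjusted_r_squared` and the R-squared values, sample size, and predictor counts below are hypothetical numbers chosen for illustration, not results from a real model.

```python
def adjusted_r_squared(r2, n, p):
    """Adjusted R-squared for sample size n and p predictors."""
    if n - p - 1 <= 0:
        raise ValueError("need n > p + 1 for the adjustment to be defined")
    return 1 - (n - 1) / (n - p - 1) * (1 - r2)


# Hypothetical comparison on the same 50 observations.
small_model = adjusted_r_squared(r2=0.80, n=50, p=3)    # about 0.787
large_model = adjusted_r_squared(r2=0.81, n=50, p=10)   # about 0.761

print(round(small_model, 3), round(large_model, 3))
```

In this sketch the larger model has the higher R-squared (0.81 against 0.80) but the lower adjusted R-squared, because seven extra predictors bought only a small gain in fit.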

Mean Absolute Percentage Error (MAPE)

The Mean Absolute Percentage Error (MAPE) is a measure of prediction accuracy that compares the actual value of a given data point with the predicted value. It is calculated as the average of the absolute percentage errors of each data point. It is expressed as a percentage and is calculated by taking the absolute value of the difference between the actual and predicted values, dividing it by the actual value, and then multiplying it by 100. The equation for MAPE is as follows:

MAPE = 1/N * Σ|(yi - ŷi)/yi| * 100

Where N is the number of observations and yi and ŷi are the actual and predicted values for observation i, respectively. The MAPE is most useful when the target values span a wide range, because each error is measured relative to the size of the actual value. The MAPE is not appropriate for data sets with zero or near-zero actual values, as dividing by the actual value makes it undefined or misleadingly large.
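The sketch below follows the description of MAPE as the average of the per-observation absolute percentage errors. The function name `mean_absolute_percentage_error`, the choice to raise an error on zero actual values, and the sample values are assumptions made for this example.

```python
def mean_absolute_percentage_error(actual, predicted):
    """Average of |actual - predicted| / |actual|, expressed as a percentage."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(y == 0 for y in actual):
        # Assumption: refuse zero actual values rather than return a misleading result.
        raise ValueError("MAPE is undefined when an actual value is zero")
    ratios = [abs((y - y_hat) / y) for y, y_hat in zip(actual, predicted)]
    return 100 * sum(ratios) / len(ratios)


# Hypothetical values on very different scales.
actual = [100.0, 200.0, 50.0, 400.0]
predicted = [110.0, 180.0, 55.0, 400.0]

mape = mean_absolute_percentage_error(actual, predicted)
# Per-observation errors are 10%, 10%, 10% and 0%, so the MAPE is 7.5%.
print(round(mape, 2))
```

The absolute errors here are 10, 20, 5 and 0, yet the first three count equally because each is 10% of its actual value.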

Mean Percentage Error (MPE)

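The MPE measures the average percentage difference between the predicted and actual values. Unlike the MAPE, it keeps the sign of each error, so positive and negative errors can cancel out. This makes the MPE a measure of bias (whether the model tends to over-predict or under-predict) rather than of overall error size. It is calculated using the following equation:

MPE = 1/N * Σ((ŷi - yi)/yi) * 100

Where N is the number of observations and yi and ŷi are the actual and predicted values for observation i, respectively. With this sign convention, a positive MPE means the model over-predicts on average.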
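A minimal sketch of the MPE, using the same sign convention as the equation above (predicted minus actual) and scaling the result by 100 so that it reads as a percentage. The function name `mean_percentage_error` and the sample values are assumptions made for this example.

```python
def mean_percentage_error(actual, predicted):
    """Average of (predicted - actual) / actual, expressed as a percentage."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(y == 0 for y in actual):
        raise ValueError("MPE is undefined when an actual value is zero")
    ratios = [(y_hat - y) / y for y, y_hat in zip(actual, predicted)]
    return 100 * sum(ratios) / len(ratios)


# Hypothetical values: two over-predictions, one under-prediction, one exact hit.
actual = [100.0, 200.0, 50.0, 400.0]
predicted = [110.0, 180.0, 55.0, 400.0]

mpe = mean_percentage_error(actual, predicted)
# Signed errors are +10%, -10%, +10% and 0%, so the MPE is 10% / 4 = 2.5%.
print(round(mpe, 2))
```

The signed errors partly cancel, so the result is much smaller than the average absolute percentage error for the same values (7.5%). A small MPE therefore signals low bias, not necessarily low error.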

Conclusion

After assessing the performance of the regression model, the company was able to identify the variables that influence customer loyalty and spending. The model was able to predict customer lifetime value accurately. The company used these metrics to make changes to the model and improve its performance. The improved model is now able to predict customer lifetime value accurately and help the company make better decisions.

Key takeaways

Mean Squared Error (MSE): This metric measures the average of the squared errors.
Root Mean Squared Error (RMSE): This metric is the square root of the MSE and is a more interpretable measure of the error.
R-Squared (R²): This metric measures the proportion of variance explained by the model.
Adjusted R-Squared (Adjusted R²): This metric is the R² adjusted for the number of predictors in the model.
Mean Absolute Error (MAE): This metric measures the average absolute difference between the predicted values and the actual values.

Quiz

What metric is most commonly used to evaluate the performance of a regression model?
1. Mean Absolute Error (MAE)
2. Mean Squared Error (MSE)
3. R2
4. Adjusted R2

Answer: 2. Mean Squared Error (MSE)

How is the Root Mean Squared Error (RMSE) determined?
1. RMSE is the square root of the sum of the squared errors
2. RMSE is the square root of the mean of the squared errors
3. RMSE is the mean of the squared errors
4. RMSE is the sum of the squared errors

Answer: 2. RMSE is the square root of the mean of the squared errors

How is the Adjusted R2 determined?
1. Adjusted R2 is the difference between the R2 and the mean squared error
2. Adjusted R2 is the difference between the R2 and the mean absolute error
3. Adjusted R2 is the ratio of the sum of the squared errors to the sum of the errors
4. Adjusted R2 is the R2 corrected for the sample size and the number of predictors in the model

Answer: 4. Adjusted R2 is the R2 corrected for the sample size and the number of predictors in the model

What is the purpose of using a metric such as the Mean Absolute Error (MAE)?
1. To measure the accuracy of the model
2. To measure the variability of the model
3. To measure the accuracy of the predictions
4. To measure the degree of error in the model

Answer: 3. To measure the accuracy of the predictions

Module 4: Regression