Mastering Machine Learning on AWS

R-squared

Another popular metric for regression problems is the R-squared score, also called the coefficient of determination. This score measures the proportion of the variance in the dependent variable that is predictable from the independent variables:

$$R^2 = 1 - \frac{\sum_{i} (y_i - \hat{y}_i)^2}{\sum_{i} (y_i - \bar{y})^2}$$

Here, $y$ represents the vector of actual values, while $\hat{y}$ represents the vector of predicted values. The mean actual value is $\bar{y}$. The denominator of the quotient measures how much the actual values differ from their mean, while the numerator measures how much the actual values differ from the predicted values. Note that the differences are squared, as in MSE, so large differences are penalized heavily.

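In a perfect regressor, the numerator is 0, so the best possible value for R² is 1.0. A model that always predicts the mean of the actual values scores 0.0, and the score turns negative, with no lower bound, whenever the prediction errors are larger than those of that mean baseline.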
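To make these boundary cases concrete, here is a minimal sketch in plain Python that computes the score directly from its definition. The function name `r_squared` and the sample values are hypothetical and chosen only for illustration; machine learning packages ship their own implementations, which should be preferred in practice.

```python
def r_squared(actual, predicted):
    """Return 1 - (sum of squared residuals) / (total sum of squares)."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    mean_actual = sum(actual) / len(actual)
    # Numerator: squared differences between actual and predicted values.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Denominator: squared differences between actual values and their mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R-squared is undefined when all actual values are equal")
    return 1.0 - ss_res / ss_tot


# Illustrative data (an assumption, not taken from the text).
actual = [3.0, 5.0, 7.0, 9.0]

perfect = r_squared(actual, actual)                 # predictions match exactly
baseline = r_squared(actual, [6.0, 6.0, 6.0, 6.0])  # always predict the mean
poor = r_squared(actual, [9.0, 7.0, 5.0, 3.0])      # predictions reversed

print(perfect, baseline, poor)
```

With this data, the perfect predictions give 1.0, the mean baseline gives 0.0, and the reversed predictions give a negative score, because their squared error exceeds the spread of the actual values around the mean.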

All four types of evaluation metrics are implemented in machine learning packages and are demonstrated in the following code examples.