In this post, we will decompose an error to show that it consists of bias and variance. We will use the MSE (mean squared error) as our error measure. Let $\theta$ be the parameter we want to estimate, and $\hat{\theta}$ our estimator's result. Thus $\operatorname{Bias}(\hat{\theta}) = E[\hat{\theta}] - \theta$ and $\operatorname{Var}(\hat{\theta}) = E\big[(\hat{\theta} - E[\hat{\theta}])^2\big]$. Let's decompose our MSE then:

$$\begin{aligned}
\operatorname{MSE}(\hat{\theta}) &= E\big[(\hat{\theta} - \theta)^2\big] \\
&= E\big[(\hat{\theta} - E[\hat{\theta}] + E[\hat{\theta}] - \theta)^2\big] \\
&= E\big[(\hat{\theta} - E[\hat{\theta}])^2\big] + 2\,(E[\hat{\theta}] - \theta)\,E\big[\hat{\theta} - E[\hat{\theta}]\big] + (E[\hat{\theta}] - \theta)^2 \\
&= E\big[(\hat{\theta} - E[\hat{\theta}])^2\big] + (E[\hat{\theta}] - \theta)^2 \\
&= \operatorname{Var}(\hat{\theta}) + \operatorname{Bias}(\hat{\theta})^2
\end{aligned}$$
We can pull $E[\hat{\theta}] - \theta$ out of the expectation because $E[\hat{\theta}]$ and $\theta$ are just constants, and the cross term then vanishes since $E\big[\hat{\theta} - E[\hat{\theta}]\big] = 0$. For $\theta - E[\hat{\theta}]$, we can flip it to $E[\hat{\theta}] - \theta$ since we will square it anyway. The last line shows that the MSE decomposes into variance plus squared bias. For a fixed MSE, if we make the bias lower, we will have a bigger variance, and vice versa.
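The identity above can be checked numerically. The sketch below (the shrinkage factor 0.8 and the normal data are made-up assumptions, just to get a deliberately biased estimator) simulates many trials and confirms that the empirical MSE equals empirical variance plus squared bias:

```python
import numpy as np

# Hypothetical setup: estimate the true mean theta with a deliberately
# biased shrinkage estimator, theta_hat = 0.8 * sample_mean.
rng = np.random.default_rng(0)
theta = 2.0                # true parameter
n, trials = 50, 100_000    # sample size per trial, number of trials

samples = rng.normal(loc=theta, scale=1.0, size=(trials, n))
theta_hat = 0.8 * samples.mean(axis=1)   # one estimate per trial

mse = np.mean((theta_hat - theta) ** 2)
var = np.var(theta_hat)                  # Var(theta_hat)
bias_sq = (np.mean(theta_hat) - theta) ** 2  # Bias(theta_hat)^2

print(f"MSE          = {mse:.6f}")
print(f"Var + Bias^2 = {var + bias_sq:.6f}")  # matches MSE exactly
```

The match is exact (up to floating point), because the decomposition is an algebraic identity over the empirical distribution, not just an approximation.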
The bias-variance trade-off in machine learning
From the decomposition of the MSE into bias and variance above, we know that we cannot make both small at once: if we make one smaller, the other one becomes bigger. It's the same in machine learning. When we train our model to get a lower bias, we will have a bigger variance. Here are the characteristics:
- High bias, low variance: our model is less flexible, but more stable and more general. This is the case when we set a bigger regularization constant. This regime is more prone to underfitting.
- High variance, low bias: our model is more flexible, but more sensitive to the particular training data. This is the case when we set a smaller regularization constant. This regime is more prone to overfitting.
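The two regimes above can be sketched with a small simulation. This is a minimal illustration, not the post's own experiment: the sine target, noise level, polynomial degree, and the two alpha values are all assumptions. It fits closed-form ridge regression on many resampled training sets and measures the bias and variance of the predictions for a small versus a large regularization constant:

```python
import numpy as np

rng = np.random.default_rng(1)

def true_f(x):
    # Assumed ground-truth function for the illustration
    return np.sin(2 * np.pi * x)

def poly_features(x, degree=9):
    # Polynomial feature matrix [1, x, x^2, ..., x^degree]
    return np.vander(x, degree + 1, increasing=True)

def ridge_fit(X, y, alpha):
    # Closed-form ridge solution: w = (X^T X + alpha I)^{-1} X^T y
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)

x_test = np.linspace(0, 1, 50)
X_test = poly_features(x_test)

results = {}
for alpha in (1e-6, 10.0):          # small vs large regularization
    preds = []
    for _ in range(200):            # many resampled training sets
        x_tr = rng.uniform(0, 1, 25)
        y_tr = true_f(x_tr) + rng.normal(0, 0.3, 25)
        w = ridge_fit(poly_features(x_tr), y_tr, alpha)
        preds.append(X_test @ w)
    preds = np.array(preds)
    # Bias^2: squared gap between the average prediction and the truth
    bias2 = np.mean((preds.mean(axis=0) - true_f(x_test)) ** 2)
    # Variance: spread of predictions across training sets
    var = np.mean(preds.var(axis=0))
    results[alpha] = (bias2, var)
    print(f"alpha={alpha:g}: bias^2={bias2:.4f}, variance={var:.4f}")
```

The small alpha yields low bias but high variance (overfitting regime), while the large alpha yields high bias but low variance (underfitting regime), matching the two bullet points above.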