Bias and variance are two terms you come across when building a machine-learning model, and they are among the important concepts everyone learning machine learning should understand.
Bias is a model's inability to adapt and fit the underlying pattern in the data points.
For example, a simple model like linear regression is a highly biased model. It cannot fit data that is not linear, because it does not possess enough flexibility to bend to the shape of the data. Hence linear regression has high bias: its inability to adapt is high.
On the other hand, a higher-order polynomial function can bend in every direction and trace the complete data pattern. This is a low-bias model: its inability to adapt is small.
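To make this concrete, here is a minimal sketch (assuming NumPy and scikit-learn are available; the noisy sine data and the degree-15 polynomial are illustrative choices, not from the original) that fits a straight line and a high-order polynomial to the same non-linear data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 1, 30)).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.1, 30)  # noisy sine curve

linear = LinearRegression().fit(X, y)                # high bias: just a line
poly = make_pipeline(PolynomialFeatures(degree=15),
                     LinearRegression()).fit(X, y)   # low bias: bends freely

print("linear train MSE:", mean_squared_error(y, linear.predict(X)))
print("poly   train MSE:", mean_squared_error(y, poly.predict(X)))
# The line's training error stays large because it cannot bend, while the
# polynomial's training error is near zero: it passes close to every point.
```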
However, we cannot conclude that a low-bias model is always the best choice. Variance is an equally important measure to consider before making the final decision about a model.
Have you ever been in a situation where your model gives good results on the training data but terrible results on the testing data?
In that case, you have probably chosen a low-bias model. It fits the training data so closely that it almost covers every data point.
But what actually happens is that the model blindly memorizes the training data instead of learning the underlying pattern. Hence, when you provide a new validation dataset, the results end up being terrible. The model tries to learn everything in the training set, including details that are sometimes irrelevant: it is very sensitive to noise and learns that too. This sensitivity is high variance, and low-bias models are prone to producing it.
The above case is called overfitting: the bias (the error between actual and predicted values, the inability to adapt) is low, while the variance (the sensitivity to noise and to fluctuations in the training data) is very high. Such a model stumbles when it meets real-world data.
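The train/test gap described above is easy to demonstrate. Below is a small sketch (again with assumed, illustrative data and a degree-15 polynomial) that holds out a test set and scores the flexible model on both splits:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(1)
X = rng.uniform(0, 1, 60).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 60)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=1)

model = make_pipeline(PolynomialFeatures(degree=15), LinearRegression())
model.fit(X_tr, y_tr)

print("train MSE:", mean_squared_error(y_tr, model.predict(X_tr)))  # tiny
print("test  MSE:", mean_squared_error(y_te, model.predict(X_te)))  # much larger
# The gap between training and test error is the signature of high variance:
# the model has memorized noise in the training set instead of the pattern.
```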
The opposite case, where the bias is high and the variance is low, is called underfitting.
To wrap up, the best model is one with both low bias and low variance. But driving both to their lowest is unachievable, because the two pull in opposite directions; instead, we should find an optimum trade-off. For that, we have a few techniques, namely regularisation, bagging and boosting.
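As a taste of the first technique, here is a minimal sketch of regularisation using scikit-learn's Ridge (the L2 penalty restrains the polynomial's coefficients, deliberately accepting a little bias in exchange for lower variance; the alpha value and data are illustrative assumptions):

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(2)
X = rng.uniform(0, 1, 60).reshape(-1, 1)
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 60)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=2)

for name, reg in [("unregularised", LinearRegression()),
                  ("ridge (alpha=1e-3)", Ridge(alpha=1e-3))]:
    model = make_pipeline(PolynomialFeatures(degree=15), reg).fit(X_tr, y_tr)
    print(name, "test MSE:", mean_squared_error(y_te, model.predict(X_te)))
# The penalised model typically generalises better: it sits between the
# extremes of high bias and high variance, near the optimum we are after.
```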