Bias and Variance Tradeoff

When building machine-learning models, bias and variance are two fundamental concepts that help us understand the behaviour and performance of predictive models.

Bias

Bias is the error introduced by a model's inability to capture the true relationship in the data: it measures how far the model's predictions deviate, on average, from the true values we are trying to predict.
A high-bias model tends to underfit the data, meaning it fails to capture the underlying patterns and relationships present in the data.

Variance

Variance is defined as how much the predictions of a model vary when the training data changes (i.e., across different datasets). It reflects the model's sensitivity to small fluctuations or noise in the training data. Models with high variance are overly complex and tend to capture not only the underlying patterns but also the noise present in the training data. As a result, they perform well on the training data but poorly on new, unseen data.
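To make this concrete, here is a minimal sketch (a hypothetical setup using NumPy, with a sine-shaped ground truth and noise level 0.3 as assumptions) that fits a simple and a complex polynomial model to pairs of independently drawn training sets and measures how much each model's predictions change between fits. The high-degree model disagrees with itself far more, which is exactly what high variance means.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_training_set(n=20):
    # Noisy observations of an assumed underlying sine curve.
    x = np.linspace(0, 1, n)
    y = np.sin(2 * np.pi * x) + rng.normal(0, 0.3, size=n)
    return x, y

def fit_predict(x, y, degree, x_eval):
    # Least-squares polynomial fit; degree controls model complexity.
    return np.polyval(np.polyfit(x, y, degree), x_eval)

x_eval = np.linspace(0.05, 0.95, 100)
disagreement = {}
for degree in (1, 9):
    gaps = []
    for _ in range(50):
        # Fit the same model class to two independent training sets
        # and measure how much the two fitted curves disagree.
        p1 = fit_predict(*sample_training_set(), degree, x_eval)
        p2 = fit_predict(*sample_training_set(), degree, x_eval)
        gaps.append(np.mean((p1 - p2) ** 2))
    disagreement[degree] = np.mean(gaps)
    print(f"degree {degree}: average disagreement between fits = "
          f"{disagreement[degree]:.4f}")
```

The degree-1 model barely changes from one training set to the next (low variance), while the degree-9 model swings substantially with each new sample of the data.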

Let’s Take a Real-World Example

Let’s imagine we built a machine-learning model to predict some output values. There will always be some error in its predictions.

We can decompose this prediction error into three parts:

Error due to bias
Error due to variance
Irreducible error
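Under standard assumptions (squared-error loss and additive noise), this decomposition can be written as Expected error = Bias² + Variance + Irreducible noise. The sketch below (a hypothetical setup: sine-shaped ground truth, noise level 0.3, and a deliberately too-simple quadratic model) estimates each term empirically by retraining on many resampled datasets:

```python
import numpy as np

rng = np.random.default_rng(42)
noise_sd = 0.3          # assumed irreducible noise level

def true_f(x):
    return np.sin(2 * np.pi * x)   # assumed ground truth

x_test = 0.25           # a single test point, for clarity

def train_and_predict(degree, n=30):
    # Draw a fresh training set and return the model's prediction at x_test.
    x = rng.uniform(0, 1, n)
    y = true_f(x) + rng.normal(0, noise_sd, size=n)
    return np.polyval(np.polyfit(x, y, degree), x_test)

# Retrain a deliberately too-simple (degree-2) model 2000 times.
preds = np.array([train_and_predict(degree=2) for _ in range(2000)])

bias_sq = (preds.mean() - true_f(x_test)) ** 2   # error due to bias
variance = preds.var()                           # error due to variance
irreducible = noise_sd ** 2                      # noise floor

print(f"bias^2      = {bias_sq:.3f}")
print(f"variance    = {variance:.3f}")
print(f"irreducible = {irreducible:.3f}")
print(f"expected total error = {bias_sq + variance + irreducible:.3f}")
```

Because the quadratic model cannot represent a full sine wave, the bias term dominates here; the irreducible term is fixed by the noise in the data and no model choice can reduce it.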

Decomposing the Error:

Imagine you are a party planner trying to guess how much food to order for your guests. Here’s how the different errors in your prediction would play out:

Bias: This is like consistently underestimating how much your friends eat. Maybe you forget a key factor, like the fact that there is always a big group with healthy appetites, so you always order too little food (your predictions are consistently off in the same direction).

Variance: This is like your predictions being all over the place. Sometimes you order just the right amount, but other times you overestimate or underestimate. This happens because you focus too much on what your friends ate last time, which might not be a good indicator of how much they will eat this time.

Irreducible Error: This is the error from unexpected events you cannot plan for, like someone bringing a surprise dish or a guest arriving with a smaller appetite than usual.

End Goal

Our goal is to make an overall good prediction, that is, a model with both low bias and low variance.

The Bias-Variance Tradeoff:

The key takeaway is that bias and variance involve a tradeoff: reducing one often increases the other.
A high-bias model typically has low variance (underfitting), while a high-variance model typically has low bias (overfitting). The ideal scenario lies in achieving a balance between the two: a model with low bias and low variance. This sweet spot minimizes the overall error, leading to accurate and generalizable predictions on unseen data.
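One way to see the tradeoff numerically is to sweep model complexity and watch the test error fall and then rise again. The sketch below (hypothetical setup: polynomial regression against a sine-shaped ground truth, noise level 0.5, 20 training points) compares an underfit, a balanced, and an overfit degree:

```python
import numpy as np

rng = np.random.default_rng(7)

def true_f(x):
    return np.sin(2 * np.pi * x)   # assumed ground truth

def avg_test_mse(degree, trials=200, n_train=20, noise_sd=0.5):
    # Average test error of a degree-`degree` polynomial model,
    # estimated over many independently drawn noisy training sets.
    x_train = np.linspace(0, 1, n_train)
    x_test = np.linspace(0.05, 0.95, 50)
    errs = []
    for _ in range(trials):
        y_train = true_f(x_train) + rng.normal(0, noise_sd, size=n_train)
        pred = np.polyval(np.polyfit(x_train, y_train, degree), x_test)
        errs.append(np.mean((pred - true_f(x_test)) ** 2))
    return float(np.mean(errs))

mses = {d: avg_test_mse(d) for d in (1, 3, 9)}
for degree, mse in mses.items():
    print(f"degree {degree}: average test MSE = {mse:.3f}")
```

In this setup, degree 1 underfits (high bias), degree 9 overfits (high variance), and degree 3 sits near the sweet spot with the lowest average test error.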

Conclusion

Today, we learned about the bias-variance tradeoff, including what bias and variance mean, illustrated with a real-life example.

Stay tuned for more on this topic. Next, we will dive deeper into regularization and its types.
