Model Fit, Bias and Variance
Model Fit
- Overfitting
  - Performs well on training data
  - Doesn't perform well on evaluation data
- Underfitting
  - Performs poorly on training data
  - Could indicate that the model is too simple or that the data features are poor
- Balanced ← Goal
- Example: overfitting, underfitting and balanced fit in regression models
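
The three kinds of fit can be reproduced on a toy regression problem. The sketch below is illustrative only: the synthetic data (`y = 3x` plus Gaussian noise), the sample sizes, and the three stand-in models (a constant predictor, a least-squares line, and a 1-nearest-neighbor memorizer) are all assumptions chosen to make the effect visible, not models taken from these notes.

```python
import random


def mse(model, xs, ys):
    """Mean squared error of `model` (a callable mapping x to a prediction)."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def make_data(rng, n):
    """Synthetic data: y = 3x + Gaussian noise (assumed toy relationship)."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [3.0 * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


def fit_constant(xs, ys):
    """Too simple: always predict the mean of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_linear(xs, ys):
    """Matches the data-generating process: ordinary least-squares line."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_nearest_neighbor(xs, ys):
    """Too flexible: memorize the training set (1-nearest neighbor)."""
    points = list(zip(xs, ys))
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]


rng = random.Random(0)  # fixed seed so the run is repeatable
train_x, train_y = make_data(rng, 200)
eval_x, eval_y = make_data(rng, 200)

candidates = [
    ("underfit", fit_constant),
    ("balanced", fit_linear),
    ("overfit", fit_nearest_neighbor),
]

results = {}
for name, fit in candidates:
    model = fit(train_x, train_y)
    results[name] = {
        "train": mse(model, train_x, train_y),
        "eval": mse(model, eval_x, eval_y),
    }
    print(f"{name:9s} train MSE={results[name]['train']:.2f} "
          f"eval MSE={results[name]['eval']:.2f}")
```

The pattern to look for: the constant model has high error on both sets, the memorizer has zero training error but a worse evaluation error than the line, and the line is close on both.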
Model Bias
- 🔧 Difference or error between predicted and actual value
- High bias → model doesn't closely match training data → underfitting
- Reduce by:
  - using a more complex model
  - increasing the number of features
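
As a rough illustration of reducing bias by adding features, the sketch below fits the same least-squares model twice on a quadratic target: once with only `x` as a feature, and once with `x` and `x^2`. The noise-free target, the feature choice, and the helper names (`solve`, `fit_least_squares`, `train_mse`) are assumptions made for this example.

```python
def solve(a, b):
    """Solve the linear system a @ w = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_least_squares(rows, ys):
    """Least-squares weights from the normal equations (X^T X) w = X^T y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    return solve(xtx, xty)


def train_mse(rows, ys, weights):
    """Mean squared error of the linear model on its own training data."""
    preds = [sum(w * v for w, v in zip(weights, r)) for r in rows]
    return sum((p - y) ** 2 for p, y in zip(preds, ys)) / len(ys)


xs = [i / 10 for i in range(-20, 21)]  # 41 points in [-2, 2]
ys = [x * x for x in xs]               # assumed noise-free quadratic target

simple_rows = [[1.0, x] for x in xs]         # intercept + x
richer_rows = [[1.0, x, x * x] for x in xs]  # intercept + x + x^2

simple_err = train_mse(simple_rows, ys, fit_least_squares(simple_rows, ys))
richer_err = train_mse(richer_rows, ys, fit_least_squares(richer_rows, ys))

print(f"training MSE with x only:    {simple_err:.4f}")
print(f"training MSE with x and x^2: {richer_err:.4f}")
```

With only `x`, the straight line cannot follow the curve, so the training error stays high (high bias). Adding `x^2` lets the same fitting procedure match the training data closely.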
Model Variance
- 🔧 Difference in model performance when the model is trained on a different dataset drawn from a similar distribution
- High variance → model is very sensitive to changes in the training data → good performance on one training dataset, bad performance on the evaluation dataset → overfitting
- Reduce by:
  - feature selection (fewer, more important features)
  - splitting the data into training and test datasets multiple times (e.g. k-fold cross-validation)
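
Splitting the data into training and test sets multiple times is commonly done as k-fold cross-validation. The sketch below is one minimal way to do it, assuming a synthetic dataset (`y = 3x` plus Gaussian noise), a least-squares line as the model, and `k = 5`. The helper names are hypothetical. The spread of the per-fold scores gives a rough sense of how sensitive the model is to the particular training data it sees.

```python
import random
import statistics


def k_fold_indices(n, k, rng):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    indices = list(range(n))
    rng.shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def line_mse(slope, intercept, xs, ys):
    """Mean squared error of the line on the given points."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def cross_validate(xs, ys, k, rng):
    """Train on k-1 folds, score on the held-out fold, once per fold."""
    scores = []
    for held_out in k_fold_indices(len(xs), k, rng):
        held = set(held_out)
        train_idx = [i for i in range(len(xs)) if i not in held]
        slope, intercept = fit_line([xs[i] for i in train_idx],
                                    [ys[i] for i in train_idx])
        scores.append(line_mse(slope, intercept,
                               [xs[i] for i in held_out],
                               [ys[i] for i in held_out]))
    return scores


rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [rng.uniform(0.0, 10.0) for _ in range(100)]
ys = [3.0 * x + rng.gauss(0.0, 1.0) for x in xs]  # assumed toy relationship

scores = cross_validate(xs, ys, k=5, rng=rng)
print("per-fold MSE:", [round(s, 2) for s in scores])
print(f"mean={statistics.mean(scores):.2f} stdev={statistics.stdev(scores):.2f}")
```

A large standard deviation across folds relative to the mean would suggest a high-variance model. Similar scores across folds suggest the model generalizes more consistently.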
Summary diagram