Overfitting
Definition
When a model learns the training data too well, including its noise and peculiarities, and fails to generalize to new, unseen data.
Overfitting is one of the most fundamental challenges in machine learning. An overfit model achieves excellent performance on the training data but poor performance on test or real-world data because it has memorized specific examples rather than learned generalizable patterns. The telltale sign of overfitting is a large gap between training accuracy and validation accuracy. Common prevention techniques include regularization (L1/L2), dropout, early stopping, data augmentation, and simply gathering more training data. Interestingly, very large neural networks sometimes exhibit "double descent": test error improves with model size, worsens near the interpolation threshold where the model can just barely fit the training set, and then improves again at very large scale. Understanding and managing overfitting remains a critical skill in applied machine learning.
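The train/validation gap described above can be demonstrated with a minimal sketch: a 1-nearest-neighbour classifier memorizes its training set perfectly (every point is its own nearest neighbour), so its training accuracy is 100% while its accuracy on fresh data from the same distribution is lower. The synthetic dataset and helper names (make_data, knn_predict) here are illustrative, not from any particular library.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_data(n):
    # Two overlapping Gaussian classes: some points are inherently
    # ambiguous, so a model with 100% training accuracy has memorized noise.
    X0 = rng.normal(-1.0, 1.5, size=(n, 2))
    X1 = rng.normal(+1.0, 1.5, size=(n, 2))
    return np.vstack([X0, X1]), np.array([0] * n + [1] * n)

X_train, y_train = make_data(100)
X_test, y_test = make_data(100)

def knn_predict(X_ref, y_ref, X, k):
    # Classify each row of X by majority vote among its k nearest
    # reference points (Euclidean distance).
    d = np.linalg.norm(X[:, None, :] - X_ref[None, :, :], axis=2)
    idx = np.argsort(d, axis=1)[:, :k]
    return (y_ref[idx].mean(axis=1) > 0.5).astype(int)

def accuracy(y_true, y_pred):
    return float(np.mean(y_true == y_pred))

# k=1 memorizes: every training point is its own nearest neighbour.
train_acc_1 = accuracy(y_train, knn_predict(X_train, y_train, X_train, 1))
test_acc_1 = accuracy(y_test, knn_predict(X_train, y_train, X_test, 1))

# A larger k averages over nearby labels, smoothing out the noise.
test_acc_15 = accuracy(y_test, knn_predict(X_train, y_train, X_test, 15))

print(f"k=1:  train {train_acc_1:.2f}, test {test_acc_1:.2f}")
print(f"k=15: test {test_acc_15:.2f}")
```

Increasing k here plays the same role as the regularization techniques listed above: it constrains how closely the model can follow individual training examples, trading a little training accuracy for better generalization.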
Related Terms
Regularization
A set of techniques used to prevent overfitting by adding constraints or penalties to the model during training.
Data Augmentation
Techniques for artificially expanding training datasets by creating modified versions of existing data.
Underfitting
When a model is too simple to capture the underlying patterns in the data, resulting in poor performance on both training and new data.
Training Data
The dataset used to teach a machine learning model, consisting of examples from which the model learns patterns.