Regularization
Definition
A set of techniques used to prevent overfitting by adding constraints or penalties to the model during training, encouraging simpler and more generalizable solutions.
Regularization prevents models from memorizing training data by imposing additional constraints. L1 regularization (Lasso) adds the absolute values of the weights as a penalty, encouraging sparsity. L2 regularization (Ridge, or weight decay) adds the squares of the weights as a penalty, discouraging large weights. Dropout randomly deactivates neurons during training so the network cannot rely on any single neuron. Other techniques include batch normalization, label smoothing, and stochastic depth. For large language models, common regularization approaches include dropout, weight decay, and data-based strategies such as training on diverse data. The right amount of regularization balances underfitting (too much constraint) against overfitting (too little).
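The penalties and dropout described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production implementation; the regularization strength `lam`, keep probability `p_keep`, and the placeholder `data_loss` are hypothetical values chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 3))      # example weight matrix

lam = 0.01                       # hypothetical regularization strength

# L1 (Lasso) penalty: sum of absolute weights, encourages sparsity
l1_penalty = lam * np.abs(w).sum()

# L2 (Ridge / weight decay) penalty: sum of squared weights,
# discourages large weights
l2_penalty = lam * (w ** 2).sum()

# The penalty is simply added to the data loss during training
data_loss = 1.0                  # placeholder for, e.g., cross-entropy
total_loss = data_loss + l2_penalty

# Inverted dropout: randomly zero activations during training and
# rescale the survivors so the expected activation is unchanged
p_keep = 0.8
x = rng.normal(size=(2, 5))      # example activations
mask = rng.random(x.shape) < p_keep
x_dropped = np.where(mask, x / p_keep, 0.0)
```

At test time, inverted dropout simply uses the activations unchanged; the division by `p_keep` during training is what keeps the expected magnitude consistent between the two modes.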
Related Terms
Loss Function
A mathematical function that measures how far a model's predictions are from the actual target value...
Overfitting
When a model learns the training data too well, including its noise and peculiarities, and fails to ...
Underfitting
When a model is too simple to capture the underlying patterns in the data, resulting in poor perform...
Hyperparameter
A configuration value set before training begins that controls the training process itself, as oppos...