Loss Function
Definition
A mathematical function that measures how far a model's predictions are from the actual target values, providing the signal that drives the training process.
The loss function (also called the cost function or objective function) quantifies the model's error and is the quantity that training seeks to minimize. Common loss functions include cross-entropy loss for classification tasks, mean squared error for regression, and various specialized losses for generative models. For language models, the standard loss is cross-entropy between the model's predicted token distribution and the actual next token. The choice of loss function significantly shapes what the model learns — for example, using a perceptual loss instead of a pixel-wise loss for image generation tends to produce more visually pleasing results. Loss values are tracked during training to monitor convergence and to detect issues such as overfitting or training instability.
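As a minimal sketch, the two losses named above can be computed in plain Python (the function names and example numbers here are illustrative, not from any particular library):

```python
import math

def cross_entropy(probs, target_index):
    # Negative log-probability the model assigned to the correct
    # class (or, for a language model, the actual next token).
    return -math.log(probs[target_index])

def mean_squared_error(predictions, targets):
    # Average squared difference between predictions and targets.
    return sum((p - t) ** 2 for p, t in zip(predictions, targets)) / len(targets)

# A classifier that puts 70% probability on the correct class:
ce = cross_entropy([0.1, 0.7, 0.2], target_index=1)   # -ln(0.7) ≈ 0.357

# A regression model's predictions versus the true values:
mse = mean_squared_error([2.5, 0.0], [3.0, -0.5])     # 0.25
```

Note that cross-entropy falls toward zero as the probability on the correct token approaches 1, which is why minimizing it pushes the model toward confident, correct predictions.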
Related Terms
Backpropagation
The fundamental algorithm for training neural networks, which computes the gradient of the loss func...
Gradient Descent
An optimization algorithm that iteratively adjusts model parameters in the direction that most reduc...
Overfitting
When a model learns the training data too well, including its noise and peculiarities, and fails to ...
Perplexity
A metric that measures how well a language model predicts text, calculated as the exponential of the...