Accuracy
Definition
The proportion of correct predictions out of all predictions made by a model; the simplest and most intuitive classification metric.
Accuracy measures how often a model is right, calculated as (true positives + true negatives) / total predictions. While intuitive, accuracy can be misleading on imbalanced datasets: a model that always predicts "not fraud" achieves 99.9% accuracy on a dataset where 0.1% of transactions are fraudulent, despite being completely useless for fraud detection. For this reason, accuracy is typically supplemented with precision, recall, F1 score, and AUC-ROC in classification tasks. In the context of LLM evaluation, accuracy is commonly reported on multiple-choice benchmarks like MMLU, where it measures the percentage of questions answered correctly. Despite its limitations, accuracy remains the default reporting metric for many benchmarks because it is simple and universally understood.
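To make the imbalance pitfall concrete, here is a minimal sketch in plain Python. The 1,000-transaction dataset and the always-negative "model" are invented for illustration, mirroring the fraud example above:

```python
def accuracy(y_true, y_pred):
    """Accuracy = (true positives + true negatives) / total predictions."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# 999 legitimate transactions (label 0) and 1 fraudulent one (label 1)
y_true = [0] * 999 + [1]

# A useless model that always predicts "not fraud"
y_pred = [0] * 1000

print(f"Accuracy: {accuracy(y_true, y_pred):.1%}")  # prints "Accuracy: 99.9%"
# Recall on the fraud class is 0%: the model never catches the one fraud case,
# which is exactly why accuracy alone is misleading here.
```

The high score comes entirely from the majority class, which is why metrics like recall and F1 are reported alongside accuracy on imbalanced problems.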
Related Terms
Precision
The proportion of positive predictions that are actually correct, measuring how reliable a model's positive predictions are.
Benchmark
A standardized test or dataset used to evaluate and compare the performance of AI models on specific tasks.
F1 Score
The harmonic mean of precision and recall, providing a single metric that balances both the completeness of recall and the exactness of precision.
Recall
The proportion of actual positive cases that the model correctly identifies, measuring how completely a model captures the positive class.