
Large Language Model

Definition

A neural network with billions of parameters trained on massive text datasets, capable of understanding and generating human language with remarkable fluency.

Large Language Models (LLMs) are the driving force behind the current AI revolution. Models like GPT-4, Claude, Gemini, and Llama are built on the transformer architecture and trained on trillions of tokens of text. Their core objective is simple: predict the next token in a sequence. Through this objective alone, they develop sophisticated capabilities including reasoning, coding, translation, and creative writing. LLMs are typically pre-trained on general text and then refined with instruction tuning and RLHF so they follow user instructions safely. These models range from billions to potentially trillions of parameters and require massive GPU clusters to train. They have become the foundation for chatbots, coding assistants, search engines, and AI agents.
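To make the next-token objective concrete, here is a minimal toy sketch in Python. A bigram frequency table stands in for the billions-of-parameter transformer described above, but the decoding loop is the same in spirit: predict a distribution over the next token, sample one, append it, and repeat. All names here (`train_bigram`, `generate`) are illustrative, not part of any real LLM library.

```python
import random
from collections import defaultdict

def train_bigram(corpus: str) -> dict:
    """Count which token follows which in the training text.
    A real LLM learns these statistics (and far richer ones)
    in its neural network weights rather than a lookup table."""
    counts = defaultdict(lambda: defaultdict(int))
    tokens = corpus.split()
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(model: dict, prompt: str, max_tokens: int = 5, seed: int = 0) -> str:
    """Autoregressive decoding: sample the next token, append, repeat."""
    rng = random.Random(seed)
    out = prompt.split()
    for _ in range(max_tokens):
        followers = model.get(out[-1])
        if not followers:
            break  # no continuation seen in training
        tokens, weights = zip(*followers.items())
        out.append(rng.choices(tokens, weights=weights)[0])
    return " ".join(out)

corpus = "the model predicts the next token and the next token again"
model = train_bigram(corpus)
print(generate(model, "the"))
```

Real LLMs replace the frequency table with a transformer that conditions on the entire preceding context, not just the last token, which is what makes coherent long-form generation possible.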
