Back to GlossaryCore Concepts

Reinforcement Learning

Definition

A type of machine learning where an agent learns to make decisions by taking actions in an environment and receiving rewards or penalties as feedback.

Reinforcement Learning (RL) is inspired by behavioral psychology and the way humans learn through trial and error. An agent interacts with an environment, observes states, takes actions, and receives rewards. The goal is to learn a policy that maximizes cumulative reward over time. Key algorithms include Q-learning, policy gradient methods, and actor-critic approaches. RL achieved major milestones including AlphaGo defeating the world champion at Go and training robotic hands to solve Rubik's cubes. RL from Human Feedback (RLHF) has become a critical technique for aligning large language models with human preferences.

Companies in Core Concepts

View Core Concepts companies →