Safety

Jailbreak

Definition

A prompt engineering technique designed to bypass the safety guardrails of AI models, causing them to produce outputs they were trained to refuse. Jailbreaks exploit gaps between training objectives and actual model behavior. AI companies continuously patch jailbreaks while researchers discover new ones, creating an ongoing cat-and-mouse dynamic.

Related Terms

No related terms linked yet.

Explore all terms →

Explore companies in this space

Safety Companies

View Safety companies