
RAG (Retrieval-Augmented Generation)

Definition

A technique that enhances LLM responses by first retrieving relevant documents from an external knowledge base and including them in the prompt as context.

RAG addresses two major LLM limitations: outdated training data and hallucination. Instead of relying solely on the model's parametric knowledge (what it learned during training), a RAG system retrieves relevant documents from a vector database or search index and injects them into the prompt, grounding the model's response in specific sources. A typical RAG pipeline has four steps: (1) convert documents into embeddings and store them in a vector database, (2) embed the user's query and find the most similar documents, (3) include the retrieved documents in the LLM prompt, and (4) generate a response that cites those documents. RAG is a widely used approach for building enterprise AI applications because it can supply up-to-date information, reduces hallucination, and lets companies use their proprietary data without expensive fine-tuning.
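The pipeline steps described above can be sketched in a few lines of Python. This is a minimal illustration under loud assumptions, not a production design: the "embedding" is a toy bag-of-words count vector standing in for a learned embedding model, the "vector database" is an in-memory list, and the function names (`embed`, `build_index`, `retrieve`, `build_prompt`) and the sample documents are hypothetical. Step 4, the actual LLM call, is left out because it depends on the model provider.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy embedding (assumption): sparse bag-of-words term counts.
    A real system would call a learned embedding model here."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents):
    """Step 1: embed each document and store it (in memory, for illustration)."""
    return [(doc, embed(doc)) for doc in documents]


def retrieve(index, query, k=2):
    """Step 2: embed the query and return the k most similar documents."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]


def build_prompt(query, retrieved):
    """Step 3: place the retrieved documents in the prompt as numbered sources."""
    sources = "\n".join(f"[{i}] {doc}" for i, doc in enumerate(retrieved, start=1))
    return (
        "Answer using only the sources below and cite them by number.\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}\nAnswer:"
    )


# Hypothetical sample data, for illustration only.
documents = [
    "The refund policy allows returns within 30 days of purchase.",
    "Our office is closed on public holidays.",
    "Refund requests are processed within 5 business days.",
]

index = build_index(documents)
query = "How long does a refund take?"
top_docs = retrieve(index, query, k=2)
prompt = build_prompt(query, top_docs)
print(prompt)  # Step 4 would send this prompt to an LLM of your choice.
```

Numbering the sources in the prompt is one simple way to make citations possible: the model can refer to `[1]` or `[2]`, and the application can map those numbers back to the retrieved documents.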
