RAG Handbook
A practical guide to building retrieval-augmented generation systems that work in production.
This guide is for beginners learning how to build RAG systems and for teams deploying production-ready RAG applications at scale. It is framework- and tooling-agnostic and focuses on the ideas and design principles behind a RAG system.

Most teams building with LLMs eventually need to ground the model's responses in their own data. The standard approach is Retrieval-Augmented Generation: you search your content for relevant information, inject it into the model's context, and generate an answer that's informed by what you actually know rather than only by what the model learned during training.
The concept is simple. The execution is where things get interesting.
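To make the three steps concrete (search, inject, generate), here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function names are hypothetical, `retrieve` uses a toy word-overlap score in place of a real embedding or keyword index, and `generate` is a placeholder for whatever LLM call your stack provides.

```python
import re
from typing import Callable


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Search step: rank documents by how many words they share with the query.

    Toy scoring only; a real system would use embeddings, BM25, or both.
    """
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(doc)), doc) for doc in documents]
    # Drop documents with no overlap so irrelevant text never reaches the prompt.
    scored = [(score, doc) for score, doc in scored if score > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(query: str, passages: list[str]) -> str:
    """Inject step: place the retrieved passages into the model's context."""
    context = "\n".join(f"[{i}] {p}" for i, p in enumerate(passages, start=1))
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )


def answer(
    query: str,
    documents: list[str],
    generate: Callable[[str], str],  # hypothetical stand-in for an LLM call
    top_k: int = 2,
) -> str:
    """Generate step: pass the grounded prompt to the model."""
    passages = retrieve(query, documents, top_k)
    return generate(build_prompt(query, passages))
```

Passing `generate=lambda prompt: prompt` returns the assembled prompt unchanged, which is a convenient way to inspect what the model would see before wiring in a real model. Each function here corresponds to an area the later modules treat in depth, and each is far simpler than what a production system needs.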
This handbook walks through RAG from first principles to production operations. It covers the decisions you'll make (chunking strategies, embedding models, retrieval algorithms, reranking, prompt design), the tradeoffs behind those decisions (quality vs latency vs cost), and the failure modes you'll encounter in practice (false positives, hallucinations, stale data, permission leaks).
The goal is to help you understand RAG deeply enough to build systems that actually work, debug them when they don't, and improve them over time.
Who this is for
This handbook assumes you're comfortable reading code and have some familiarity with LLMs, but it doesn't assume prior RAG experience. If you're building your first retrieval system, start from the beginning. If you're debugging a production system that's misbehaving, jump to the module that matches your problem.
The material progresses from foundational concepts through increasingly production-focused topics. Early modules explain what RAG is and how the pieces fit together. Later modules cover evaluation, security, cost control, and operational patterns that matter when real users depend on your system.
How to use this guide
The handbook is organized as a linear progression: Module 0 through Module 8, plus an appendix of reference material. Each module builds on concepts from earlier modules, so reading in order works well if you're learning RAG from scratch.
If you're looking for specific guidance, here are two common paths:
Learning path: Start with Module 0 (Orientation) and Module 1 (Foundations) to understand the core concepts. Then work through Module 2 (Data and Ingestion) and Module 3 (Chunking) to understand how content gets into the system. Module 4 (Retrieval) covers how to get it back out. You can skim the later modules and return when you need them.
Production debugging path: If you're already running a RAG system and hitting problems, start with Module 0 to calibrate vocabulary, then jump to Module 7 (Evaluation) to set up measurement. From there, use the evaluation results to identify which module addresses your specific issues (chunking, retrieval, reranking, or generation).
About the examples
When examples appear, they're written to illustrate the general concept rather than any specific framework. When an example uses Unrag specifically, it's marked as one concrete implementation of the pattern. You should be able to apply the same ideas with any RAG stack.
Module overview
Module 0: Orientation
Understand RAG fundamentals, production architecture, use cases, and tradeoffs
Module 1: Foundations
Embeddings, chunking, metadata filtering, indexing, and score calibration
Module 2: Data and Ingestion
Content sources, pipelines, cleaning, document modeling, and updates
Module 3: Chunking
Chunk sizing, structure-aware strategies, multi-representation indexing
Module 4: Retrieval
topK and thresholds, hybrid retrieval, query rewriting, filtering, caching
Module 5: Reranking
Two-stage retrieval, cross-encoders, LLM reranking, context compression
Module 6: Generation
Grounding prompts, answer formats, chat, agents, hallucinations, UX patterns
Module 7: Evaluation
What to measure, building datasets, offline evals, online feedback
Module 8: Production
Observability, latency, cost controls, security, reliability, scaling
Appendix
Checklists, glossary, and failure modes reference
Quick navigation
New to RAG? Start with Module 0: Orientation to understand the fundamentals.
Building with Unrag? Check out the practical examples and guides alongside the handbook.
Debugging production issues? Jump to Module 7: Evaluation to set up measurement, then use results to identify which module addresses your specific problem.