Introduction

Candidate generation that actually works: topK and thresholds, hybrid retrieval, routing, filters, and debugging.

Welcome to the Retrieval Module

Retrieval is where your RAG system either finds the right information or fails silently. You can have perfectly chunked content with excellent embeddings, but if your retrieval strategy doesn't surface the right chunks for a given query, the LLM has nothing useful to work with. Retrieval is the bridge between your indexed knowledge and the generation step, and getting it right requires more than just calling a vector search.

This module covers the practical techniques for reliable retrieval. We'll move beyond basic topK queries to explore hybrid approaches, query transformation, intelligent routing, and the filtering strategies that keep your system secure. By the end, you'll have a toolkit for diagnosing and improving retrieval quality in production.

What you'll learn in this module

By the end of this module, you will understand:

How to configure topK and thresholds: Why returning the top results isn't always enough, and how to detect when retrieval didn't find anything worth using.
When to use hybrid retrieval: Combining keyword (BM25) and semantic search to catch what vectors miss.
Query transformation techniques: Rewriting and decomposing queries to improve recall on complex or ambiguous questions.
Routing strategies: Directing queries to the right index or corpus instead of searching everything.
Secure filtering: Implementing access control at the retrieval layer, not as an afterthought.
Diversity and deduplication: Avoiding redundant results and increasing coverage.
Performance optimization: Caching, index tuning, and latency management.
Systematic debugging: A repeatable process for diagnosing retrieval failures.
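As a small preview of the first topic, here is a minimal sketch of combining topK with a score threshold so that a weak result set is reported as "no good match" instead of being passed to the LLM. Everything in it is an assumption made for illustration: the `retrieve` function, the `min_score` value, the two-dimensional toy vectors, and the use of plain cosine similarity in place of a real vector store.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, chunks, top_k=3, min_score=0.5):
    """Return up to top_k (score, text) pairs scoring at least min_score.

    An empty list is the 'no good match' signal. Both top_k and min_score
    are hypothetical defaults, not recommended values.
    """
    scored = [(cosine(query_vec, vec), text) for text, vec in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Apply the cutoff after ranking: topK alone would always return
    # something, even when nothing is actually relevant.
    return [(score, text) for score, text in scored[:top_k] if score >= min_score]


# Toy corpus: (text, embedding) pairs with made-up 2-D embeddings.
chunks = [
    ("refund policy", [1.0, 0.0]),
    ("shipping times", [0.0, 1.0]),
    ("return window", [0.8, 0.6]),
]

hits = retrieve([1.0, 0.0], chunks, top_k=2, min_score=0.7)
misses = retrieve([1.0, -1.0], chunks, top_k=2, min_score=0.9)

print([text for _, text in hits])
print("no good match" if not misses else misses)
```

The caller can branch on the empty list, for example by asking a clarifying question or answering "I don't know", rather than generating from irrelevant chunks. A sensible threshold depends on the embedding model and the similarity metric, so it is something to measure on your own data.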

Chapters in this module

topK, thresholds, and 'no good match'

Hybrid retrieval (BM25 + vectors)

Query rewriting and decomposition

Routing and multiple indexes

Filtering and ACL-safe retrieval

Diversity and deduplication

Caching and performance

Retrieval debugging playbook

Ready to begin?

Next: topK, thresholds, and 'no good match'
