Agentic RAG
Plain Explanation
Traditional RAG fetches context once and hopes the answer is inside that single batch, which makes it struggle on multi-hop questions. Agentic RAG puts a planner in the loop, like a careful librarian who adjusts queries, tries another catalog, reads specific passages, and only then compiles notes. The agent iterates—retrieve, read, assess gaps, re-retrieve, and validate—until evidence suffices, then composes the answer. The control policy decides when to search, which interface to use (keyword, semantic, or chunk read), and what to keep or discard before generation.
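The retrieve-assess-re-retrieve loop can be sketched in a few lines. This is a minimal illustration with stub tools and a naive tool-switching policy; all function names (`keyword_search`, `semantic_search`, `evidence_sufficient`, `agentic_rag`) are hypothetical, and a real system would call an LLM planner and actual retrieval backends.

```python
def keyword_search(query):
    # Stub lexical search over a tiny in-memory corpus.
    corpus = {"policy 2021": "Policy changed in 2021.",
              "policy 2023": "Policy revised again in 2023."}
    return [text for key, text in corpus.items() if query.lower() in key]

def semantic_search(query):
    # Stub: a real system would embed the query and rank by similarity.
    return ["Policy revised again in 2023."]

def evidence_sufficient(notes, needed_terms):
    # Gap check: do the collected notes mention every required term?
    text = " ".join(notes).lower()
    return all(term in text for term in needed_terms)

def agentic_rag(question, needed_terms, max_steps=3):
    notes = []
    tools = [("keyword", keyword_search), ("semantic", semantic_search)]
    for step in range(max_steps):
        _, tool = tools[step % len(tools)]   # naive tool-switching policy
        notes.extend(tool(question))
        if evidence_sufficient(notes, needed_terms):
            break                            # policy decides to stop early
    return notes

notes = agentic_rag("policy", needed_terms=["2021", "2023"])
```

The loop stops as soon as the gap check passes, so a question answerable from the first fetch never pays for extra retrieval steps.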
Examples & Analogies
- Policy Q&A with evidence checks: a timeline question needs documents from different years; the agent starts broad, then narrows with exact terms and cites both sources.
- Engineering runbook lookup: the agent reads a small chunk, detects a dependency, then retrieves a parent summary to confirm the wider procedure before drafting steps with citations.
- Literature mini-survey: the agent alternates between semantic search for coverage and chunk reads for details, pruning irrelevant hits until notes align with scope.
At a Glance
| Aspect | Traditional RAG | Active RAG | Agentic RAG |
|---|---|---|---|
| Control policy | Fixed, one-shot | Retrieve-on-confidence in one pass | Explicit planner with multi-step control |
| Iterations | None | Local triggers | Global, multi-turn loops |
| Tool use | Single retriever | Single retriever | Multiple tools (keyword, semantic, chunk read) |
| Context construction | Static top-k | Injected during generation | Dynamic select, discard, re-retrieve |
| Termination | After one fetch | End of generation | Policy decides when to stop |
Agentic RAG separates planning from generation so the system can iterate, switch tools, and stop based on evidence sufficiency rather than a fixed pipeline.
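That separation can be pictured as a policy function that maps the current state to an action, with generation running only after the policy stops. The sketch below uses hypothetical names (`policy`, `run`, the action strings) and stubbed tool results, not any specific framework's API.

```python
def policy(state):
    # Explicit multi-step control: choose a tool or decide to stop.
    if not state["notes"]:
        return ("keyword_search", state["question"])
    if state["gaps"]:
        return ("semantic_search", state["gaps"][0])
    return ("stop", None)

def run(question):
    state = {"question": question, "notes": [], "gaps": ["timeline"]}
    trace = []
    while True:
        action, arg = policy(state)
        trace.append(action)
        if action == "stop":
            break                       # termination is a policy decision
        # Stub tool call; a real system would invoke retrieval here.
        state["notes"].append(f"{action}({arg})")
        if action == "semantic_search":
            state["gaps"].clear()       # pretend the gap was filled
    return trace

trace = run("When did the policy change?")
```

Because the trace records every decision, a failed run can be replayed to see which step went wrong, which is what makes the operational debugging described below possible.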
Where and Why It Matters
- Complex questions: if the first retrieval misses evidence, the planner can reformulate the query, switch retrieval tools, or inspect narrower chunks.
- Token control: the system can read selectively instead of stuffing every top-k result into the prompt, keeping context smaller and cleaner.
- Internal knowledge bases: policy, code, support, and research corpora often have hierarchy, so stepwise section-to-chunk reading is more reliable than one flat search.
- Operational debugging: logged trajectories show whether a failure came from bad retrieval, wrong tool choice, weak reranking, or a verifier that stopped too early.
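The section-to-chunk pattern from the runbook example can be made concrete with a toy hierarchical corpus. The corpus layout and helper names here are illustrative assumptions, not a standard schema.

```python
# Toy hierarchical corpus: each document has a parent summary plus chunks.
corpus = {
    "runbook": {
        "summary": "Restart procedure: drain traffic, stop service, start service.",
        "chunks": {
            "restart": "Stop the service. Depends on: drain step.",
            "drain": "Drain traffic from the node before stopping.",
        },
    },
}

def read_chunk(doc, chunk_id):
    return corpus[doc]["chunks"][chunk_id]

def read_parent_summary(doc):
    return corpus[doc]["summary"]

def lookup(doc, chunk_id):
    notes = [read_chunk(doc, chunk_id)]
    # The agent detects a dependency in the chunk and climbs to the
    # parent summary to confirm the wider procedure before drafting.
    if "Depends on" in notes[0]:
        notes.append(read_parent_summary(doc))
    return notes

notes = lookup("runbook", "restart")
```

Reading the small chunk first and escalating only on demand keeps the prompt short in the common case while still recovering full context when the chunk alone is ambiguous.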
Common Misconceptions
- ❌ "Agentic RAG means more retrieval." → ✅ It means policy-driven retrieval that can also discard, switch tools, or stop early.
- ❌ "Active RAG and Agentic RAG are the same." → ✅ Active RAG triggers retrieval within one pass; Agentic RAG uses a multi-step planner.
- ❌ "Agents remove the need for verification." → ✅ Research emphasizes step-level checks and verifiers to control risks.
How It Sounds in Conversation
- "Promote this to Agentic RAG so the planner can re-query when the first pass looks thin."
- "Expose keyword, semantic, and chunk-read tools, but cap the loop at three retrieval steps."
- "The failure was a tool-selection issue, not a generation issue; it should have narrowed lexically before semantic search."
- "The answer is correct, but one sentence is not covered by evidence. Add an evidence-coverage check before generation."
- "Log process scores so we can see whether failures come from query rewriting, chunk reading, or early termination."
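The evidence-coverage check mentioned above can be sketched with a simple token-overlap heuristic. This is an assumption-laden toy (`covered`, `uncovered_sentences`, and the 0.5 threshold are all illustrative); production systems typically use an entailment model rather than word overlap.

```python
def covered(sentence, evidence, threshold=0.5):
    # Toy heuristic: fraction of the sentence's words that appear in
    # the best-matching evidence passage.
    words = set(sentence.lower().split())
    best = max(len(words & set(e.lower().split())) / len(words)
               for e in evidence)
    return best >= threshold

def uncovered_sentences(answer_sentences, evidence):
    # Flag any answer sentence the retrieved evidence does not support.
    return [s for s in answer_sentences if not covered(s, evidence)]

evidence = ["The policy changed in 2021.",
            "It was revised again in 2023."]
answer = ["The policy changed in 2021.",
          "Managers approved the change unanimously."]
flagged = uncovered_sentences(answer, evidence)
```

A flagged sentence can trigger one more retrieval step or be dropped from the draft, which is exactly the "check before generation" behavior the quote describes.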
References
- A-RAG: Scaling Agentic Retrieval-Augmented Generation via Hierarchical Retrieval Interfaces
Introduces agent-exposed keyword, semantic, and chunk tools; shows gains with efficient retrieved tokens.
- SoK: Agentic Retrieval-Augmented Generation (RAG): Taxonomy, Architectures, Evaluation, and Research Directions
Formalizes Agentic RAG as sequential decision-making; details taxonomy, risks, and evaluation.
- RAG-Gym: Optimizing Reasoning and Search Agents with Process Supervision
Framework for step-level supervision; reports sizable gains and verifier transferability.
- What is Agentic RAG?
Plain-language overview of how agents add adaptability to RAG pipelines.
- All You Need to Know About Chunking in Agentic RAG
Cost-aware, hierarchical retrieval and reranking notes with example token budgets.
- What Is Agentic RAG? From LLM RAG to AI Agents
Explains single vs multi-agent designs and iterative retrieval with validation.