RRF
Reciprocal Rank Fusion
Plain Explanation
RRF (Reciprocal Rank Fusion) merges several search result lists into one final ranking. The important trick is that it uses ranks, not raw scores. BM25, vector search, and other retrievers score documents on different scales, so adding their scores directly can be misleading. Instead, each list contributes 1/(k + rank) for every document it returns, where rank is the document's 1-based position in that list and k is a smoothing constant (60 is a common choice). RRF sums those contributions per document and sorts by the total, so documents that repeatedly appear near the top move upward.
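The summation described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `rrf_fuse`, the sample doc ids, and the alphabetical tie-break are assumptions made for the example, and each input list is assumed to contain a given doc id at most once.

```python
from collections import defaultdict


def rrf_fuse(ranked_lists, k=60):
    """Fuse ranked lists of doc ids into [(doc_id, score)], best first.

    Each list contributes 1 / (k + rank) per document, with 1-based ranks.
    Raw retriever scores are never looked at.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; ties fall back to doc id (an arbitrary choice).
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical result lists from two retrievers.
bm25_hits = ["d1", "d2", "d3"]
vector_hits = ["d3", "d1", "d4"]

fused = rrf_fuse([bm25_hits, vector_hits])
fused_ids = [doc_id for doc_id, _ in fused]
print(fused_ids)
```

Here `d1` and `d3` appear in both lists, so they outrank `d2` and `d4`, which each appear in only one.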
Examples & Analogies
- Multiple judge scorecards: each judge ranks candidates differently, but candidates that consistently appear near the top across judges rise in the final ranking. RRF does the same for retrievers.
- Hybrid help-center search: run BM25 and vector search in parallel, fuse with RRF to form a top-100 candidate pool, then let a Cross-Encoder pick the final top-10.
- RAG-Fusion with query rewrites: retrieve for several paraphrases of the same question, then use RRF so documents that survive across views rise above one-off matches.
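The fuse-then-rerank pattern from the help-center example might look like the sketch below. The `rerank_score` callable is a hypothetical stand-in for a Cross-Encoder (here just a lookup table), and `pool_size` and `final_size` are illustrative parameter names; a real pipeline would use top-100 and top-10 rather than the tiny values used for the demo.

```python
from collections import defaultdict


def rrf_fuse(ranked_lists, k=60):
    """Return doc ids ordered by summed 1 / (k + rank), best first."""
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ordered]


def retrieve_then_rerank(query, ranked_lists, rerank_score,
                         pool_size=100, final_size=10, k=60):
    """Stage 1: cheap RRF fusion builds the pool. Stage 2: rerank only the pool."""
    pool = rrf_fuse(ranked_lists, k=k)[:pool_size]
    reranked = sorted(pool, key=lambda doc_id: rerank_score(query, doc_id),
                      reverse=True)
    return reranked[:final_size]


# Hypothetical data: two retriever outputs and a toy relevance table that
# plays the role of the expensive reranker.
bm25_hits = ["a", "b", "c"]
vector_hits = ["c", "a", "d"]
toy_relevance = {"a": 0.2, "b": 0.9, "c": 0.5, "d": 0.1}


def toy_rerank_score(query, doc_id):
    return toy_relevance[doc_id]


final = retrieve_then_rerank("reset password", [bm25_hits, vector_hits],
                             toy_rerank_score, pool_size=3, final_size=2)
print(final)
```

The same `rrf_fuse` call works for RAG-Fusion: pass one ranked list per query rewrite instead of one per retriever.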
At a Glance
| | RRF | Cross-Encoder reranker | Single retriever |
|---|---|---|---|
| Role | Fast fusion for candidate recall | Precise scoring for a small pool | Simple baseline |
| Input | Result ranks | Query and document text | Retriever-native score |
| Strength | No score calibration needed | Strong precision | Easy to operate |
| Limitation | Needs useful candidate lists | High cost and latency | Recall-limited |
Where and Why It Matters
- Hybrid search default: safely combines retrievers with incompatible score scales.
- RAG candidate selection: widens recall across retrievers, then hands a bounded candidate pool to the expensive reranker.
- Operational simplicity: gives a strong baseline without score-normalization experiments.
- Observability: track Recall@K, list overlap, reranker input size, and p95 latency together.
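Two of the metrics named above, Recall@K and list overlap, are simple enough to sketch directly. The function names and sample data are assumptions for illustration, and Jaccard similarity over the top-K sets is only one reasonable way to define overlap.

```python
def recall_at_k(ranked, relevant, k):
    """Fraction of the relevant doc ids that appear in the top-k of `ranked`."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(ranked[:k]) & relevant) / len(relevant)


def overlap_at_k(list_a, list_b, k):
    """Jaccard overlap between the top-k sets of two ranked lists."""
    top_a, top_b = set(list_a[:k]), set(list_b[:k])
    union = top_a | top_b
    return len(top_a & top_b) / len(union) if union else 0.0


# Hypothetical retriever outputs and relevance judgments.
bm25_hits = ["d1", "d2", "d3", "d4"]
vector_hits = ["d3", "d5", "d1", "d6"]
relevant_docs = {"d1", "d5"}

bm25_recall = recall_at_k(bm25_hits, relevant_docs, k=3)
list_overlap = overlap_at_k(bm25_hits, vector_hits, k=3)
print(bm25_recall, list_overlap)
```

Very low overlap suggests the lists share little for RRF to reinforce, while very high overlap suggests the second retriever adds few new candidates.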
Common Misconceptions
- ❌ Myth: RRF averages BM25 and vector scores → ✅ Reality: it ignores raw scores and sums 1/(k + rank).
- ❌ Myth: smaller k is always better → ✅ Reality: k controls the balance between one list's top hit and cross-list consensus.
- ❌ Myth: RRF removes the need for reranking → ✅ Reality: RRF is usually candidate selection; a reranker still improves final precision.
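The effect of k can be checked with a small worked comparison. The two documents below are hypothetical: A is ranked first by one retriever and missed by the other, while B is ranked fourth by both. The helper name `rrf_score` is an assumption for this sketch.

```python
def rrf_score(ranks, k):
    """Sum of 1 / (k + rank) over the lists a document appears in (1-based ranks)."""
    return sum(1.0 / (k + rank) for rank in ranks)


top_hit_ranks = [1]       # doc A: rank 1 in one list, absent from the other
consensus_ranks = [4, 4]  # doc B: rank 4 in both lists

for k in (1, 60):
    score_a = rrf_score(top_hit_ranks, k)
    score_b = rrf_score(consensus_ranks, k)
    winner = "A (single top hit)" if score_a > score_b else "B (consensus)"
    print(f"k={k}: A={score_a:.4f} B={score_b:.4f} -> {winner}")
```

With k=1 the single top hit wins (1/2 against 2/5); with k=60 the consensus document wins (1/61 against 2/64). Which behavior is preferable depends on how much each individual retriever's top result can be trusted.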
How It Sounds in Conversation
- "Fuse BM25 and vector with RRF (k=60), then send only top-100 to the Cross-Encoder."
- "Low overlap means RRF cannot find much consensus, so let's improve query rewrites or the first-stage retrievers."
- "Canonicalize doc_ids and drop duplicates before fusion so the same document is not double-counted."
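The canonicalize-and-dedupe step from the last quote could be sketched as follows. The canonicalization rule here (trim whitespace, lowercase, drop a `#chunk` suffix) is purely an assumption for illustration; real systems would apply whatever rule maps their chunk or alias ids onto one document id.

```python
def canonicalize(doc_id):
    """Hypothetical rule: trim, lowercase, and strip a '#chunk' suffix."""
    return doc_id.strip().lower().split("#", 1)[0]


def dedupe_ranking(ranking):
    """Canonicalize ids and keep only the first (best-ranked) occurrence of each."""
    seen = set()
    cleaned = []
    for doc_id in ranking:
        canonical = canonicalize(doc_id)
        if canonical not in seen:
            seen.add(canonical)
            cleaned.append(canonical)
    return cleaned


# Hypothetical retriever output where one document shows up under two ids.
raw_hits = ["Doc-1#p2", "doc-1", " DOC-2 ", "doc-3#p1"]
clean_hits = dedupe_ranking(raw_hits)
print(clean_hits)
```

Running this on every list before fusion means each document receives at most one 1/(k + rank) contribution per list, taken at its best rank in that list.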
Related Reading
References
- Reciprocal Rank Fusion outperforms Condorcet and individual Rank Learning Methods
Original RRF paper describing the formula and rank-fusion behavior across IR systems.
- SearchGym: Cross-Platform Benchmarking and Hybrid Search Orchestration
Benchmarking and orchestration context for hybrid-search fusion strategies.
- What is hybrid search?
Elastic documentation explaining RRF and linear combination in hybrid retrieval.
- Qdrant Hybrid Queries
Qdrant documentation showing RRF-style hybrid query composition.