Hybrid BM25 + embedding search — Glossary

BM25 is precise but brittle: it cannot recover a result whose wording differs from the query. Embeddings are robust to phrasing but can miss exact identifier matches. Hybrid retrieval runs both and merges the two result lists, which is why production agentic search systems converge on this pattern rather than picking one.
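
The brittleness is easy to see in a toy scorer. The sketch below is a minimal BM25 written for illustration only, not the scorer used in any particular index; the function name, the k1 and b defaults, and the two sample documents are all assumptions. The second document describes the same idea as the first but shares no terms with the query, so its score is exactly zero.

```python
import math
from collections import Counter
from typing import List


def bm25_scores(query: List[str], docs: List[List[str]],
                k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Score each tokenized doc against the query with a basic BM25 formula."""
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: number of docs that contain each term at least once.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query:
            if tf[term] == 0:
                continue  # a term absent from the doc contributes nothing
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical two-document corpus: same concept, different wording.
docs = [
    "retry the request with exponential backoff".split(),
    "call fetch again after a growing delay".split(),
]
scores = bm25_scores("exponential backoff".split(), docs)
```

Here scores[0] is positive and scores[1] is 0.0, even though a reader would consider both documents relevant. An embedding model is what lets the second document surface.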

Anthropic’s own guidance for Claude Code leans a different way — agentic search beats semantic search: an agent iteratively grepping and reading outperforms one-shot embedding retrieval on code, because the agent refines its query from what it finds. We treat the two as complementary rather than competing. Agentic search is the default inside a session; the hybrid index is what the agent’s search tool hits when the corpus is too large to walk — cross-repo lookups, documentation, past incident reports. The 2025–2026 agentic-IR literature (SpIDER, AgentIR-4B) lands on the same architecture: an agent in the loop, backed by a hybrid retriever, beats either pure approach on multi-hop code questions.

In our stack the hybrid index sits behind a single MCP search tool: BM25 from a Lucene-style index, dense vectors from a code-tuned embedding model, reciprocal-rank fusion to merge the two rankings.
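
Reciprocal-rank fusion can be sketched in a few lines. This is a minimal illustration rather than the code behind the search tool: the function name, the sample file names, and the constant k = 60 (a commonly used default) are assumptions. Each document earns 1 / (k + rank) from every ranking it appears in, and the sums decide the final order, so the raw BM25 and cosine scores never need to be put on a common scale.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]],
                           k: int = 60) -> List[str]:
    """Fuse ranked lists of document IDs; each list is ordered best-first."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists from the two retrievers.
bm25_hits = ["auth.py", "session.py", "token.py"]
dense_hits = ["login_flow.md", "auth.py", "session.py"]
fused = reciprocal_rank_fusion([bm25_hits, dense_hits])
# fused == ["auth.py", "session.py", "login_flow.md", "token.py"]
```

Documents that both retrievers return rise to the top, while a document found by only one retriever still appears in the fused list instead of being dropped.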

When to use

Agent-facing search over a large code or document corpus where both identifier-grade lookups and concept-grade lookups are valid queries.
MCP servers exposing a single search tool to the agent — hybrid keeps the contract simple while serving both query styles.

Compare with

Pure-embedding search avoids maintaining a BM25 index but loses precision on identifier queries. Pure-BM25 search avoids the embedding cost but loses recall on conceptual queries. Hybrid pays both costs and avoids both failure modes.

Related terms