Agentic Search Over Vector Embeddings
Problem
Vector embeddings for code search come with ongoing costs:
- Continuous re-indexing as code changes
- Special handling for local uncommitted changes
- Additional security surface area for enterprise deployments
- Infrastructure overhead (embedding models, vector databases)
- Stale indices when developers work on multiple branches
Traditional RAG approaches add complexity that may be unnecessary with capable modern LLMs.
Solution
Replace vector search with agentic search using bash, grep, file traversal, and other command-line tools. Modern LLMs are skilled enough at using search tools iteratively to achieve comparable accuracy without the maintenance burden of vector indices.
Key approach:
- Tool-based search: Provide grep, ripgrep, find, ls, and other search utilities
- Iterative refinement: Let the agent search multiple times, narrowing results
- No pre-indexing: Search happens on-demand against current file state
- Optional MCP integration: If teams want semantic search, expose it via an MCP tool
# Instead of (pseudocode):
vector_db.index(codebase)  # must be re-run whenever the code changes
results = vector_db.query(embedding(query))
# Use (pseudocode):
agent.call_tool("grep", pattern="function.*authenticate")  # search file contents
agent.call_tool("find", pattern="**/auth/*.ts")  # search file paths
agent.refine_search_based_on_results()  # repeat with narrower patterns
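To make the pseudocode above concrete, here is a minimal sketch of the two search tools an agent might be given. It is an illustration under assumptions, not Claude Code's implementation: the function names `grep` and `find` and their signatures are hypothetical, and pure-Python stand-ins are used in place of the real command-line utilities so the example runs anywhere.

```python
import re
from pathlib import Path


def grep(root: str, pattern: str, glob: str = "**/*") -> list[tuple[str, int, str]]:
    """Return (relative path, line number, stripped line) for each regex match.

    Hypothetical stand-in for a grep/ripgrep tool: reads the files as they
    exist on disk right now, so no index has to be built or refreshed.
    """
    regex = re.compile(pattern)
    hits = []
    for path in sorted(Path(root).glob(glob)):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for lineno, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((path.relative_to(root).as_posix(), lineno, line.strip()))
    return hits


def find(root: str, glob: str) -> list[str]:
    """Return relative paths of files matching a glob such as '**/auth/*.ts'.

    Hypothetical stand-in for a find/fd tool.
    """
    return sorted(
        p.relative_to(root).as_posix() for p in Path(root).glob(glob) if p.is_file()
    )
```

Calling `grep(repo_root, r"function.*authenticate")` or `find(repo_root, "**/auth/*.ts")` mirrors the two tool calls in the pseudocode. Because both functions read the working tree directly, uncommitted edits and branch switches are reflected on the next call.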
How to use it
When to use agentic search:
- Codebases with frequent changes
- Teams without dedicated vector infrastructure
- Security-sensitive deployments (fewer external dependencies)
- Local development where files change constantly
- Multi-branch workflows
Implementation:
- Provide comprehensive search tools (grep, ripgrep, find, fd, ast-grep)
- Give the agent permission to search iteratively
- Optimize for fast tool execution rather than perfect first results
- Describe useful search strategies to the agent in the system prompt
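The iterative part of the implementation can be sketched as a loop that tries progressively narrower patterns until the result set is small enough to read. This is a simplified illustration under assumptions: in a real agent the model chooses each next pattern after seeing the previous results, whereas here a fixed list stands in for those decisions. The names `search`, `iterative_search`, and `max_hits` are hypothetical.

```python
import re
from pathlib import Path


def search(root: str, pattern: str) -> list[str]:
    """Return 'path:line' for every line under root that matches the regex."""
    regex = re.compile(pattern)
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            lines = path.read_text(encoding="utf-8").splitlines()
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for lineno, line in enumerate(lines, start=1):
            if regex.search(line):
                hits.append(f"{path.relative_to(root).as_posix()}:{lineno}")
    return hits


def iterative_search(root: str, patterns: list[str], max_hits: int = 5):
    """Try patterns in order (broad to narrow) and stop at the first
    non-empty result set with at most max_hits entries.

    Returns (pattern_used, hits, history); history records the hit count
    of every attempt. pattern_used is None if no attempt qualified.
    """
    history = []
    for pattern in patterns:
        hits = search(root, pattern)
        history.append((pattern, len(hits)))
        if 0 < len(hits) <= max_hits:
            return pattern, hits, history
    return None, [], history
```

Each attempt is cheap, which is why fast tool execution matters more than getting the first pattern right: a broad pattern that returns too many hits simply costs one extra round before a narrower one is tried.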
Claude Code example:
Claude Code initially used vector embeddings but switched to pure agentic search for these reasons:
- Cleaner deployment: No indexing step, works immediately
- Local changes: Always searches current file state
- Security: Reduced attack surface for enterprise deployments
- Accuracy: Comparable results with Sonnet 4+ models
Trade-offs
Pros:
- No indexing infrastructure to maintain
- Always searches current state (no stale results)
- Works with local uncommitted changes
- Simpler security model
- Faster setup for new repositories
- No embedding model costs
Cons:
- May require multiple search iterations (more tokens)
- Slower on very large codebases (millions of files)
- Weaker semantic matching (e.g., a literal search for "authentication" will not find code that only says "login")
- Requires capable models (Sonnet 4+) for good results
- Higher latency for complex queries
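The semantic-matching limitation can be seen in a small example. A literal pattern only finds the exact word, so the agent has to try related terms itself, which is one reason extra iterations may be needed. The sketch below is illustrative only: the `RELATED_TERMS` table and `expand_query` helper are hypothetical, and a real agent would propose related terms on its own rather than read them from a fixed table.

```python
import re

# Hypothetical synonym table, for illustration only.
RELATED_TERMS = {
    "authentication": ["auth", "login", "sign_in", "signin", "credential"],
}


def expand_query(term: str) -> re.Pattern:
    """Build a case-insensitive regex matching the term or any related term."""
    terms = [term, *RELATED_TERMS.get(term, [])]
    alternation = "|".join(re.escape(t) for t in terms)
    return re.compile(alternation, re.IGNORECASE)


lines = [
    "def login(user, password):",
    "def render_homepage():",
    "class AuthenticationError(Exception):",
]

# Literal search: finds only the line that contains the exact word.
literal = [l for l in lines if re.search("authentication", l, re.IGNORECASE)]

# Expanded search: also finds the login function.
expanded = [l for l in lines if expand_query("authentication").search(l)]
```

Here `literal` contains only the `AuthenticationError` line, while `expanded` also picks up `login`. An embedding-based search might relate the two terms in a single query; with agentic search the same coverage comes from additional patterns, at the cost of more tokens.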
References
- Cat Wu (Anthropic): "We did use vector embeddings initially. They're really tricky to maintain because you have to continuously re-index... Claude is really good at agentic search. You can get to the same accuracy level with agentic search and it's just a much cleaner deployment story."
- Cat Wu: "If you do want to bring semantic search to Claude Code, you can do so via an MCP tool."
- AI & I Podcast: How to Use Claude Code Like the People Who Built It