Claude Code 2.1 introduces MCP Tool Search, cutting token usage by 85% and boosting accuracy from 79.5% to 88.1%. A deep dive into the biggest productivity upgrade since launch.

TL;DR: Claude Code 2.1 introduces MCP Tool Search, which automatically defers tool loading when descriptions exceed 10% of context. Result: 85% token reduction, accuracy jumping from 79.5% to 88.1% on Opus 4.5. Combined with the new hooks system and skills hot reload, this is the biggest productivity upgrade since launch.
If you've connected more than a few MCP servers to Claude Code, you've hit the wall. I had 7 servers running—GitHub, Slack, memory, sequential thinking, a couple custom ones—and my context was nearly exhausted before I typed a single prompt.
The numbers are brutal:
Thariq Shihipar, who announced the feature, put it plainly: "Users were documenting setups with 7+ servers consuming 67k+ tokens."
That's context pollution at scale. And Claude Code 2.1 fixes it.
Instead of loading every tool definition upfront, Claude Code now loads tools lazily: descriptions stay out of context until Claude searches for and selects the tools it actually needs.
The feature supports two search modes:
| Mode | How It Works | Best For |
|---|---|---|
| Regex | Claude constructs patterns like `weather` or `get_.*_data` | Precise matching when you know the tool name |
| BM25 | Natural language queries with semantic matching | Exploratory searches |
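As a conceptual illustration only (not the internal implementation), regex mode matches tool names the way `grep -E` matches lines; the tool names below are hypothetical:

```shell
# Hypothetical tool list; regex mode selects names the way grep -E would.
printf '%s\n' get_weather_data get_stock_data send_message \
  | grep -E 'get_.*_data'
# prints get_weather_data and get_stock_data
```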
Anthropic's benchmarks tell the story:
| Metric | Before | After | Improvement |
|---|---|---|---|
| Token usage (50+ tools) | ~77K | ~8.7K | 85% reduction |
| Opus 4 accuracy | 49% | 74% | +25 points |
| Opus 4.5 accuracy | 79.5% | 88.1% | +8.6 points |
This isn't incremental. It's a step change.
Version 2.1.0 shipped January 7, 2026. Version 2.1.9 followed with 109 CLI refinements. Across the 2.1.x series, the team shipped 1,096 commits. Here's what matters:
You don't need to opt in. It's enabled automatically for all users. When your MCP tool descriptions exceed the 10% threshold, Claude Code switches to search-based discovery.
If you want to force it on earlier or adjust behavior:
```shell
ENABLE_TOOL_SEARCH=true
```
Hooks let you inject custom logic at key points in Claude's workflow:
| Hook | When It Runs | Use Cases |
|---|---|---|
| PreToolUse | Before Claude executes any tool | Validation, blocking dangerous operations, input modification |
| PostToolUse | After Claude completes a tool | Cleanup, formatting, running tests |
| Stop | When Claude finishes responding | Final validation, logging |
| SessionStart | When a session begins | Environment setup |
| UserPromptSubmit | When you submit a prompt | Input preprocessing |
The power move: PreToolUse hooks can modify tool inputs before execution (starting in v2.0.10). Instead of blocking Claude and forcing retries, you intercept and correct.
Example use cases:
- Adding `--dry-run` flags to dangerous commands

Previously, creating or updating a skill required restarting your session. Now:
- Skills added to `~/.claude/skills` or `.claude/skills` are immediately available without a restart
- When you edit `packages/frontend/app.tsx`, Claude finds skills in `packages/frontend/.claude/skills/`

This enables real monorepo workflows where each package can have its own specialized skills.
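Scaffolding a package-local skill is just a directory plus a `SKILL.md` file. A sketch under the assumption that the frontmatter fields below (`name`, `description`) match the skills docs; adjust to your setup:

```shell
# Create a project-local skill for one monorepo package. With hot reload in
# 2.1.x, Claude picks it up without a session restart.
mkdir -p packages/frontend/.claude/skills/component-review
cat > packages/frontend/.claude/skills/component-review/SKILL.md <<'EOF'
---
name: component-review
description: Review React components for accessibility and prop naming
---
Check new components for ARIA attributes and consistent prop naming.
EOF
```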
- `Bash(*-h*)` grants help-flag access across all commands
- `/teleport` to Claude.ai: move your session to the web interface mid-conversation
- `Ctrl+B` to background long-running operations

MCP Tool Search isn't just a performance optimization. It changes how you can architect AI systems.
Before 2.1, you had to choose which MCP servers to connect, knowing each one consumed context. Complex setups meant tradeoffs.
With lazy loading, you can connect every server you might need. Claude discovers what's relevant. The constraint shifts from "what can I afford to load" to "what might be useful."
For enterprise deployments, this means you can standardize on a broad catalog of internal servers without rationing context.
PreToolUse hooks with input modification let you enforce policy at the point of execution, correcting inputs instead of rejecting them outright. This is how you build guardrails for AI coding assistants without breaking the developer experience.
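A minimal sketch of such a guard in shell. It assumes, per the hooks docs as I read them, that the hook receives the tool call as JSON on stdin and that exiting with code 2 blocks the call; the `tool_input.command` field name is likewise an assumption, so verify it against your version:

```shell
# Sketch of a PreToolUse guard. check_command inspects a shell command and
# returns 2 (block) for destructive patterns, 0 (allow) otherwise.
check_command() {
  case "$1" in
    *"rm -rf"*|*"git push --force"*)
      echo "Blocked: destructive command needs review" >&2
      return 2 ;;
  esac
  return 0
}

# In the actual hook script you would wire it to stdin, e.g.:
#   check_command "$(jq -r '.tool_input.command // empty')"; exit $?
```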
Does it hold up at scale? Fair question. Third-party testing shows mixed results at extreme scale:
| Scenario | Tool Search Accuracy |
|---|---|
| Anthropic internal (typical use) | 74-88% |
| 4,027 tools (Arcade testing) | 56-64% |
| Thousands of tools (Stacklok comparison) | ~30-34% |
The takeaway: MCP Tool Search works excellently for typical setups (5-50 tools). At thousands of tools, you'll want dedicated optimization like Stacklok's MCP Optimizer.
For most developers and architects, the default behavior is a massive win.
First, check your version:

```shell
claude --version
# Should be 2.1.x or higher
```

If you're behind, upgrade:

```shell
npm install -g @anthropic-ai/claude-code@latest
# or
claude update
```

Then confirm your MCP servers are connected:

```shell
claude mcp list
```
Create `~/.claude/settings.json`. Hooks are registered per event, each with a `matcher` and a list of commands:

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          { "type": "command", "command": "your-validation-script.sh" }
        ]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Write",
        "hooks": [
          { "type": "command", "command": "prettier --write $FILE_PATH" }
        ]
      }
    ]
  }
}
```
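Hooks receive the tool call as JSON on stdin rather than as arguments, so a command like the `prettier` one above typically sits behind a small script that extracts the file path first. A dependency-free sketch (the `file_path` field name is an assumption; a real hook would usually use `jq`):

```shell
# Crude stdin-JSON helper for a PostToolUse hook: pull out "file_path" so the
# script can run a formatter on it. sed-based to avoid external dependencies.
extract_file_path() {
  printf '%s' "$1" | sed -n 's/.*"file_path"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Usage inside a hook script:
#   file=$(extract_file_path "$(cat)")
#   [ -n "$file" ] && prettier --write "$file"
```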
**Is it enabled by default?** Yes. As of January 2026, it's on for all users automatically and triggers when tool descriptions exceed 10% of context.

**Does it work with custom MCP servers?** Yes. It works with any MCP server; the search happens client-side in Claude Code.

**Can you turn it off?** You can, but why would you? If you have specific needs, check Claude Code settings.

**Regex or BM25?** Regex is precise pattern matching (`get_.*_data`). BM25 is semantic search with natural language queries. Claude picks the appropriate mode automatically.

**Do hooks add latency?** The default hook timeout changed from 60 seconds to 10 minutes in 2.1.x, but well-written hooks add negligible overhead. Keep them fast.

**Is it cross-platform?** Yes. It works on macOS, Linux, and Windows. Skills are discovered from both global (`~/.claude/skills`) and project-local (`.claude/skills`) directories.
The constraint on MCP tool usage just disappeared. Build accordingly.
Have questions about implementing MCP Tool Search or hooks in your workflow? Check the AI Architecture Hub for blueprints and working examples.
Read on FrankX.AI — AI Architecture, Music & Creator Intelligence
Join 1,000+ creators and architects receiving weekly field notes on AI systems, production patterns, and builder strategy.
No spam. Unsubscribe anytime.