Claude Code v2.1.89: PreToolUse Defer, 300K max_tokens GA, and 20+ Fixes That Actually Matter

The Headline Numbers

Claude Code v2.1.89 ships 20+ changes. The ones that matter:

"defer" decision in PreToolUse hooks — hooks can now pause and re-evaluate
PermissionDenied hook notifications — know when auto-approval was blocked
CJK/emoji history bug fixed — 4KB limit causing history corruption in Asian-language codebases
autocompact infinite loop fixed — sessions with large context were occasionally spinning
300K max_tokens GA — Opus 4.6 and Sonnet 4.6 via Message Batches API
Fine-grained tool streaming GA — streaming partial tool results as they arrive
Structured outputs GA — reliable JSON schema enforcement in responses

Hook System: What `"defer"` Actually Enables

The existing PreToolUse hook had two options: "allow" (proceed) or "block" (stop). The new "defer" option adds a third: pause and re-evaluate with additional context.

Practical use case: you have a hook that checks whether a file edit is safe. Previously, you had to make that decision with only the tool call parameters. With "defer", your hook can:

Receive the tool call parameters
Fetch additional context (check git status, read dependent files, query your own policy service)
Return a final "allow" or "block" with full context

This is particularly valuable for agentic workflows where the safety of an action depends on accumulated state — what files have already been modified, what the current test status is, whether a deployment is in progress.

The PermissionDenied hook complements this: you now get a callback when an auto-approval is blocked, allowing your orchestration layer to detect and handle blocked actions rather than silently failing.

The CJK/Emoji History Bug: More Important Than It Sounds

The 4KB history corruption bug affected any Claude Code session using:

Chinese, Japanese, or Korean characters in code or comments
Emoji in commit messages, variable names, or output
Multi-byte UTF-8 sequences in filenames or strings

The bug caused history entries to get truncated at byte boundaries rather than character boundaries, which could corrupt the serialized history. In multi-session workflows, corrupted history meant lost context and unpredictable behavior.

If you work with CJK codebases (including common Korean startup tech stacks) or use emoji extensively — update immediately. This is a correctness fix, not just a cosmetic improvement.

300K max_tokens: What You Can Actually Do With It

Message Batches API now supports 300,000 max_tokens per request with Opus 4.6 and Sonnet 4.6. To put this in practical terms:

Task	Approximate Token Usage
Full Next.js project (typical)	50,000–100,000 tokens
Large monorepo service	100,000–200,000 tokens
Full codebase review + recommendations	200,000–300,000 tokens
Multi-file refactoring with tests	100,000–150,000 tokens

Single-call workflows that previously required chunking and multi-step orchestration can now run as atomic operations. For batch processing pipelines — nightly code reviews, automated documentation generation, large-scale analysis — this dramatically simplifies architecture.

The catch: 300K token calls are expensive in absolute terms. This is a tool for high-value batch workloads, not interactive use.

Fine-Grained Tool Streaming and Structured Outputs

Tool streaming: Previously, tool results were returned as a complete block when the tool finished. Now, streaming partial tool results as they arrive is supported. For tools that return large outputs (search results, file contents, API responses), this means the model can begin processing early results while the tool is still running.

Structured outputs: JSON schema enforcement is now GA. If you specify an output schema, the API guarantees the response matches it — no more post-processing to handle format violations or adding retry logic for malformed JSON. This was technically possible before via prompt engineering; now it’s a first-class API feature with hard guarantees.

Upgrade Path

# Claude Code CLI update
claude update

# Verify version
claude --version
# Should show 2.1.89 or later

If you’re using hooks, review your PreToolUse hook implementations to see if any could benefit from the defer pattern. The PermissionDenied hook is worth adding to any production automation that uses auto-approval — it gives you observability into approval failures that were previously invisible.

The Bigger Picture

This release is largely infrastructure and stability — hooks becoming more expressive, bugs fixed, API capabilities graduating to GA. The pattern Anthropic is following: land features in beta, gather production feedback, ship GA when they’re stable.

Structured outputs and fine-grained streaming graduating to GA in the same release as the 300K token expansion suggests the focus for H1 2026 is batch processing and automation reliability rather than new interactive features.

For developers building on Claude Code: the platform is getting more reliable and more capable of handling production workloads. The tooling for production-grade automation is clearly the current investment priority.

Full changelog: Claude Code Changelog | Platform Release Notes