Anthropic Explains Why Claude Code Is Burning Through Your Limits So Fast
Anthropic has addressed widespread complaints about Claude Code's rapid usage drain, citing peak-hour rate caps and sessions whose contexts exceed one million tokens as the primary culprits, and recommending that users switch from Opus to Sonnet 4.6.

D.O.T.S AI Newsroom
AI News Desk
Anthropic has publicly addressed one of the most common complaints from Claude Code subscribers: usage limits that appear to drain far faster than expected. The explanation, delivered via Anthropic engineer Lydia Hallie, names two primary causes — and it's worth understanding both if you're a Claude Code power user.
Cause 1: Peak-Hour Rate Caps
Anthropic applies tighter rate limits during peak usage hours to maintain service quality across its user base. This means that the same Claude Code session that runs smoothly at 10 PM may hit limits more quickly during afternoon US working hours, when demand on the infrastructure is at its highest.
This is a standard cloud infrastructure practice, but it creates a frustrating user experience when the limits aren't clearly surfaced in real time. If your Claude Code sessions feel slower or more throttled at certain times of day, this is the mechanism behind that pattern.
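To make the mechanism concrete, here is a minimal sketch of peak-hour-aware throttling using a token bucket, a common pattern for this kind of limiting. Everything here is an illustrative assumption: the peak window, the refill rates, and the token-bucket design are hypothetical, not Anthropic's actual implementation.

```python
# Conceptual sketch of peak-hour-aware rate limiting: the bucket refills
# more slowly during an assumed peak window, so the same request pattern
# that sails through off-peak gets throttled in the afternoon.
PEAK_HOURS = range(13, 21)  # hypothetical peak window (hours of the day)

class TokenBucket:
    def __init__(self, off_peak_rate: float, peak_rate: float, capacity: float):
        self.off_peak_rate = off_peak_rate  # tokens refilled per second, off-peak
        self.peak_rate = peak_rate          # tighter refill rate during peak hours
        self.capacity = capacity
        self.tokens = capacity              # start with a full bucket

    def refill_rate(self, hour: int) -> float:
        return self.peak_rate if hour in PEAK_HOURS else self.off_peak_rate

    def try_consume(self, cost: float, hour: int, elapsed_s: float) -> bool:
        # Refill based on time elapsed since the last request, at the rate
        # for the current hour, then attempt to spend `cost` tokens.
        self.tokens = min(self.capacity,
                          self.tokens + self.refill_rate(hour) * elapsed_s)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

bucket = TokenBucket(off_peak_rate=100.0, peak_rate=40.0, capacity=1000.0)
print(bucket.try_consume(cost=500, hour=22, elapsed_s=0))  # True: full bucket at 10 PM
print(bucket.try_consume(cost=600, hour=15, elapsed_s=1))  # False: slow peak refill
```

The design choice that matters is that capacity regenerates more slowly under load, so a session's effective limit depends on when you run it, exactly the pattern users reported.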
Cause 2: Context Windows That Grow to 1M+ Tokens
The second cause is more directly user-controllable: sessions where the context window grows to one million tokens or larger. Claude Code accumulates context across a session — every file read, every tool call, every prior turn gets appended to the running context. In long coding sessions with large codebases, this context can balloon to sizes that consume disproportionate compute resources, even when the visible conversation appears modest.
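The cost dynamic above can be shown with back-of-the-envelope arithmetic: if every turn re-sends the full accumulated context, total tokens processed grow quadratically with the number of turns. The per-turn numbers below are illustrative assumptions, not measured Claude Code figures.

```python
# Why long sessions get disproportionately expensive: each turn appends
# more context, and the whole running context is reprocessed on every
# request, so total work grows quadratically even though the final
# context size grows only linearly.
def total_tokens_processed(turns: int, tokens_per_turn: int) -> int:
    """Sum of context sizes the model sees across all turns, assuming
    each turn appends `tokens_per_turn` of files, tool calls, and replies."""
    context = 0
    total = 0
    for _ in range(turns):
        context += tokens_per_turn  # file reads, tool output, prior turns pile up
        total += context            # the full context is sent again each turn
    return total

# 50 turns at 20k tokens/turn ends at a 1M-token context...
print(50 * 20_000)                         # 1000000
# ...but the model has processed 25.5M tokens along the way:
print(total_tokens_processed(50, 20_000))  # 25500000
```

This is why Hallie's advice to start fresh sessions works: resetting the context resets the quadratic term, even if the visible conversation looked modest.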
Hallie's recommendations are concrete: use Sonnet 4.6 instead of Opus where possible (Opus burns through limits roughly twice as fast), disable Extended Thinking when it's not needed for the specific task, start fresh sessions instead of extending old ones, and explicitly limit the context window size.
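The "roughly twice as fast" claim translates directly into session length. A quick hedged calculation, with hypothetical quota units and burn rates (only the 2x multiplier comes from the article):

```python
# Hypothetical quota arithmetic: if Opus draws down the same quota at
# roughly twice Sonnet's rate, the quota lasts about half as long.
QUOTA_UNITS = 1000            # hypothetical subscription quota
SONNET_UNITS_PER_HOUR = 50    # hypothetical Sonnet burn rate
OPUS_MULTIPLIER = 2           # per the article: Opus burns limits ~2x as fast

hours_on_sonnet = QUOTA_UNITS / SONNET_UNITS_PER_HOUR
hours_on_opus = QUOTA_UNITS / (SONNET_UNITS_PER_HOUR * OPUS_MULTIPLIER)
print(hours_on_sonnet)  # 20.0
print(hours_on_opus)    # 10.0
```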
Bug Fixes and What They Didn't Cover
Anthropic also reports shipping several bug fixes related to the usage tracking system. Importantly, Hallie specified that none of the bugs identified led to incorrect billing — they affected how usage was surfaced to users, not what users were actually charged.
In-product pop-ups have been added to give users earlier warning before hitting limits. Users still experiencing anomalous usage drain after applying the recommended settings are directed to submit feedback through Claude Code's built-in feedback function.
The Underlying Tension
This episode highlights a structural tension in agentic AI products: the most powerful use cases (long multi-file coding sessions, complex reasoning chains) are precisely the ones that consume the most resources. Flat-rate subscription pricing becomes increasingly difficult to sustain as usage patterns shift toward compute-intensive agentic workflows. Anthropic hasn't changed its pricing model, but the pressure is visible.