Claude Code Rate Limits 2026: Every Cap, What Burns Tokens Fast, and How to Stop Hitting Walls

LIMITS AT A GLANCE — JUNE 2026

● Free: No Claude Code access (confirmed June 2026)

● Pro $20/mo: ~45 prompts per 5-hour window (rolling). Hits wall in 15-30 min of heavy agentic use

● Max 5x $100/mo: ~225 prompts per 5-hour window. Most developers never hit this in normal use

● Max 20x $200/mo: ~900 prompts per 5-hour window. Effectively uncapped for individuals

● Agent SDK (since June 15): Separate monthly credit pool of $20 on Pro, $100 on Max 5x, $200 on Max 20x

● Shared pool: Claude Code, Claude.ai chat, and Cowork all draw from the same quota

● Reset: Rolling 5-hour window from your first prompt. Weekly cap resets Sunday midnight PT

The Three-Layer Limit System — Why Your Dashboard Lies

Claude Code does not have one limit — it has three stacked limits that operate independently. Hitting any one of them blocks you, but your usage dashboard only shows Layer 3. This is why developers see their dashboard at 6% and still get rate-limited.

Layer 1 — Tokens Per Minute (TPM)

A per-minute ceiling on total tokens processed. Claude Code sessions with large file contexts can hit the TPM ceiling even with low overall daily usage. Each file read, bash command, and tool call generates tokens beyond what you see typed. A single refactor of a 2,000-line file can spike TPM. You get a 429 error, wait 30-60 seconds, and continue — but the interruption breaks flow.

Layer 2 — Requests Per Minute (RPM)

A cap on how many API calls your account makes per minute. Agentic Claude Code sessions make many rapid tool calls in sequence: file reads, code edits, bash runs, searches. Each is a separate API request. A single autonomous session can hit RPM limits within minutes even on Max plans, causing brief pauses between tool executions.
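For scripts that call the API directly, the "wait 30-60 seconds and continue" pattern from Layers 1 and 2 can be automated. The following is a minimal sketch, not an official client feature: RateLimited is a hypothetical exception your own code would raise on a 429 response, and the 30-second base and 60-second cap simply mirror the wait times quoted above.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error raised by your own code when a request returns HTTP 429."""


def call_with_backoff(request, max_attempts=5, base_delay=30.0, max_delay=60.0,
                      sleep=time.sleep):
    """Call request() and retry on RateLimited, waiting longer each time."""
    for attempt in range(max_attempts):
        try:
            return request()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of retries: surface the error to the caller
            # Double the wait per attempt, cap it, then add up to 10% jitter
            delay = min(max_delay, base_delay * (2 ** attempt))
            sleep(delay + random.uniform(0, delay * 0.1))
```

The jitter keeps several parallel scripts from retrying at the same instant. The sleep parameter exists only so the function can be exercised without real waiting.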

Layer 3 — 5-Hour Rolling Window + Weekly Cap (what your dashboard shows)

The most visible limit. A rolling 5-hour window starts with your first prompt. Pro gets approximately 45 prompts, Max 5x approximately 225, Max 20x approximately 900. These are approximate because Anthropic publishes only multipliers, not exact token quotas. The window resets 5 hours after the first prompt that consumed quota. A weekly cap also applies: hit it and you wait until Sunday midnight PT.
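The reset arithmetic is easy to get wrong because the clock starts at the first prompt, not at the moment you get blocked. A small sketch of the calculation, assuming the 5-hour window behaves exactly as described above (the function names and timestamps are illustrative):

```python
from datetime import datetime, timedelta

WINDOW = timedelta(hours=5)  # rolling window length described above


def window_reset(first_prompt: datetime) -> datetime:
    """Return when the window opened by first_prompt resets."""
    return first_prompt + WINDOW


def time_until_reset(first_prompt: datetime, now: datetime) -> timedelta:
    """Return the remaining wait, or zero if the window has already reset."""
    return max(window_reset(first_prompt) - now, timedelta(0))


first = datetime(2026, 6, 20, 8, 30)   # first prompt of the window
hit = datetime(2026, 6, 20, 10, 0)     # moment the cap was hit
print(window_reset(first))             # 2026-06-20 13:30:00
print(time_until_reset(first, hit))    # 3:30:00
```

In this example the cap is hit 90 minutes into the window, so the wait is 3.5 hours rather than a full 5.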

Limits by Plan — The Real Numbers

Plan	Price	~Prompts / 5-hr window	Hits wall when...	Agent SDK pool
Free	$0	No Claude Code	—	—
Pro	$20/mo	~45 prompts	15-30 min heavy agentic use	$20/mo credits
Max 5x	$100/mo	~225 prompts	Heavy all-day autonomous runs	$100/mo credits
Max 20x	$200/mo	~900 prompts	Rare — most individuals never hit this	$200/mo credits
API (pay-as-you-go)	Per token (~$15K/yr at heavy use)	No rolling window	RPM/TPM only (no multi-hour lockout)	n/a

Anthropic does not publish exact token quotas — only multipliers. The prompt counts above are community-derived estimates from r/ClaudeCode and r/ClaudeAI reports (March-June 2026). Actual limits depend on prompt length, context size, model choice (Opus burns faster than Sonnet), and tool calls.

The Critical June 15 Change — Agent SDK Split

On June 15, 2026, Anthropic split programmatic Claude Code usage into its own monthly credit pool. This affects you if you run Claude Code via the Agent SDK, claude -p scripts, GitHub Actions, or third-party agentic apps. Interactive terminal use is unaffected.

Before June 15: All Claude Code usage — interactive and programmatic — drew from the same 5-hour rolling pool.
After June 15: Interactive terminal usage keeps the rolling window. Programmatic usage (SDK, scripts, CI/CD) draws from a separate monthly credit pool: $20 on Pro, $100 on Max 5x, $200 on Max 20x. Overages bill at API list rates. If your GitHub Actions pipeline runs Claude Code nightly, check your usage now.
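To check whether a nightly pipeline fits inside the monthly credit pool, a rough estimate is enough. This sketch assumes credits are consumed at per-token rates. The token counts and the $3 / $15 per million token prices are placeholders for illustration, not published figures, so substitute the current list rates for your model.

```python
def monthly_sdk_cost(runs_per_month, input_tokens_per_run, output_tokens_per_run,
                     input_price_per_mtok, output_price_per_mtok):
    """Estimated dollars of programmatic usage per month at per-token rates."""
    per_run = (input_tokens_per_run / 1_000_000 * input_price_per_mtok
               + output_tokens_per_run / 1_000_000 * output_price_per_mtok)
    return runs_per_month * per_run


def overage(monthly_cost, included_credit):
    """Dollars billed beyond the plan's included credit (never negative)."""
    return max(0.0, monthly_cost - included_credit)


# Placeholder inputs: a nightly run, 2M input + 100k output tokens per run,
# priced at a hypothetical $3 / $15 per million tokens.
cost = monthly_sdk_cost(30, 2_000_000, 100_000, 3.0, 15.0)
print(round(cost, 2))                 # 225.0
print(round(overage(cost, 100), 2))   # 125.0 beyond a $100 Max 5x pool
```

Under these placeholder numbers a single nightly job would exhaust a Max 5x pool in under two weeks, which is the kind of surprise worth catching before the bill arrives.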

What Burns Tokens Fastest — The Agentic Multiplier

Claude Code does not send a single prompt and wait. Each session is a multi-turn conversation that includes the system prompt, accumulated conversation history, file contents pulled into context, and tool-use tokens from every file read, bash execution, and code edit. Community data consistently shows agentic tasks consuming 3-5x more tokens than equivalent chat usage.

Highest burn rate: Claude Opus model selection (3-5x vs Sonnet). Large file context (reading entire directories). Repeated correction iterations. Running multiple terminal sessions simultaneously (they compete for the same pool rather than each getting their own).

Medium burn rate: Multi-file refactors. Long conversation history without resets. Test generation across large codebases. Any task involving codebase-wide search.

Lowest burn rate: Single-file edits with tight scope. Starting fresh conversations. Using Sonnet instead of Opus. Documentation tasks with small context. Code review on isolated functions.

7 Proven Workarounds — In Order of Impact

1. Create a .claudeignore file

The single highest-impact workaround. Exclude everything Claude Code does not need to read, for example: node_modules/, dist/, build/, .git/, *.lock, *.log, coverage/, .next/. This prevents Claude Code from loading irrelevant files into context and saves significant quota per session. It takes 2 minutes to set up and can cut token consumption by an estimated 20-40%.
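Before trusting a savings figure, you can measure how much of your own repository those patterns would exclude. The sketch below is an estimate under stated assumptions: it uses simplified matching (directory names and filename globs, not full gitignore semantics) and a rough 4-characters-per-token heuristic rather than a real tokenizer.

```python
import fnmatch
import os

IGNORE = ["node_modules/", "dist/", "build/", ".git/", "*.lock", "*.log",
          "coverage/", ".next/"]
CHARS_PER_TOKEN = 4  # rough heuristic, not a real tokenizer


def is_ignored(rel_path, patterns=IGNORE):
    """True if a parent directory or the filename matches any pattern."""
    parts = rel_path.replace(os.sep, "/").split("/")
    for pattern in patterns:
        if pattern.endswith("/"):
            # Directory pattern: match any parent directory by name
            if pattern.rstrip("/") in parts[:-1]:
                return True
        elif fnmatch.fnmatch(parts[-1], pattern):
            return True
    return False


def estimate_tokens(root, patterns=IGNORE):
    """Return (kept, skipped) token estimates for files under root."""
    kept = skipped = 0
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root)
            tokens = os.path.getsize(full) // CHARS_PER_TOKEN
            if is_ignored(rel, patterns):
                skipped += tokens
            else:
                kept += tokens
    return kept, skipped
```

Calling estimate_tokens(".") from a project root gives a rough idea of how much context the ignore list keeps out of reach. Claude Code does not read every file on every task, so treat the skipped figure as an upper bound on savings.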

2. Scope every task explicitly

Write: "Refactor only src/auth/login.ts and src/auth/session.ts. Do not touch any other files." Tight scoping reduces the number of files Claude Code reads and the number of correction iterations needed. Unscoped tasks invite Claude Code to explore the codebase — each exploration costs tokens.

3. Use Sonnet for routine tasks, Opus only for hard ones

Opus burns 3-5x faster than Sonnet. Route test generation, documentation, boilerplate, and simple fixes to Sonnet. Reserve Opus for architecture decisions, complex multi-file refactors, and debugging chains where reasoning depth genuinely matters. This alone can double your effective session time on Max plans.
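The "double your session time" claim can be sanity-checked with simple arithmetic. This sketch assumes quota burn scales linearly with model cost and that the work you move to Sonnet takes the same number of tokens there, both simplifications. The function name and parameters are illustrative.

```python
def effective_multiplier(sonnet_share, opus_cost_ratio):
    """How much further the same quota goes versus running everything on Opus.

    sonnet_share: fraction of the workload moved to Sonnet (0..1)
    opus_cost_ratio: how many times faster Opus burns quota than Sonnet
    """
    if not 0.0 <= sonnet_share <= 1.0 or opus_cost_ratio <= 0:
        raise ValueError("sonnet_share must be in [0, 1] and ratio positive")
    remaining_cost = (1 - sonnet_share) + sonnet_share / opus_cost_ratio
    return 1 / remaining_cost


# Moving two thirds of the work to Sonnet, across the 3-5x range quoted above
for ratio in (3, 4, 5):
    print(ratio, round(effective_multiplier(2 / 3, ratio), 2))
# 3 1.8
# 4 2.0
# 5 2.14
```

Under these assumptions, doubling requires routing roughly two thirds of your work to Sonnet. Moving only a quarter of it yields a much smaller gain.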

4. Start fresh conversations after 20 turns

Conversation history accumulates in every context window. Each new turn resends everything before it, so turn 50 carries roughly 50 turns of history while turn 1 carries almost none. After major task completion, start a new conversation. Use Projects to maintain file context without carrying conversation history. This is the most underused workaround for Pro plan users.
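The cost of a long conversation grows faster than its turn count, because the history is resent on every turn. A toy model makes this visible. The 1,000 tokens per turn and 5,000 tokens of base context are made-up round numbers, and the model ignores prompt caching and any summary carried into a fresh conversation.

```python
def cumulative_tokens(turns, tokens_per_turn=1_000, base_context=5_000):
    """Total input tokens sent over a conversation where every turn
    resends the base context plus all turns so far."""
    total = 0
    for turn in range(1, turns + 1):
        total += base_context + turn * tokens_per_turn
    return total


one_long = cumulative_tokens(50)          # a single 50-turn conversation
five_short = 5 * cumulative_tokens(10)    # the same 50 turns, reset every 10
print(one_long)    # 1525000
print(five_short)  # 525000
```

In this toy model, resetting every 10 turns sends about a third of the tokens for the same number of turns.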

5. Schedule heavy tasks outside peak hours (if applicable)

Anthropic removed peak-hour throttling for Pro and Max on May 6, 2026 — so this matters less than it did in March. But API users still see higher RPM/TPM limits during off-peak hours (evenings and weekends PT). If you run overnight CI/CD pipelines, scheduling during non-peak hours reduces the chance of 429 errors.

6. Plan your heaviest work at the start of a window

The rolling window resets 5 hours from your first prompt — so the clock starts the moment you send anything. Do your largest refactors, test suites, and autonomous runs at the start of a fresh window. Leave documentation, code review, and light edits for the end of a window. Never start a major autonomous run at hour 4.

7. For Pro users: try Projects + batched prompts before upgrading

Using Claude Code inside a Project maintains file context without rebuilding it each conversation. Batching multiple related requests into a single well-structured prompt reduces the number of round-trips and the accumulated history overhead. Community reports suggest this combination solves the limit problem for approximately 60% of Pro users who were considering upgrading to Max.

When to Upgrade vs When to Switch to API

Stay on Pro if:

You hit limits at most 1-3 times per week. Your workflow is episodic (focused session, break, session). You are not running overnight autonomous agents. Try the .claudeignore fix and conversation resets first: community reports suggest workarounds like these solve the limit problem for roughly 60% of Pro users.

Upgrade to Max 5x ($100) if:

You hit limits daily during normal working sessions. You are a full-time engineer using Claude Code as a primary tool for 4+ hours per day. You are running multi-file autonomous sessions regularly. This removes 90%+ of rate limit friction for individual developers.

Switch to API if:

You need guaranteed throughput for CI/CD pipelines. Your combined team spending exceeds $400/month. You need custom rate limits negotiated with Anthropic Enterprise. Note: at heavy agentic volumes, API billing runs $15,000+/year — significantly more than Max 20x. Test API billing with $20 of credits before committing.

Frequently Asked Questions

How long until Claude Code rate limits reset?

The 5-hour window resets on a rolling basis, 5 hours after the first prompt that consumed quota, not 5 hours after you hit the cap. If your first prompt was at 10am, the window resets at 3pm, even if you only hit the limit at 1pm. This is why most users see usage restored within 2-4 hours of being blocked: the window started well before they hit the cap. There is no hard daily reset at midnight.

Why does my usage jump to 100% on a single prompt?

This is the Layer 1/2 vs Layer 3 disconnect. Your dashboard shows the weekly Layer 3 quota. A single large Opus prompt on an extensive codebase can hit the per-minute TPM ceiling (Layer 1), which appears as immediate 100% usage because the dashboard mislabels the error. Starting a new conversation resets the context size, which typically resolves the TPM spike.

Does running multiple terminal sessions help avoid limits?

No. All Claude Code sessions are tied to your account, so multiple terminals running simultaneously consume quota faster, not slower. Each session competes for the same pool: running 3 terminals with similar workloads roughly triples your token burn rate.

Is there a way to see how much usage I have left?

Not directly — Claude Code does not expose a real-time usage meter. Your dashboard shows the weekly Layer 3 quota percentage, which does not reflect Layer 1 or Layer 2 limits. You find out you have hit the limit when the error message appears. The only workaround is manual tracking: some teams write proxy scripts that log token consumption per session against the Rate Limits API endpoint.
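Manual tracking does not need to be elaborate. The sketch below is a purely local log, assuming the window behaves as described in this article (it opens at the first prompt and resets 5 hours later). The UsageLog class, the JSON field names, and the budget value are all hypothetical, and it does not call any Anthropic endpoint.

```python
import json
import time

WINDOW_SECONDS = 5 * 60 * 60  # the 5-hour window described in this article


class UsageLog:
    """Local append-only prompt log with a per-window remaining estimate."""

    def __init__(self, path, budget):
        self.path = path
        self.budget = budget        # assumed prompts per window, e.g. 45 on Pro
        self.window_start = None    # timestamp of the window's first prompt
        self.used = 0

    def _roll(self, now):
        # Close the window once 5 hours have passed since its first prompt
        if self.window_start is not None and now - self.window_start >= WINDOW_SECONDS:
            self.window_start = None
            self.used = 0

    def record(self, session, tokens, now=None):
        """Log one prompt and count it against the current window."""
        now = time.time() if now is None else now
        self._roll(now)
        if self.window_start is None:
            self.window_start = now
        self.used += 1
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps({"t": now, "session": session,
                                "tokens": tokens}) + "\n")

    def remaining(self, now=None):
        """Estimated prompts left in the current window."""
        now = time.time() if now is None else now
        self._roll(now)
        return max(0, self.budget - self.used)
```

Because the published limits are approximate, the remaining count is only an early warning, not a guarantee. The per-session token field is logged so you can see afterwards which sessions were the expensive ones.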
