Claude Code's 1M Context Window Is Live. Here's What Actually Changed.

The 1 million token context window for Claude Code reached general availability on March 13. No extra charge, available immediately on Opus 4.6 and Sonnet 4.6. The previous limit was 200K tokens. Five times larger, same price.
I've been running sessions on a production codebase since then. Here's what changed in daily work. And the part the announcement glosses over.
What actually improved in real sessions
The most concrete change is compaction frequency.
In long Claude Code sessions, when the agent hits its context ceiling, it compacts. It summarizes what it has been working with. Some precision gets lost. You end up re-feeding context you spent twenty minutes building up earlier in the session. Not a crash. Just a slow drift away from the specifics.
Since the 1M GA, I've seen roughly 15% fewer compaction events in sessions that previously pushed past 200K tokens. Sessions under 150K feel identical. In sessions that routinely grew past 180K, the multi-file refactors and long debugging sequences on features touching several modules, the friction drops noticeably.
The other practical change: the workarounds I built up over months are less critical. Careful .claudeignore files. Targeted --include flags at session start. Manual context pruning before starting complex tasks. These still help for precision. They're no longer required to keep sessions functional.
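For reference, this is the kind of .claudeignore I mean. The patterns here are invented for illustration, not from any one project; the idea is excluding bulk that rarely helps the agent:

```
# Generated and vendored code the agent never needs
node_modules/
ios/Pods/
build/
coverage/
*.lock

# Large fixtures that burn context without adding signal
__fixtures__/
*.snap
```

With the 1M window, forgetting a pattern like this no longer breaks the session. It just wastes tokens.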
Less friction at the ceiling. Not a transformation of how Claude Code sessions work. A relief valve that fires less often.
The pricing difference that matters at scale
OpenAI's GPT-5.4 also ships with a 1M token context window. The capability is broadly comparable. The cost model is not.
Anthropic's 1M window is flat-rate. One price from token 1 to token 1,000,000. No surcharge, no tier, no retroactive repricing past a threshold.
GPT-5.4 charges double past 272K tokens. That repricing applies to the entire request, not just the portion above the threshold. A 400K token session costs 2x the base rate for all 400K tokens, not for the 128K tokens above 272K.
For interactive Claude Code sessions, this doesn't matter. You pay a flat subscription. But if you're evaluating tools for a team, running agent pipelines in CI, or making a tooling recommendation to a CTO, the cost model is a real consideration. A pipeline running 50 sessions per day at 400K tokens average is a very different budget conversation on each platform.
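The repricing mechanics are easy to get wrong, so here is a minimal sketch of the two cost models. The $3-per-million rate is a placeholder for illustration, not either vendor's actual price:

```python
def flat_cost(tokens: int, rate_per_m: float) -> float:
    """Flat-rate model: one price from token 1 to token 1,000,000."""
    return tokens / 1_000_000 * rate_per_m

def threshold_cost(tokens: int, rate_per_m: float,
                   threshold: int = 272_000, multiplier: float = 2.0) -> float:
    """Repricing model: past the threshold, the multiplier applies to
    the ENTIRE request, not just the tokens above the threshold."""
    if tokens > threshold:
        rate_per_m *= multiplier
    return tokens / 1_000_000 * rate_per_m

RATE = 3.0          # hypothetical $/1M tokens, illustration only
SESSION = 400_000   # the 400K session from the example above

per_session_flat = flat_cost(SESSION, RATE)           # ~ $1.20
per_session_repriced = threshold_cost(SESSION, RATE)  # ~ $2.40, 2x on all 400K

# The pipeline from above: 50 sessions/day at 400K average
daily_flat = 50 * per_session_flat          # ~ $60/day
daily_repriced = 50 * per_session_repriced  # ~ $120/day
```

The sketch makes the sharp edge concrete: a session one token over the threshold doubles in cost, so workloads that hover around 272K are the worst case for the repricing model.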
Context rot: what the announcement doesn't say
Research from Chroma and Epsilla on long-context model behavior shows that reasoning quality degrades non-linearly as context grows. Not proportionally.
A session at 800K tokens doesn't perform 20% worse than a session at 200K tokens. In tasks requiring sustained multi-file reasoning or the ability to hold an architectural constraint through a long session, degradation reaches 40 to 60%. Chroma's data shows the drop isn't linear. There's a steeper cliff past certain thresholds. The model isn't reading tokens uniformly; attention degrades the further back a token sits.
The 1M window didn't change this. It changed the ceiling, not the degradation curve.
In practice: you can load a large codebase into a single session. Whether the agent reasons well over it depends on how the context is structured, not just how many tokens it contains. A raw file dump performs worse than a structured architectural summary. Broad inclusion performs worse than targeted loading.
On a production React Native project, I've consistently gotten better results from sessions that load the relevant module, the relevant conventions from CLAUDE.md, and the specific files in scope, than from sessions that load the full repository. Full-repo context works for quick search-style queries. For multi-step reasoning on a complex feature, tighter context is faster and more precise.
The 1M window is headroom. Use it when you hit the ceiling. Don't use it as a substitute for thinking about what the agent actually needs.
I go deeper on context design and session structure in the agentic coding course.
The structure work still matters
There's a temptation that comes with a bigger window: stop thinking about context and load everything. I've run sessions this way since March 13. It's convenient. It produces worse output for anything architecturally complex.
The sessions that work well in production are still the structured ones.
A CLAUDE.md that explains codebase architecture and conventions, not just the file tree. Skills that load targeted domain knowledge for specific task types. Clear scope at session start, so the agent isn't reasoning over every component when you're working on one subsystem.
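To make that concrete, here is the shape such a CLAUDE.md takes. The module names and conventions are invented for illustration:

```markdown
# Architecture
- React Native app; navigation lives in src/navigation, one stack per domain.
- Server state goes through hooks in src/api; components never fetch directly.

# Conventions
- New screens colocate styles and register their route in the domain stack.
- Feature flags are read through a useFlag() hook, never from env vars.

# Scope hints
- Payments work starts in src/modules/payments; legacy/ is out of scope.
```

Note what isn't there: no file tree, no exhaustive listing. Constraints and starting points, which is what the agent actually reasons with.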
A focused 80K token session with a well-built CLAUDE.md will outperform an unfocused 900K token session on most real engineering tasks. The 1M window is a ceiling, not a floor you should fill.
What I'm actually watching
The 1M window is the infrastructure story of this month. Two other developments are more interesting structurally.
Computer use landed in Claude Code on March 23 for Pro and Max users. An agent can now open a browser, navigate to a staging environment, and verify UI behavior without a test harness. I've been testing this for feature verification on a production app. It's not a replacement for scripted tests, and it fails predictably on anything repetitive. But for exploratory checks (does this modal appear after a cold launch? does the navigation state persist correctly?), it covers ground that a Playwright suite doesn't.
The bigger shift is multi-agent teams, currently in research preview. One lead agent, multiple specialist agents, each with its own context window and Git worktree. The compaction problem largely dissolves when agents don't share context. Each specialist starts focused on its slice. Early experiments show better output on complex multi-component work than a single long session, because each agent's context stays tight.
The 1M window extends the useful life of the current model: one agent, one growing context. Multi-agent teams make that model optional. That's the shift worth watching.
If you're seeing regular compaction in sessions past 180K tokens, the 1M GA helps immediately. If you're designing agent systems from scratch, reach for smaller agents with focused contexts. One large agent with everything loaded is the wrong model for complex work, regardless of how large the window gets.
I cover context design, session structure, and multi-agent orchestration in the agentic coding course.
Learn the agentic coding workflow I use in production
How I set up my repos, manage context, and run agents in production. Written down so you can do the same.