If you take one concept from this entire chapter, let it be this one. The context window is Claude's most important resource — and the one most people ignore until it is too late.
What the context window actually is
The context window is Claude's working memory. Everything Claude knows about your current session lives in this buffer: the system prompt, your CLAUDE.md, every file it has read, every command it has run, every message you have sent, every response it has generated, and all its internal reasoning.
The default context window is 200,000 tokens. You can opt in to 1 million tokens by adding the [1m] suffix to your model in settings — for example, "model": "claude-opus-4-6[1m]". If you followed the setup from Lesson 2, you already have this. Either way, the budget goes faster than you think.
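As a concrete reference, the opt-in looks like this in your settings file. The surrounding structure and file location are assumptions (Claude Code reads settings from a JSON file such as `~/.claude/settings.json`); only the `model` value comes from the text above:

```json
{
  "model": "claude-opus-4-6[1m]"
}
```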
Think of the context window as RAM, not storage. Everything in it is active, available, and consuming space. There is no "disk" to swap to, no way to page things in and out. When the window fills up, something has to go.
What consumes context
What fills your context window in a typical session, roughly in order of cost:
Loaded at session start (before you type anything):
- System prompt and built-in tool descriptions
- Your CLAUDE.md file (and any project-level CLAUDE.md files)
- Skill descriptions (every installed skill adds a summary, even if not actively used)
- MCP server tool descriptions (each connected server adds its tool list)
- Initial session context
This is why the "don't install everything" warning from Lesson 2 matters here. Every skill and every MCP server costs tokens just by being connected — before you type a single message. Run /context to see exactly what is loaded and how much space it takes. You might be surprised how much of your budget is already spoken for.
Accumulated during the session:
- Every file Claude reads (full content of each file)
- Every Bash command output (build logs, test results, git diffs)
- Every message you send
- Every response Claude generates
- Extended thinking (internal reasoning tokens — these count)
- Web search and fetch results
| Item | Approximate cost |
|---|---|
| CLAUDE.md (well-structured) | ~800-1500 tokens |
| Plugin tool descriptions | ~2000-4000 tokens |
| Reading a 200-line file | ~1000-1500 tokens |
| Reading a 500-line file | ~3000-4000 tokens |
| Build output (success) | ~200-500 tokens |
| Build output (failure + errors) | ~1000-3000 tokens |
| A typical Claude response | ~500-2000 tokens |
| Extended thinking per turn | ~2000-10000 tokens |
Add it up. Ten file reads, five bash commands, and a few back-and-forth messages put you at 50,000-80,000 tokens. That is a quarter to 40 percent of a 200K budget, and you might not even be halfway through the task.
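Using midpoints of the per-item costs above, a back-of-envelope estimate bears this out. All numbers here are illustrative averages, not measurements:

```python
# Rough midpoints taken from the cost table above (illustrative only).
COSTS = {
    "file_read_500_lines": 3500,  # ~3000-4000 tokens
    "bash_output": 1000,          # mostly successes, one failure
    "user_message": 200,
    "claude_response": 1250,      # ~500-2000 tokens
    "thinking_per_turn": 4000,    # ~2000-10000 tokens
}

def estimate(file_reads: int, bash_commands: int, turns: int) -> int:
    """Estimate tokens consumed by a session with these activity counts."""
    return (
        file_reads * COSTS["file_read_500_lines"]
        + bash_commands * COSTS["bash_output"]
        + turns * (COSTS["user_message"]
                   + COSTS["claude_response"]
                   + COSTS["thinking_per_turn"])
    )

used = estimate(file_reads=10, bash_commands=5, turns=6)
print(f"~{used:,} tokens, {used / 200_000:.0%} of a 200K window")
# → ~72,700 tokens, 36% of a 200K window
```

Swap in your own averages: if your files run long or extended thinking is heavy, the same session lands well past the halfway mark.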
What happens when it fills up
When the context window approaches capacity, Claude Code triggers auto-compaction. It is a summarization step, not a crash or an error. Claude takes the older parts of the conversation and compresses them into a summary, freeing space for new work.
The problem is that summaries lose detail. That function signature Claude read 20 minutes ago? After compaction, it might remember "read a function related to authentication" but not the exact parameter types. That build error from earlier? Summarized to "there was a type error that was fixed."
You have seen this yourself even if you did not know the cause. Claude works great for the first 15 minutes, then starts making small mistakes: using the wrong variable name, forgetting a pattern it followed earlier, suggesting an approach it already tried and rejected. Context degradation.
This is why /compact from the quick wins matters — and why /clear between tasks matters even more. You are not just "cleaning up." You are protecting the quality of Claude's reasoning.
The cascade effect
Context management matters because every decision compounds.
A bloated setup (too many plugins loaded, a CLAUDE.md that is too long) means less room for file reads. Fewer file reads means Claude gathers less context about your codebase. Less context means worse decisions. Worse decisions mean more verification failures. More failures mean more loop iterations. More iterations consume more context. The window fills faster. Compaction happens sooner. Information is lost. Quality drops further.
Why quick wins are not enough
In the previous lesson, you learned eight things you can do today to improve your results. Every one of them helps. But there is a difference between knowing the tips and understanding the system.
Knowing to use /compact is a quick win. Knowing when to compact — after research but before implementation, after a failed approach, at natural workflow breakpoints — that is methodology. Compacting at the wrong time can lose context you still need.
Knowing to use Plan Mode is a quick win. Knowing how to write specs that make Plan Mode reliable — with acceptance criteria, constraints, and the right level of detail — that is methodology. Plan Mode with a vague prompt produces a vague plan.
Knowing to give verification criteria is a quick win. Building a verification pipeline that runs automatically after every change — build, types, lint, tests, security checks — that is methodology. You stop remembering to verify and start building systems that verify for you.
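As a sketch of what "a pipeline that runs after every change" can look like, here is a minimal runner. The four commands are assumptions for a hypothetical Node/TypeScript project; substitute your own build, type-check, lint, and test commands (and a security scanner if you use one):

```python
import subprocess

# Hypothetical project commands -- replace with your own toolchain.
CHECKS = [
    ("build", ["npm", "run", "build"]),
    ("types", ["npx", "tsc", "--noEmit"]),
    ("lint",  ["npx", "eslint", "."]),
    ("tests", ["npm", "test"]),
]

def verify() -> bool:
    """Run each check in order; stop and report at the first failure."""
    for name, cmd in CHECKS:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            print(f"FAIL {name}\n{result.stdout}{result.stderr}")
            return False
        print(f"ok   {name}")
    return True
```

Wire a runner like this into whatever executes after each change — a git hook, a watch script, or Claude Code's own hooks (covered in Chapter 5) — so verification stops being something you remember and becomes something that happens.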
Chapters 2 through 7 each give you a different lever for managing this resource. Together they form a system.
The course map
Every chapter that follows teaches techniques that are, at their core, context management strategies:
- Chapter 2 — CLAUDE.md: Front-load the right information so Claude does not have to go searching for it. Your project's rules, boundaries, and patterns loaded once, used every session.
- Chapter 3 — The human in the loop: Give Claude structured intent so it gathers the right context and acts with precision. Specs, Plan Mode, course-correcting.
- Chapter 4 — Context budget management: Load specialized knowledge only when needed. Isolate expensive research tasks. Route to the right model for the job.
- Chapter 5 — The feedback loop: Capture friction as it happens, route corrections to the right place, and build hooks that enforce rules at zero context cost. Your setup gets smarter with every session.
- Chapter 6 — Scaling up: Run multiple Claude sessions in parallel with worktrees, script Claude for automated pipelines, use the writer/reviewer pattern for unbiased quality, and configure permissions for safe autonomy.
- Chapter 7 — Build your blueprint: Wire it all together into your own system. Not someone else's — yours.
All of these answer the same question: how do we get the most value out of our context budget?
The system starts here
Chapter 1 gave you the map. You understand the machine (Lesson 1), you have the quick wins (Lesson 2), and you understand the resource that every technique in this course is designed to protect (this lesson).
What you do not yet have is the system.
The developers who finish this course do not write better prompts. They build better systems. They have CLAUDE.md files that front-load the right context. Skills that load domain knowledge on demand. Hooks that enforce rules without consuming a single token. Subagents that research in isolation. Verification pipelines that catch mistakes automatically.
And the system starts with one file: CLAUDE.md.
Chapter 2 begins where every session begins: with your codebase.