Back to Blog
Agentic Coding
Claude Code
Developer Tools
Context Window

Skills vs MCPs: Stop Burning Your Context Window

JD
Jean Desauw
5 min read
Skills vs MCPs: Stop Burning Your Context Window

If you've been adding MCPs to your Claude Code setup the way most people add npm packages (one for GitHub, one for Figma, one for your database, one because the README looked good), you're probably wondering why your sessions feel heavier than they should.

The problem isn't the tools. It's that the skills vs mcp claude code decision is one most people never consciously make. MCPs are visible, well-documented, and easy to install. Skills stay hidden unless someone shows you they exist. So people default to MCPs and end up paying a context cost they don't know they're paying.

These two mechanisms solve different problems. Conflating them degrades every session quietly.

What MCPs Actually Cost You

When Claude Code starts a session, it loads the full schema of every MCP server you've configured. Every tool description, every parameter, every annotation. Before you type a single word.

I measured this on a setup with 5 MCP servers and 58 tools. Total token consumption at session start: roughly 55,000 tokens. On a 200K context limit, that's nearly 28% of your working space spent before the conversation begins. On older limits, it was worse.

The cost scales with server count and with how verbose the tool descriptions are. One researcher documented a single MCP server consuming 14,214 tokens, then trimmed it to 5,663 by consolidating similar tools and shortening descriptions. That's a 60% reduction just from cleanup.

Claude Code's Tool Search feature helps by deferring tool schema loading until a tool is actually called, reducing the upfront overhead by roughly 47%. But not all MCPs benefit equally, and the fixed initialization cost of the server connection itself doesn't go away.

The point isn't that MCPs are a bad idea. They're the right choice in specific situations. The point is that every server you add has a measurable cost, and most developers don't know what that cost is until the session starts behaving strangely.

What Skills Are and Why They're Different

A Claude Code skill is a Markdown file. That's the whole thing.

You create a .claude/skills/ directory in your project. Each file in that directory is a skill: a set of instructions, context, or procedures that Claude can read when invoked. Skills load as text when called. If a session never needs the code-review skill, those tokens never touch the context window.

This is the core distinction. Skills load on demand. MCPs front-load at session start.

A skill encodes knowledge. It tells Claude how to do something: how to write commit messages in your team's format, how to walk through the code-review checklist for this specific codebase, how to structure a database migration safely. Skills are procedures, conventions, and domain expertise stored as readable text.

An MCP provides access. A live GitHub repository. A running database. A Figma file. The current state of an external system that changes over time and can't be captured in a static file.

If the information is static or changes slowly, a skill is almost always more efficient. If the information is live, external, or requires action on a remote system, an MCP earns its cost.

I go deeper on context budget management and .claude/ directory structure in the agentic coding course.

The Skills vs MCP Decision Framework

Before adding anything to my Claude Code setup, I ask one question: does this need live access, or does it need knowledge?

Use an MCP when:

  • You need to read or write to a live external system: GitHub issues, Linear tickets, your staging database
  • The data changes frequently and a stale snapshot would cause errors
  • You need to trigger an action on a remote system, not just provide context

Use a skill when:

  • You want Claude to follow a specific procedure consistently: code reviews, spec writing, migration steps
  • The context is project-specific and changes slowly
  • You want to encode conventions, style guides, or decision trees that apply to this repo
  • The task is something you do repeatedly and want Claude to handle the same way every time

There's a third category worth distinguishing: rules that belong in CLAUDE.md rather than either. Short, always-relevant instructions go in CLAUDE.md. It loads every session, costs a few hundred tokens, and shapes baseline behavior for everything. CLAUDE.md is for the rules. Skills are for the procedures. MCPs are for the connections.

What I Actually Run in Production

On a production React Native codebase, my current setup is deliberately minimal.

MCPs:

  • GitHub (to read issues and PRs during planning phases)
  • Supabase (for live schema inspection during migrations)

That's it. Two servers with specific, irreplaceable use cases.

Everything else is a skill:

  • code-review.md: the checklist I use for every PR on this project
  • commit-message.md: commit format, scope conventions, when to split into multiple commits
  • migration.md: step-by-step for writing and testing Supabase migrations safely
  • blog-pipeline.md: the full pipeline for automated content workflows

No MCP for Figma (I read the design context once and paste it into the session). No MCP for Slack (if something needs communicating, I do it). No MCP for documentation sites (a skill that tells Claude to search official docs is more flexible than a hardwired library-specific MCP).

The result: session initialization is fast and cheap. Skills that don't apply to a given task never load. And when a procedure needs updating, I edit a Markdown file, not a server configuration.

The Part Most Tutorials Skip

Most Claude Code content starts with MCPs because MCPs feel like capability. They connect your AI to the world. Installing one looks like progress.

Skills are less obviously impressive. They're Markdown files in a hidden directory. But they scale in a way MCPs don't. They encode institutional knowledge, team conventions, and repeatable workflows in a form Claude can use without consuming context at startup.

The skills vs mcp claude code trade-off isn't about capability vs. efficiency. It's about understanding that these two mechanisms are solving different problems. Treating an MCP as the default for everything is like running every import at the top of the file regardless of whether it's needed. Technically fine. But wasteful by design.

Use MCPs for live access. Use skills for knowledge. Use CLAUDE.md for the rules that apply everywhere. That division of responsibility is what keeps sessions fast, focused, and predictable over time.

I cover the full .claude/ architecture in the agentic coding course, from skills to hooks to the blueprint that keeps sessions clean.

First chapter free

Learn the agentic coding workflow I use in production

How I set up my repos, manage context, and run agents in production. Written down so you can do the same.