Claude Code Hooks: CI for Your Agent

Nobody writes "please run the tests before you deploy" in a README and hopes the team complies. You wire it into CI, where it runs every time and doesn't care how anyone feels about it.
Your CLAUDE.md is the README. It is full of rules you are hoping the agent follows, and mostly it does, until a long session fills the context window and compaction quietly drops "always run prettier" to make room. Hooks are the CI. They are commands the harness runs at fixed points in Claude Code's lifecycle, every time, with zero model involvement. It is the principle you already trust, aimed at your agent instead of your teammates.
What a hook actually is
A hook fires at a specific point in the session: before a tool runs, after it completes, when the agent is about to stop. Claude Code passes the event to your handler as JSON, the handler does its work, and an exit code or a small JSON response tells Claude Code whether to proceed. The model does not get a vote.
You configure hooks in settings, not in your prompt. Put them in .claude/settings.json to share with your team, .claude/settings.local.json for personal ones, or ~/.claude/settings.json to apply everywhere. Here is the smallest useful one, a formatter that runs after every edit:
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write",
        "hooks": [
          {
            "type": "command",
            "command": "jq -r '.tool_input.file_path' | xargs -I{} npx prettier --write {}"
          }
        ]
      }
    ]
  }
}
The matcher is a regex on the tool name, so Edit|Write fires on both and nothing else. The command reads the event JSON from stdin, pulls the file path out with jq, and runs Prettier on it. There is no $CLAUDE_FILE_PATH magic variable to remember. The data arrives on stdin and you pick out what you need.
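Once the one-liner needs a condition, it usually moves into a script. As a minimal sketch, assuming you only want Prettier to touch a handful of file types, a helper could gate the formatter. The name should_format and the extension list are illustrative assumptions, not part of Claude Code:

```shell
#!/bin/bash
# Hypothetical helper, not part of Claude Code: decide whether a path is
# worth handing to Prettier. The extension list is an assumption; adjust it
# to whatever your project actually formats.
should_format() {
  case "$1" in
    *.js|*.jsx|*.ts|*.tsx|*.json|*.css|*.md) return 0 ;;
    *) return 1 ;;
  esac
}

# Inside the hook script you would read the path from stdin as before:
#   path=$(jq -r '.tool_input.file_path')
#   should_format "$path" && npx prettier --write "$path"
```

Keeping the check in a function means you can try it against a few paths by hand before trusting it inside a hook.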
Most hooks are shell commands like this. Three other handler types take over once you outgrow shell: prompt sends a one-shot evaluation to a fast model, agent spawns a subagent with tool access, and http posts the event to a service. Start with command. Type /hooks in any session to see what is currently wired up.
The three worth wiring up
After months of iterating, three hooks earn their place in every project I work in. Each one covers a moment where an agent reliably drifts.
Format on edit. The one above. It is the best return on effort of any hook I run: five minutes of setup, and unformatted code never reaches a diff again. It runs while the agent works, so the agent sees clean files as it goes instead of a formatter rewriting everything at commit time.
Block dangerous commands. A PreToolUse hook on Bash, pointed at a script that inspects the command before it runs:
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "$CLAUDE_PROJECT_DIR/.claude/hooks/block-dangerous.sh"
          }
        ]
      }
    ]
  }
}
The matcher only narrows to Bash. The actual decision happens in the script, because a matcher cannot see command arguments:
#!/bin/bash
# Read the event JSON from stdin and pull out the Bash command about to run.
command=$(jq -r '.tool_input.command')
# Block on any destructive pattern; -i also catches DROP TABLE in uppercase.
if grep -qiE 'rm -rf|git push --force|drop table' <<< "$command"; then
  echo "Blocked: matched a destructive pattern" >&2
  exit 2
fi
exit 0
Exit 0 lets the command through. Exit 2 blocks it and feeds your stderr message back to Claude as the reason, so it understands what happened instead of hitting a wall. If you want finer control, like rewriting the command or asking the user to confirm, return a small JSON object on stdout instead of exiting 2. The exit code is the simple path, and it is the one I reach for first.
Verify before stopping. A Stop hook fires when Claude decides it is done, which is exactly when an agent is most likely to be wrong. Instead of a shell script, point this one at a prompt handler and let a fast model check the work:
{
  "hooks": {
    "Stop": [
      {
        "hooks": [
          {
            "type": "prompt",
            "prompt": "Evaluate whether Claude should stop. Context: $ARGUMENTS\n\nRespond {\"ok\": true} if the tests were run and every requested task is complete, or {\"ok\": false, \"reason\": \"...\"} to keep going.",
            "timeout": 30
          }
        ]
      }
    ]
  }
}
$ARGUMENTS injects the event data, including the path to the transcript, so the model can see what actually happened. If it returns {"ok": false}, the agent does not stop. It gets the reason as its next instruction and keeps working. No more "I've completed the changes" sitting on top of a test suite that was never run. This hook changed how I work more than any line I have ever put in CLAUDE.md.
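If you want a deterministic check alongside the prompt handler, a command handler can answer the narrower question of whether a test command shows up in the session at all. The sketch below leans on several assumptions: the helper name tests_were_run is hypothetical, the pattern (npm test, pytest, go test) is a guess at what your project uses, the transcript is searched as plain text, and the event field holding its path is assumed to be called transcript_path.

```shell
#!/bin/bash
# Hypothetical helper: succeed when the transcript file mentions a test command.
# The pattern is an assumption; replace it with your project's test runner.
tests_were_run() {
  grep -qE 'npm (run )?test|pytest|go test' "$1"
}

# In a Stop hook script, assuming the event JSON carries the transcript path
# in a field named transcript_path, and that exit 2 keeps the agent working:
#   transcript=$(jq -r '.transcript_path')
#   tests_were_run "$transcript" || { echo "Run the tests before stopping" >&2; exit 2; }
```

This only shows that a test command was mentioned, not that the tests passed, so it complements the prompt handler and does not replace it. A check like this also needs an escape hatch, or an agent that cannot run the tests will never be allowed to stop.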
What belongs in a hook, and what doesn't
Hooks are addictive. Once one works, you want to move everything into them. Most of your CLAUDE.md should stay exactly where it is.
The split is about judgment. CLAUDE.md is for rules the model should weigh and apply in context: prefer composition over inheritance, use Server Components by default, reach for @jd/ui before hand-rolling a component. Those call for reasoning, and reasoning is what the model is for. Hooks are for rules with no judgment in them at all: format after editing, never force-push, run the type-checker before a commit lands. There is nothing to think about, so do not make the model think about it.
Two tests help. The first: if breaking the rule once would break production or open a security hole, it is a hook; if breaking it once would just start a code-review conversation, it is CLAUDE.md. The second is faster. If you catch yourself typing ALWAYS or NEVER in capital letters into your CLAUDE.md, the rule has probably outgrown the prompt and wants to be a hook. (Loading knowledge on demand is a third, separate job, and that is what skills are for.)
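The second test is easy to run mechanically. As a small sketch, assuming your rules live in a file you pass in (the helper name hook_candidates is made up for illustration), this lists every line that shouts ALWAYS or NEVER so you can decide which ones deserve a hook:

```shell
#!/bin/bash
# Hypothetical helper: list lines that contain ALWAYS or NEVER in capitals,
# prefixed with their line numbers.
# -w matches whole words only, so "NEVERTHELESS" is not reported.
hook_candidates() {
  grep -nwE 'ALWAYS|NEVER' "$1"
}
```

Run it as hook_candidates CLAUDE.md. The match is case-sensitive on purpose: a lowercase "always" in ordinary prose is not the signal, the capitals are.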
This boundary, between what you prompt and what you enforce, is the spine of The Feedback Loop module in the agentic coding course.
The trap on the other side
Here is what nobody mentions when they show off their hook setup: every hook runs a handler, and the overhead compounds. Pre-check every tool call, post-validate every write, run three evaluations on every stop, and a session that took two minutes takes ten. I went through that phase. The setup above is what I stripped back to.
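The arithmetic behind that slowdown is simple enough to sketch. The numbers below are illustrative inputs, not measurements, and the function name hook_overhead_seconds is made up for this example:

```shell
#!/bin/bash
# Back-of-the-envelope estimate of the wall-clock time hooks add to a session.
# Arguments: tool calls, hooks fired per call, average milliseconds per hook.
hook_overhead_seconds() {
  echo $(( $1 * $2 * $3 / 1000 ))
}

# Illustrative inputs, not measurements: 160 tool calls, 2 hooks each,
# 1500 ms per hook.
hook_overhead_seconds 160 2 1500
```

With those made-up inputs the estimate is 480 seconds, which is the eight extra minutes that turn a two-minute session into a ten-minute one. Whatever your real numbers are, the product grows with every hook you add per call.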
The deterministic guarantee cuts both ways. A hook does exactly what you told it, every single time, including the times you were wrong. While I was assembling the notes for this article, one of my own hooks blocked the command I was running, because the text it was writing contained git push --force as an example of what to catch. The script's grep checks for that string anywhere in a Bash command, and it cannot tell a real force-push from prose describing one. The hook was not broken. It was doing its job with no judgment, which is the entire point and, every so often, the entire problem.
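One way to shave down that particular false positive is to match only where a command starts. This is a hedged sketch, not a fix: the helper name is_destructive is hypothetical, it covers only the rm -rf and git push --force patterns (a drop table usually sits inside a quoted SQL string, where command position means nothing), and it is a heuristic, not a shell parser.

```shell
#!/bin/bash
# Hypothetical helper: succeed (exit status 0) only when a destructive command
# starts a command segment, instead of appearing anywhere in the text.
# This is a heuristic, not a shell parser.
is_destructive() {
  # Break the command line into segments at ; | and & (so && and || too),
  # then test the beginning of each segment.
  printf '%s\n' "$1" \
    | tr ';|&' '\n\n\n' \
    | grep -qE '^[[:space:]]*(rm[[:space:]]+-rf|git[[:space:]]+push[[:space:]].*--force)'
}
```

An echo whose text mentions git push --force now passes, while a real force-push or a chained rm -rf is still caught. It has blind spots of its own: a separator inside quotes or a heredoc line that begins with one of the patterns will still trip it. The check has no more judgment than before; it is only wrong in fewer places.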
So add a hook only when you have earned it: you have corrected the same mistake twice, tried to fix it in CLAUDE.md, and it still came back. That is the signal. Not the first time something goes wrong, and not "this would be nice to enforce."
One placement note, because it bites people. A hook that guards a project-level rule, like formatting or your dangerous-command list, goes in .claude/settings.json, committed, so every teammate's agent runs the same guardrails. Keep .claude/settings.local.json for personal things: your notification sound, a local tool path.
Stop hoping, start wiring
You already made this move once. You stopped trusting that everyone would remember to run the tests, and you wired them into CI. Hooks are the same decision one layer down: stop trusting that the agent will remember the rules that do not survive a long session, and wire the few that genuinely cannot be skipped into the lifecycle itself.
Format on edit. Block the dangerous stuff. Verify before stopping. Three hooks, about ten minutes, and the rules CLAUDE.md keeps quietly dropping become things your agent cannot ignore.
Hooks are one layer of the feedback loop that makes an agentic setup sharpen itself over time. The full system is in the agentic coding course.