Claude Code, Part 6: From Guidance to Guarantees

Part 5 taught Claude how you want it to behave. This part is what Claude cannot bypass.

CLAUDE.md, rules, skills, subagents: all advisory. Claude reads them and usually follows them, but usually is not always. On a good day, that's fine. On a bad day, it writes to .env.local, force-pushes to main, or commits a bug fix without a test.

This part is about closing that gap: hooks that run as scripts outside Claude's decision loop, patterns that turn "should do" into "can't skip", and a permissions baseline you can drop into a B2B project on day one.

Hooks: From Auto-Format to Safety Gate

Hooks fire on lifecycle events. They run as one of five handler types (listed below). If a PreToolUse hook returns a deny decision, the tool call is blocked before it runs. That's what makes them deterministic.

Types and Events

Five hook types, per the hooks docs at time of writing (verify the doc for additions):

command: shell script, most common.
http: POST to an external endpoint.
mcp_tool: calls a tool on an MCP server.
prompt: model-based yes/no decision.
agent: subagent-based validation.

Events worth knowing about: PreToolUse, PostToolUse, SessionStart, UserPromptSubmit, Stop, PreCompact, PostCompact. The full catalogue in the hooks doc covers a few dozen events and is still growing, so check the doc for the current list, including the task and compaction lifecycle additions. Pick the earliest event that catches the condition you care about.

The Framing That Matters

Most guides show hooks as prettier --write after a file save. That's the easy case and the least interesting one. The high-leverage case is safety gates: non-negotiable rules that must execute regardless of what Claude decides.

A worked example: the classic day-one gate, block writes to .env files, secrets/, and .github/workflows/. Two pieces wire it up. First, an entry in .claude/settings.json registers a PreToolUse hook against the Write|Edit matcher and points it at a script in .claude/hooks/. Second, the script reads the tool input from stdin, extracts tool_input.file_path, checks it against the dangerous patterns, and emits a JSON deny decision on stdout when there's a match. The canonical shape uses hookSpecificOutput with hookEventName, permissionDecision: "deny", and a permissionDecisionReason you'll see in the session. The exact schema and a drop-in bash template live in the hooks docs; copy from there rather than from memory, the field names are easy to mistype.

That's the full shape of every safety hook: parse the tool input, match a dangerous pattern, emit a deny JSON.

Hook script, or deny rule? For a static block like this you usually don't need a script at all: a permissions.deny entry (Write(.env*), Edit(.env*), shown in the baseline below) blocks it declaratively, with no jq and nothing to maintain. Reach for a hook script only when the rule needs logic a deny-list can't express: inspecting the filesystem, parsing a commit message, or reacting to a lifecycle event. The worked example above is worth understanding as the hook mechanism, but the env block itself belongs in a deny rule. The gates below split on exactly that line.

Three Safety Gates Worth Installing

Three more risks worth covering, split across the two mechanisms above (deny rule vs hook script):

Destructive shell commands. Better expressed as permissions.deny entries than a script. In a battle-tested baseline these are refspec-aware: block pushes to main in every form (Bash(git push origin main*), Bash(git push origin +main*), Bash(git push --force origin main*)), plus Bash(rm -rf /*), Bash(rm -rf ~*), Bash(rm -rf $HOME*). A literal match still misses creative variants (rm -fr, find . -delete, an aliased push), so treat the deny list as friction on the obvious mistake and let a server-side protected branch be the real force-push control. A hook script only earns its place here if you need logic the patterns can't capture.
Commits without a proof. Matcher: Bash. If the command contains git commit with fix( or bug( in the message, check that .claude/proofs/*.test.* exists. Deny if not. This nudges the Prove-It pattern below into habit, but keyed on the message prefix and the mere presence of a proof file, it's a convention reminder, not an airtight gate: a differently-worded message slips past, and it never checks the proof matches this bug. For real enforcement, gate it in CI against the changed code.
Auto-format on write. PostToolUse on Write|Edit, matcher pipes the file_path to prettier --write. The easy case, worth including because it pairs naturally with the safety gates.

Notice the split. The first two (the env block and destructive commands) are static patterns, so they belong in permissions.deny with no script at all. The other two need a hook: one for logic a deny-list can't express (the commit-proof check), one for a lifecycle event (auto-format on Write). The wiring is the same: match on the right event and tool, parse the tool input with jq, then act, emitting a deny JSON for the commit-proof gate or just running the formatter. Make the scripts executable (chmod +x .claude/hooks/*.sh) and validate before committing (jq empty < .claude/settings.json && bash -n .claude/hooks/*.sh). Requires jq in your PATH.

Lifecycle Hooks: Survive Compaction and Catch Drift

Safety gates are the defensive use of hooks. The other high-leverage use is lifecycle automation, and the most useful instance is one almost no guide mentions: carrying context across a compaction or a new session.

Long sessions get compacted, and a fresh session starts blank. Two hooks close that gap:

A PreCompact hook writes a snapshot to a gitignored file before Claude compacts: current branch and HEAD, the active task or plan, open PRs (gh pr list), and git status --short. A few lines of shell piping git/gh output into .claude/snapshots/.
A SessionStart hook reads that snapshot back and prints it to stdout, which Claude ingests as context, but only if it's fresh (say, written in the last 24h) so a stale snapshot can't mislead the next session.

The effect: a compacted or next-day session resumes already knowing what you were doing, with no re-explaining. It's deterministic memory the model can't forget to write.

One more lifecycle hook worth wiring: Stop, which fires when Claude finishes responding. Point it at a fast, warn-only consistency check (broken doc cross-references, a quick lint, an orphaned-file scan). It surfaces drift the moment a turn ends, when it's cheapest to fix, without blocking the turn.

The wiring in .claude/settings.json:

{
  "hooks": {
    "PreCompact": [
      { "matcher": "", "hooks": [{ "type": "command", "command": "$CLAUDE_PROJECT_DIR/.claude/hooks/pre-compact.sh" }] }
    ],
    "SessionStart": [
      { "matcher": "", "hooks": [{ "type": "command", "command": "$CLAUDE_PROJECT_DIR/.claude/hooks/session-start.sh" }] }
    ]
  }
}

And the two scripts it points at:

#!/usr/bin/env bash
# .claude/hooks/pre-compact.sh: snapshot context before Claude compacts.
set -uo pipefail
ROOT="${CLAUDE_PROJECT_DIR:-$(pwd)}"
mkdir -p "$ROOT/.claude/snapshots"
{
  echo "# Compact snapshot"
  echo "Branch: $(git -C "$ROOT" rev-parse --abbrev-ref HEAD 2>/dev/null || echo unknown)"
  echo; echo "## Open PRs"
  command -v gh >/dev/null 2>&1 && gh pr list --state open --limit 10 2>/dev/null || echo "(gh unavailable)"
  echo; echo "## Working tree"
  git -C "$ROOT" status --short 2>/dev/null | head -30
} > "$ROOT/.claude/snapshots/compact-snapshot.md"
exit 0

#!/usr/bin/env bash
# .claude/hooks/session-start.sh: re-emit the snapshot (only if fresh) so a new session resumes with context.
set -uo pipefail
ROOT="${CLAUDE_PROJECT_DIR:-$(pwd)}"
SNAP="$ROOT/.claude/snapshots/compact-snapshot.md"
[ -f "$SNAP" ] || exit 0
[ -n "$(find "$SNAP" -mtime -1 2>/dev/null)" ] || exit 0   # last 24h only
echo "--- prior-session snapshot ---"
cat "$SNAP"
echo "--- end snapshot ---"
exit 0

Make them executable (chmod +x .claude/hooks/*.sh) and gitignore .claude/snapshots/.

The Prove-It Pattern

This is a discipline I recommend, not a built-in Claude Code feature. The hooks that enforce it are real primitives (documented in the hooks docs); the pattern and the .claude/proofs/ convention are mine.

A small discipline that pays off disproportionately for AI-assisted debugging, and almost no one bakes it in. The rule: a failing test must exist before the fix, at a predictable path, and it must pass after.

Layout:

.claude/
  proofs/
    auth-null-pointer.test.ts
    stale-tenant-filter.test.ts
    webhook-retry-dedupe.test.ts

Workflow:

The user reports a bug: "login 500s when the session is null."
You or the prove-the-fix skill write a test at .claude/proofs/auth-null-pointer.test.ts that reproduces it. Confirm it fails.
Fix the code. Keep the proof test.
Re-run. Passes. Commit with fix(auth): null-pointer on missing session.

Why it works: the test is evidence you fixed the actual reported bug, not an adjacent one. It stays as a regression forever after. The next refactor can't silently reintroduce the bug.

The hook turns it from a habit you remember into the default path. The "commits without a proof" gate above rejects any fix(…) or bug(…) commit unless .claude/proofs/*.test.* exists. Like any message-keyed local hook, though, it's a strong nudge rather than an unbypassable wall: reword the message, or commit from CI, and it won't catch you. Treat the local hook as the day-to-day reminder and CI as the real gate. Teams usually move these tests into the main suite after a week; the discipline is that they existed before the fix.

Escalation Format and Tiered Classification

The format below and the tier vocabularies that follow are conventions I use to make Claude's stop-and-ask behavior predictable. They are not Claude Code features; no runtime parses ⛔ GATE or a tier name. They work because the skill instructs Claude to produce them, and you train yourself to read them.

⛔ GATE in a skill without a concrete output template is aspirational. Give Claude a format for when it must stop and ask.

The Escalation Template

Embed this in skills that might need to escalate. Claude learns to produce the format consistently, and you get a predictable decision point instead of vague hand-waving.

SCOPE CHANGE DETECTED

Context: You asked me to fix the login null-pointer crash. While
investigating I found session-store.ts uses `any` for the user
object, and the crash reproduces in signup and password-reset too.

Reason: Fixing this narrowly hides a shared bug. Fixing it broadly
touches 4 files outside the original scope.

Options:
  A) Fix narrowly. Patch login only. ~5 min.
     Risk: signup and reset remain broken.
  B) Fix the root cause. Type the session store correctly.
     ~30 min. Touches 4 files outside the original scope.
  C) Document and defer. Log an issue with the repro, keep the
     login fix scoped, open a follow-up ticket.

Waiting for your decision. Reply A, B, or C.

Four parts: context, reason, labeled options with tradeoffs, the waiting-line. Without the labels, you get a debate. With them, you get a decision.

Tiered Classification

Binary pass/fail forces the model to pick one, even when the right answer is "this is risky, here's the tier, here's what's needed to approve." Tiers give the model and the human a shared vocabulary for calibrated decisions.

Schema changes as the worked example (because B2B and Drizzle):

Tier	Example	Requires
ADDITIVE	New nullable column, new index	Standard review
NON-BREAKING	Rename with backward-compat alias	Review + migration test
BREAKING	Drop referenced column, non-null add	Spec + `/code-review ultra` + sign-off
DESTRUCTIVE	`DROP TABLE`, `TRUNCATE`	Verified backup + explicit human approval

The same shape applies elsewhere. Security findings: NEVER / ASK FIRST / ALWAYS. Review findings: BLOCKER / WARN / NOTE. Pick the domains where binary answers hide risk and give them a tier vocabulary.

`.claude/` Anatomy and a Permissions Baseline

Team-shared layout (committed to git):

.claude/
├── settings.json        # Permissions, hooks
├── settings.local.json  # Your overrides (gitignored)
├── rules/               # Path-scoped instructions
├── skills/              # Task procedures
├── agents/              # Subagent definitions
├── hooks/               # Hook scripts
└── proofs/              # (convention) Bug-fix test artifacts

Personal config under ~/.claude/ mirrors the same structure across all your projects. Auto-memory lives at ~/.claude/projects/<project>/memory/ and is machine-local.

Two starting points exist. auto permission mode (a research preview at time of writing) runs a classifier model over each tool call and only prompts when something looks risky: scope escalation, unknown infrastructure, hostile-content-driven actions. Launch with claude --permission-mode auto, or set permissions.defaultMode: "auto" in your user settings (~/.claude/settings.json; it's ignored from project/local settings). Eligibility has a few gates: it needs a recent model and the direct Anthropic API (not Bedrock/Vertex/Foundry), and on Team/Enterprise an admin must enable it first. The exact requirements move with each release, so check the permission-modes docs for the current list. The alternative, explicit allow/deny lists, is what I describe below, and it's the right choice when you want hard guarantees on specific commands rather than a classifier's judgment. You can use both: auto mode handles the long tail, your deny list handles the non-negotiables.

A permissions baseline lives in .claude/settings.json as two short lists under a permissions key. The allow list covers safe reads (Read, Grep, Glob) and your common dev commands (test, lint, git status, git diff) so they run without a prompt. The deny list raises a hard stop on things Claude should never do: rm -rf, pushes to main in every form (the plain push, the +main refspec, and --force), fetches via curl/wget, reads of .env* and secrets/, and writes to .env*, .github/workflows/, your data directory, and database files (**/*.db, **/*.sqlite*). Two honest caveats, since deny rules match the command string: a name-based block on curl/wget is a guardrail against the obvious/accidental case, not egress control: a model can still reach the network through an alternate binary, a script it writes, or another language, so real egress control needs OS/network sandboxing (see the sandboxing docs or a network-restricted container). Likewise the rm -rf and git push --force denies catch the common form but not every variant; pair force-push protection with a server-side protected branch and treat these rules as friction, not a wall. Settings also supports an ask array for rules that should always prompt before running, and it's the right home for irreversible-but-sometimes-wanted actions: a battle-tested entry is ask: ["Bash(gh pr merge*)"], so a merge always pauses for a human even after the surrounding git commands are allowed. For everything else, default mode already prompts on first use of anything not matched by allow or deny.

A minimal working baseline to copy into .claude/settings.json (every line is a valid rule):

{
  "$schema": "https://json.schemastore.org/claude-code-settings.json",
  "permissions": {
    "allow": [
      "Bash(pnpm test *)",
      "Bash(pnpm lint *)",
      "Bash(git status)",
      "Bash(git diff *)"
    ],
    "deny": [
      "Write(.env)", "Write(./.env)", "Edit(.env)", "Edit(./.env)",
      "Write(.env.local)", "Write(./.env.local)", "Edit(.env.local)", "Edit(./.env.local)",
      "Write(.env.production*)", "Edit(.env.production*)",
      "Write(secrets/**)", "Edit(secrets/**)",
      "Write(.github/workflows/**)", "Edit(.github/workflows/**)",
      "Write(data/**)", "Edit(data/**)",
      "Write(**/*.db)", "Write(**/*.sqlite)",
      "Bash(rm -rf /*)", "Bash(rm -rf ~*)", "Bash(rm -rf $HOME*)",
      "Bash(git push origin main*)",
      "Bash(git push origin +main*)",
      "Bash(git push --force origin main*)"
    ],
    "ask": [
      "Bash(gh pr merge*)"
    ]
  }
}

The ./-prefixed duplicates are intentional: Claude may reference a path either way, so deny both forms. Enumerate your own .env.* variants (.env.staging*, .env.test*, …) the same way, and swap the allow commands for your stack (poetry, go, bundle).

Manage interactively with /permissions. The rule syntax (parenthesized patterns like Bash(pnpm test *) and gitignore-style globs) is documented in the permissions docs. Read it carefully and validate your patterns empirically, because they over-match easily (a careless Bash(pnpm test*) can also match pnpm testify). Never lean on an allow pattern to constrain argument content; it's a convenience, not a safety boundary. Stack swaps: pnpm becomes poetry (Python), go (Go), bundle (Rails). The deny list is stack-independent; keep it as-is.

Plugin layer. When your skills, hooks, and agents stabilize, bundle them as a plugin. Plugin skills are namespaced (plugin-name:skill-name); project and user skills are not and can collide. When skill names clash across scopes, precedence is enterprise > personal > project (subagents resolve in a different order). Agent Teams is a separate coordination feature; see the agent-teams docs and check current maturity before relying on it in production.

Verification: Smoke Tests After Setup

Run these four once after setup and on any settings change. Each has an expected behavior and a failure signal.

#	Test	Expected	Failure signal
1	Fix a typo in README.md	Claude edits without invoking `spec-before-build`.	Planning skill triggers on a typo.
2	Ask Claude to write `.env.local`	Denied by the `.env*` deny rule before the write runs.	File is written.
3	Stage `git commit -m "fix(auth): X"` with no `.claude/proofs/.test.`	Hook denies: requires a test in `.claude/proofs/`.	Commit succeeds.
4	`git push --force origin main`	Deny rule blocks the push to `main`.	Push succeeds.

Tests 1 and 3 assume you've installed skills from Part 5 (spec-before-build and prove-the-fix); skip or substitute if you haven't yet. Note that Test 4 exercises only the literal git push --force form; variants like --force-with-lease and the +main refspec are exactly why force-push protection ultimately belongs on a server-side protected branch, not only in a local hook.

If any test fails, check troubleshooting below.

Troubleshooting

Symptom	Likely cause	Fix
Hook doesn't fire	`jq` missing, script not executable, invalid JSON	`which jq && chmod +x .claude/hooks/*.sh && jq empty < .claude/settings.json`
CLAUDE.md seems ignored	Over ~200 lines, conflicting rules, or excluded	Run `/memory` to list loaded files. Split via `@` imports or `.claude/rules/`.
Rule file not loading when expected	Path glob mismatch, file in excluded scope, lazy load	Wire an `InstructionsLoaded` hook to log each file as it loads: it fires at session start and on lazy-load, telling you exactly which rule resolved and why.
"Permission denied" on a legitimate command	Not in `allow`, not yet auto-approved	Add to `.claude/settings.local.json` (personal) or `settings.json` (team).
Subagent returns empty or generic output	`skills:` frontmatter missing, or body too vague	Add required skill to frontmatter; rewrite body with a concrete single-sentence task.
GitHub Action "resource not accessible"	Repo permissions missing	Re-run `/install-github-app`; grant PR read/write + issue write.

Useful diagnostics: /doctor (install health), /context (what's loaded now), /memory (which instruction files), /hooks (active hook config).

Rollback and Deferred Fixes

Rules will sometimes block legitimate work. The wrong response is a silent workaround. The right response is explicit deferral.

In code, use a SKIP marker with a ticket ID:

// SKIP(ENG-1234): tenant filter audit relaxed for incident response.
// Re-enable after Q2 refactor.

In PR descriptions, add a Deferred fixes section listing what you bypassed and why. This keeps the audit trail intact. Silent workarounds create the kind of drift that surfaces six months later as a SOC2 finding.

Start Here

First hour. Run /doctor to confirm your install. Write the permissions baseline described above into .claude/settings.json (follow the syntax in the permissions docs) and commit it. Two files, five minutes.

First week. Adopt the Prove-It pattern for the next real bug. Write a failing test at .claude/proofs/<slug>.test.ts before touching code. Keep it after the fix lands.

After that. Read the skills docs top to bottom, then the hooks docs for the full hook catalogue. Community collections exist too (search GitHub for "awesome claude code"), but treat any third-party skill or hook as untrusted executable code: it runs in your environment with your credentials, so read every line before adopting anything, pin to a reviewed commit, and never install an unaudited hook script.

This lesson is from the Claude Code Field Guide, a free companion to Claude and the Code. The book teaches the thinking behind the commands.

You've completed the Claude Code Field Guide.

6 lessons. One system. Now go deeper.

The book teaches these concepts through a workplace story. 9 chapters. One team's journey from skepticism to mastery.

Get the Book — $17

← Back to Home