Claude Code · Global Setup · Action Plan

Automatic Compaction & Nested-Agent Context

Synthesized from 6 research agents · for Sarudi Mátyás · 2026-06-24 · decisions locked: safety-net bridge + orchestration hardening + Sonnet/return-size

HARNESS  the CLI enforces it — most reliable HOOK  a script the harness runs — deterministic INSTRUCTION  CLAUDE.md text — model-compliance only
The core idea — stop trying to time compaction perfectly; make it harmless whenever it fires. Your current 25%/40% rule asks the model to self-trigger, but the model cannot measure its own token % — so it's a soft suggestion. Rather than chase a perfect trigger (the one thing the harness won't let you control), we make every compaction lossless: a hook writes the session's irreplaceable state to disk before any compaction, and a second hook reads it back after. The timing stops mattering.

How Claude Code compaction actually works ground truth

Verified against current Anthropic docs + community repos. This is what's real today — and what is not controllable.

MechanismWhat it isReality / limitTier
autoCompactEnabledOn/off switch for harness auto-compaction (default on).No documented % threshold knob. The harness decides when, near the limit. You can only disable it.HARNESS
/compact [focus]Summarizes history in-place, same session; optional focus string.Fires PreCompact → compacts → SessionStart(compact). Re-injects CLAUDE.md + memory; drops un-invoked skill listings.HARNESS
PreCompact hookcommand type can run a script & block; prompt type nudges the summary.Cannot inject text into the summary, and cannot tell manual vs auto apart. Can write state to disk.HOOK
SessionStart matcher:"compact"Fires right after a compaction completes.The re-entry point — wire restoration here. Reactive, not preventive.HOOK
# Compact instructions in CLAUDE.mdNative section telling the harness what to preserve when it summarizes.Documented + survives compaction, but model-controlled — high in practice, not guaranteed.INSTRUCTION
Sub-agents (Agent tool)Each runs in a separate context window; only the return message crosses back.Agent tool blocked inside agents by default. Hard ceiling 5 nesting levels; verbose returns can hard-lock the parent.HARNESS

The architecture — a safety-net bridge

Two small hooks turn the conversation (ephemeral) into a file (durable). The filesystem is unlimited, free context.

① Before compaction — capture
PreCompact fires read transcript write state.json
A command hook dumps the irreplaceable bits to ~/.claude/session-env/<id>-state.json: modified-file manifest, branch + unpushed count, live snapshot vNNN, deploy/test commands, open todos, and any active orchestration slug + round. ≤4 KB, ~170 ms, fail-open.
② After compaction — restore
SessionStart(compact) read state.json inject as context
A sibling SessionStart hook scoped to matcher:"compact" reads the file back and emits it as additionalContext, so the freshly-compacted thread re-learns exactly what it was doing — including which orchestration round it's in.

Why this is the backbone (your pick): it neutralizes the one thing you can't control (the trigger point). Auto-compact firing at an awkward moment — mid-iteration, mid-orchestration — stops being a data-loss event and becomes a no-op. Your 25/40 rule stays as a courtesy trigger, not a load-bearing one.

The change set 8

Concrete edits, ranked by reliability tier. Nothing here depends on undocumented behavior.

#ChangeWhereWhyTier
1PreCompact state-dump hooknew ~/.claude/hooks/precompact-state-dump.py + register under PreCompact in settings.jsonThe durable half of the bridge. Captures what a generic summary drops — critically, your deferred-git modified-file list.HOOK
2SessionStart(compact) restore hooknew ~/.claude/hooks/postcompact-restore.py + register SessionStart matcher:"compact"The reactive half. Re-injects the state file so the compacted thread continues without amnesia.HOOK
3Native # Compact instructions block~/.claude/CLAUDE.md — new section; retire the redundant prompt-type PreCompact hookDocumented mechanism that shapes the summary itself. Belt to the hook's suspenders; one source of the preservation list.INSTRUCTION
4Sonnet compaction threshold~/.claude/CLAUDE.md — extend the Auto-Compaction sectionToday the rule is Opus/Fable-only; your subagents + long iteration sessions are Sonnet. Add a high (~70%) Sonnet line.INSTRUCTION
5Subagent return-size contract~/.claude/CLAUDE.md — Subagent-routing section"≤500-token summary + file paths, never raw dumps/base64; >5 KB → write to disk, return path." Stops 7+ verbose agents hard-locking the parent.INSTRUCTION
6Fan-out / depth caps~/.claude/CLAUDE.md — Subagent-routing section"Fan out one level only; ≤4 parallel; depth target 2, max 3; model-tier the chain." Matches your contention memory.INSTRUCTION
7SubagentStop audit hooknew ~/.claude/hooks/subagent-stop-logger.py + register SubagentStopLogs each spawned agent's id/transcript/status to a jsonl. Gives plan-orchestrate & bnf-qa-ultra a debug trail without re-reading transcripts. Low risk.HOOK
8turn-nudge model-awareness (optional)~/.claude/hooks/turn-nudge.pyTighter cadence + escalating message on Opus sessions. A louder courtesy backstop below the 40% mandatory line.HOOK

Orchestration hardening — where it actually breaks for you

The documented #1 nested-agent failure: the parent auto-compacts mid-chain and "forgets" which round it's in, then re-starts or misroutes the next spawn. You run two skills exposed to this.

plan-orchestrate
Add to the orchestration loop: "Compact only at round boundaries, never mid-round. On resume after compaction, re-read the file-bus (bundle_state.json, round-N-*.json) to recover slug + round + last-subagent before acting." The state-dump hook already snapshots slug+round, so resume is doubly covered.
bugfix-newfeature-qa-ultra
Same checkpoint, keyed to run_state.json: "Before each builder/QA re-spawn, if context is heavy run /compact (state is preserved); never compact mid-task. Resume by re-reading run_state.json." Keeps the human-gate verdicts and PENDING_REVIEW state intact across a compaction.

What we deliberately do not do honesty

✗ No reliance on CLAUDE_AUTOCOMPACT_PCT_OVERRIDE. The community uses this undocumented env var to force early auto-compaction, but it's capped at ~83% and recent bug reports show it's silently ignored on Opus 1M-context sessions — your default model. We may try it as a verify-via-/context experiment, but the plan must not depend on it.
✗ We do not disable auto-compaction. It's the floor that prevents a hard "prompt too long" wall. With the safety-net bridge in place, letting it fire is safe — disabling it would just trade automatic safety for manual babysitting.
✗ No agent-teams / deep-nesting machinery. Experts (Cognition) warn against fanning out interdependent work; your skills already use the right pattern (file-bus + flat-ish fan-out). We harden what exists rather than add a heavier orchestration layer.

Rollout order

Phase 1 · Hooks
#1 dump#2 restore#7 audit
Write the three scripts, register in settings.json. Fail-open everywhere — a hook error must never block a compaction or a session.
Phase 2 · CLAUDE.md
#3 compact-instr#4 Sonnet#5/#6 agents
One careful edit pass: add the native preservation block, the Sonnet threshold, the return-size + fan-out rules; retire the redundant prompt-hook.
Phase 3 · Skills + verify
#8 nudgeorchestrationtest
Patch the two SKILL.md loops, then run a throwaway /compact and confirm the state file appears and restores.

Verification