Auto-Compaction & Nested-Agent Context

HARNESS the CLI enforces it — most reliable HOOK a script the harness runs — deterministic INSTRUCTION CLAUDE.md text — model-compliance only

The core idea — stop trying to time compaction perfectly; make it harmless whenever it fires. Your current 25%/40% rule asks the model to self-trigger, but the model cannot measure its own token % — so it's a soft suggestion. Rather than chase a perfect trigger (the one thing the harness won't let you control), we make every compaction lossless: a hook writes the session's irreplaceable state to disk before any compaction, and a second hook reads it back after. The timing stops mattering.

How Claude Code compaction actually works ground truth

Verified against current Anthropic docs + community repos. This is what's real today — and what is not controllable.

Mechanism What it is Reality / limit Tier

autoCompactEnabled On/off switch for harness auto-compaction (default on). No documented % threshold knob. The harness decides when, near the limit. You can only disable it. HARNESS

/compact [focus] Summarizes history in-place, same session; optional focus string. Fires PreCompact → compacts → SessionStart(compact). Re-injects CLAUDE.md + memory; drops un-invoked skill listings. HARNESS

PreCompact hook command type can run a script & block; prompt type nudges the summary. Cannot inject text into the summary, and cannot tell manual vs auto apart. Can write state to disk. HOOK

SessionStart matcher:"compact" Fires right after a compaction completes. The re-entry point — wire restoration here. Reactive, not preventive. HOOK

# Compact instructions in CLAUDE.md Native section telling the harness what to preserve when it summarizes. Documented + survives compaction, but model-controlled — high in practice, not guaranteed. INSTRUCTION

Sub-agents (Agent tool) Each runs in a separate context window; only the return message crosses back. Agent tool blocked inside agents by default. Hard ceiling 5 nesting levels; verbose returns can hard-lock the parent. HARNESS

Mechanism	What it is	Reality / limit	Tier
`autoCompactEnabled`	On/off switch for harness auto-compaction (default on).	No documented % threshold knob. The harness decides when, near the limit. You can only disable it.	HARNESS
`/compact [focus]`	Summarizes history in-place, same session; optional focus string.	Fires PreCompact → compacts → SessionStart(compact). Re-injects CLAUDE.md + memory; drops un-invoked skill listings.	HARNESS
PreCompact hook	`command` type can run a script & block; `prompt` type nudges the summary.	Cannot inject text into the summary, and cannot tell manual vs auto apart. Can write state to disk.	HOOK
SessionStart `matcher:"compact"`	Fires right after a compaction completes.	The re-entry point — wire restoration here. Reactive, not preventive.	HOOK
`# Compact instructions` in CLAUDE.md	Native section telling the harness what to preserve when it summarizes.	Documented + survives compaction, but model-controlled — high in practice, not guaranteed.	INSTRUCTION
Sub-agents (Agent tool)	Each runs in a separate context window; only the return message crosses back.	Agent tool blocked inside agents by default. Hard ceiling 5 nesting levels; verbose returns can hard-lock the parent.	HARNESS

The architecture — a safety-net bridge

Two small hooks turn the conversation (ephemeral) into a file (durable). The filesystem is unlimited, free context.

① Before compaction — capture

PreCompact fires→ read transcript→ write state.json

A command hook dumps the irreplaceable bits to ~/.claude/session-env/<id>-state.json: modified-file manifest, branch + unpushed count, live snapshot vNNN, deploy/test commands, open todos, and any active orchestration slug + round. ≤4 KB, ~170 ms, fail-open.

② After compaction — restore

SessionStart(compact)→ read state.json→ inject as context

A sibling SessionStart hook scoped to matcher:"compact" reads the file back and emits it as additionalContext, so the freshly-compacted thread re-learns exactly what it was doing — including which orchestration round it's in.

Why this is the backbone (your pick): it neutralizes the one thing you can't control (the trigger point). Auto-compact firing at an awkward moment — mid-iteration, mid-orchestration — stops being a data-loss event and becomes a no-op. Your 25/40 rule stays as a courtesy trigger, not a load-bearing one.

The change set 8

Concrete edits, ranked by reliability tier. Nothing here depends on undocumented behavior.

# Change Where Why Tier

1 PreCompact state-dump hook new ~/.claude/hooks/precompact-state-dump.py + register under PreCompact in settings.json The durable half of the bridge. Captures what a generic summary drops — critically, your deferred-git modified-file list. HOOK

2 SessionStart(compact) restore hook new ~/.claude/hooks/postcompact-restore.py + register SessionStart matcher:"compact" The reactive half. Re-injects the state file so the compacted thread continues without amnesia. HOOK

3 Native # Compact instructions block ~/.claude/CLAUDE.md — new section; retire the redundant prompt-type PreCompact hook Documented mechanism that shapes the summary itself. Belt to the hook's suspenders; one source of the preservation list. INSTRUCTION

4 Sonnet compaction threshold ~/.claude/CLAUDE.md — extend the Auto-Compaction section Today the rule is Opus/Fable-only; your subagents + long iteration sessions are Sonnet. Add a high (~70%) Sonnet line. INSTRUCTION

5 Subagent return-size contract ~/.claude/CLAUDE.md — Subagent-routing section "≤500-token summary + file paths, never raw dumps/base64; >5 KB → write to disk, return path." Stops 7+ verbose agents hard-locking the parent. INSTRUCTION

6 Fan-out / depth caps ~/.claude/CLAUDE.md — Subagent-routing section "Fan out one level only; ≤4 parallel; depth target 2, max 3; model-tier the chain." Matches your contention memory. INSTRUCTION

7 SubagentStop audit hook new ~/.claude/hooks/subagent-stop-logger.py + register SubagentStop Logs each spawned agent's id/transcript/status to a jsonl. Gives plan-orchestrate & bnf-qa-ultra a debug trail without re-reading transcripts. Low risk. HOOK

8 turn-nudge model-awareness (optional) ~/.claude/hooks/turn-nudge.py Tighter cadence + escalating message on Opus sessions. A louder courtesy backstop below the 40% mandatory line. HOOK

#	Change	Where	Why	Tier
1	PreCompact state-dump hook	new `~/.claude/hooks/precompact-state-dump.py` + register under `PreCompact` in `settings.json`	The durable half of the bridge. Captures what a generic summary drops — critically, your deferred-git modified-file list.	HOOK
2	SessionStart(compact) restore hook	new `~/.claude/hooks/postcompact-restore.py` + register `SessionStart` `matcher:"compact"`	The reactive half. Re-injects the state file so the compacted thread continues without amnesia.	HOOK
3	Native `# Compact instructions` block	`~/.claude/CLAUDE.md` — new section; retire the redundant `prompt`-type PreCompact hook	Documented mechanism that shapes the summary itself. Belt to the hook's suspenders; one source of the preservation list.	INSTRUCTION
4	Sonnet compaction threshold	`~/.claude/CLAUDE.md` — extend the Auto-Compaction section	Today the rule is Opus/Fable-only; your subagents + long iteration sessions are Sonnet. Add a high (~70%) Sonnet line.	INSTRUCTION
5	Subagent return-size contract	`~/.claude/CLAUDE.md` — Subagent-routing section	"≤500-token summary + file paths, never raw dumps/base64; >5 KB → write to disk, return path." Stops 7+ verbose agents hard-locking the parent.	INSTRUCTION
6	Fan-out / depth caps	`~/.claude/CLAUDE.md` — Subagent-routing section	"Fan out one level only; ≤4 parallel; depth target 2, max 3; model-tier the chain." Matches your contention memory.	INSTRUCTION
7	SubagentStop audit hook	new `~/.claude/hooks/subagent-stop-logger.py` + register `SubagentStop`	Logs each spawned agent's id/transcript/status to a jsonl. Gives plan-orchestrate & bnf-qa-ultra a debug trail without re-reading transcripts. Low risk.	HOOK
8	turn-nudge model-awareness (optional)	`~/.claude/hooks/turn-nudge.py`	Tighter cadence + escalating message on Opus sessions. A louder courtesy backstop below the 40% mandatory line.	HOOK

Orchestration hardening — where it actually breaks for you

The documented #1 nested-agent failure: the parent auto-compacts mid-chain and "forgets" which round it's in, then re-starts or misroutes the next spawn. You run two skills exposed to this.

plan-orchestrate

Add to the orchestration loop: "Compact only at round boundaries, never mid-round. On resume after compaction, re-read the file-bus (bundle_state.json, round-N-*.json) to recover slug + round + last-subagent before acting." The state-dump hook already snapshots slug+round, so resume is doubly covered.

bugfix-newfeature-qa-ultra

Same checkpoint, keyed to run_state.json: "Before each builder/QA re-spawn, if context is heavy run /compact (state is preserved); never compact mid-task. Resume by re-reading run_state.json." Keeps the human-gate verdicts and PENDING_REVIEW state intact across a compaction.

What we deliberately do not do honesty

✗ No reliance on CLAUDE_AUTOCOMPACT_PCT_OVERRIDE. The community uses this undocumented env var to force early auto-compaction, but it's capped at ~83% and recent bug reports show it's silently ignored on Opus 1M-context sessions — your default model. We may try it as a verify-via-/context experiment, but the plan must not depend on it.

✗ We do not disable auto-compaction. It's the floor that prevents a hard "prompt too long" wall. With the safety-net bridge in place, letting it fire is safe — disabling it would just trade automatic safety for manual babysitting.

✗ No agent-teams / deep-nesting machinery. Experts (Cognition) warn against fanning out interdependent work; your skills already use the right pattern (file-bus + flat-ish fan-out). We harden what exists rather than add a heavier orchestration layer.

Rollout order

Phase 1 · Hooks

#1 dump→#2 restore→#7 audit

Write the three scripts, register in settings.json. Fail-open everywhere — a hook error must never block a compaction or a session.

Phase 2 · CLAUDE.md

#3 compact-instr→#4 Sonnet→#5/#6 agents

One careful edit pass: add the native preservation block, the Sonnet threshold, the return-size + fan-out rules; retire the redundant prompt-hook.

Phase 3 · Skills + verify

#8 nudge→orchestration→test

Patch the two SKILL.md loops, then run a throwaway /compact and confirm the state file appears and restores.

Verification

Round-trip: in a scratch session, /compact manually → confirm ~/.claude/session-env/<id>-state.json is written, then confirm the post-compact context contains the restored block.
Fail-open: force a hook error (e.g. bad path) → compaction and session start still succeed; the error is logged, not raised.
No startup tax: confirm the new hooks add negligible latency to SessionStart (the audit + restore scripts are tiny, <15 s timeouts).
Orchestration resume: mid-plan-orchestrate dummy run, force a compact at a round boundary → orchestrator resumes on the correct round from the file-bus.
Return-size: spot-check that research/QA subagents return summaries + paths, not raw dumps (this thread already did — 6 agents, compact returns).