
The Dual Voice Architecture

May 2026 • 5 min

Tier 2 Architecture Seed Agent Plugins Plugin Kit Voice

By Hadi Nayebi & Claude Opus 4.7


Essay 7.3 — The Plugin Kit, Part 3 of 9.

Essay 7.2 opened the universal skeleton (CLAUDE.md, hooks, scripts) and named the PLUGIN-LOCK ceremony that gates every edit to them. This sub-essay opens the organ that almost every plugin doubles: voice.xml. The hooks-side and scripts-side surfaces share one XML schema but serve different audiences. Confusing them is the most common new-user error in plugin authoring.

`voice.xml` × Two — The Dual Surface

This is the organ that confuses new users most, and the one where the relational anatomy matters most.

The two voice surfaces are different files with different audiences. hooks/voice.xml and scripts/voice.xml are nearly identical in structure (both XML, both with <coaching>, <block>, <info>, and <entry> elements identified by id), but each is read by a different consumer.

`hooks/voice.xml` — The Agent-Facing Surface

What it is. Strings the plugin emits at hook fire-time. Coaching voices that nudge the agent at a specific moment (entering a phase, crossing a context tier, having just dispatched a subagent). Blocks that refuse the agent when a guard fires.

Who reads it. The LLM agent. Voices delivered via hooks land in the agent’s context window, either as soft injections via additionalContext JSON (coaching) or as exit-2 stderr refusals (blocks). The LLM sees the string mid-turn and reacts.

Who writes it. Mostly CONDENSE step 4 (which consumes the [VOICE-UPDATE] markers that other phases emit). The historian subagent also updates voice files when the plugin’s evolution requires new coaching. PLUGIN-LOCK is the underlying gate for any edit.

What it depends on. The shared helper in lib/voice-helper/, specifically the get_voice function, which loads voice.xml entries and substitutes {{var}} placeholders.
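To make the load-and-substitute step concrete, here is a minimal Python sketch of what a get_voice-style helper might do. It is not the plugin kit's actual implementation: the sample file layout, the root element name, the voice ids, and the choice to leave unknown placeholders untouched are all assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical voice.xml content; the real file layout may differ.
SAMPLE_VOICE_XML = """
<voice>
  <coaching id="phase-entry">Entering {{phase}}. Re-read the plan before editing.</coaching>
  <block id="locked-file">Refused: {{path}} is gated by PLUGIN-LOCK.</block>
</voice>
"""

# Matches {{name}} placeholders.
PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def get_voice(xml_text: str, voice_id: str, **variables: str) -> str:
    """Return the text of the entry with the given id, with {{var}} filled in."""
    root = ET.fromstring(xml_text.strip())
    for element in root.iter():
        if element.get("id") == voice_id:
            template = (element.text or "").strip()
            # Assumption: placeholders with no supplied value are left as-is.
            return PLACEHOLDER.sub(
                lambda m: str(variables.get(m.group(1), m.group(0))), template
            )
    raise KeyError(f"no voice entry with id {voice_id!r}")


print(get_voice(SAMPLE_VOICE_XML, "phase-entry", phase="CONDENSE"))
```

Because lookup is by id alone, the same function can serve either voice file; only the path passed in by the caller would change.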

`scripts/voice.xml` — The Operator-Facing Surface

What it is. Strings the plugin’s CLI prints to the operator’s terminal. When safe-lock.sh reverts after a test failure, the operator sees a short status line in their terminal. That text is what the operator reads in the shell.

Who reads it. The human operator, as terminal output. These strings are not injected into the LLM’s context; they are printed to the shell’s stderr or stdout, visible only to the person at the keyboard.

Who writes it. Same ceremony as hooks/voice.xml.

What it depends on. Same voice-helper.

Why the split matters

Same intent: both surfaces carry the plugin’s voice. Different audiences: the LLM consumes structured paragraphs that frame the situation and propose action; the human consumes status lines that flag what happened. Wording often differs between the two for the same conceptual event. Auditor scripts that grep for voice ids have to look in both files; an audit that checks only one file reports a false positive for every id that lives in the other.
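The auditing point can be sketched in a few lines of Python. This is an illustrative check, not one of the plugin kit's auditor scripts: the inline XML samples, the voice ids, and the function names are hypothetical, and a real auditor would read hooks/voice.xml and scripts/voice.xml from disk.

```python
import xml.etree.ElementTree as ET

# Hypothetical contents of the two voice files.
HOOKS_XML = """<voice>
  <coaching id="phase-entry">...</coaching>
</voice>"""

SCRIPTS_XML = """<voice>
  <info id="revert-done">...</info>
</voice>"""


def voice_ids(xml_text: str) -> set[str]:
    """Collect every id attribute found in one voice file."""
    root = ET.fromstring(xml_text)
    return {el.get("id") for el in root.iter() if el.get("id") is not None}


def missing_ids(referenced: set[str], *voice_files: str) -> set[str]:
    """Return referenced ids that none of the given voice files define."""
    defined: set[str] = set()
    for xml_text in voice_files:
        defined |= voice_ids(xml_text)
    return referenced - defined


referenced = {"phase-entry", "revert-done"}
# Auditing hooks only wrongly flags the id that lives on the scripts side.
print(missing_ids(referenced, HOOKS_XML))
# Auditing both files reports nothing missing.
print(missing_ids(referenced, HOOKS_XML, SCRIPTS_XML))
```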

Image pending: coaching (soft, probabilistic) on the left, block (hard, deterministic) on the right, with a curving migration arrow between them.

Image 7.3. Soft layer coaches. Hard layer refuses. Patterns migrate left to right when data warrants.

Soft vs hard at the element level. Inside each voice.xml, the <coaching> element produces a context injection, which is probabilistic: the LLM may absorb or ignore it. The <block> element produces a refusal: exit-2 stderr that fails the agent’s tool call. The two element types coexist in the same file. The Lock-13 over-engineering veto says that new behavioral controls start as coaching; only when measurement shows that coaching consistently fails does the control harden into a block.
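The difference between the two delivery paths can be shown as a small Python sketch. It models what a hook might return for each element type rather than reproducing the plugin kit's hook code. The HookResult type and the deliver function are hypothetical, and the JSON envelope is an assumption: the essay names additionalContext but does not specify the surrounding structure.

```python
import json
from dataclasses import dataclass


@dataclass
class HookResult:
    exit_code: int
    stdout: str
    stderr: str


def deliver(kind: str, text: str) -> HookResult:
    """Map a voice element type to a delivery: soft injection or hard refusal."""
    if kind == "coaching":
        # Soft layer: exit 0 and offer the text as extra context.
        # Assumption: a bare additionalContext key; the real envelope may be richer.
        return HookResult(0, json.dumps({"additionalContext": text}), "")
    if kind == "block":
        # Hard layer: exit 2 with the refusal text on stderr.
        return HookResult(2, "", text)
    raise ValueError(f"unsupported voice element type: {kind!r}")


soft = deliver("coaching", "Re-read the plan before editing.")
hard = deliver("block", "Refused: this file is gated by PLUGIN-LOCK.")
print(soft.exit_code, soft.stdout)
print(hard.exit_code, hard.stderr)
```

A real hook script would write stdout and stderr to the corresponding streams and exit with exit_code. Keeping the decision in one pure function makes the eventual migration from coaching to block a one-word change in the voice file rather than a change in the hook logic.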

The yaml-injection pairing. When a job reaches stage 3 of maturation (multi-cycle with a .yaml plan, covered in Essay 8), the .yaml file’s per-phase fields pair with voice ids by convention. The yaml field name maps to a voice id directly; voice-helper appends the matched value to the rendered voice text at phase entry. New yaml fields require no parser change: add the matching voice id to the plugin’s hooks/voice.xml, and the seed agent picks up the new pairing on the next phase entry.
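The name-based pairing can be sketched as follows. This is an illustration under stated assumptions, not voice-helper's real code: the plan fields are shown as an already-parsed dict, the field names and voice texts are invented, and silently skipping fields that have no matching voice id is a guess about behavior the essay does not describe.

```python
# Hypothetical voice texts keyed by voice id (as if loaded from hooks/voice.xml).
VOICES = {
    "focus": "Focus for this phase:",
    "constraints": "Constraints to respect:",
}


def render_phase_voices(phase_fields: dict[str, str], voices: dict[str, str]) -> list[str]:
    """Pair each plan field with the voice id of the same name and append its value."""
    rendered = []
    for field, value in phase_fields.items():
        template = voices.get(field)
        if template is None:
            # Assumption: a field with no matching voice id injects nothing.
            continue
        rendered.append(f"{template} {value}")
    return rendered


# Hypothetical per-phase fields, as they might look after the .yaml plan is parsed.
phase = {"focus": "refactor the loader", "risks": "flaky integration test"}

print(render_phase_voices(phase, VOICES))

# Adding a voice id named "risks" is enough to activate the new field.
extended = {**VOICES, "risks": "Known risks:"}
print(render_phase_voices(phase, extended))
```

The lookup is a plain dictionary match on the field name, which is why a new field needs a new voice entry and nothing else.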

The new-plugin lens. When you guide your seed to add coaching for a new behavior, the seed writes the coaching string to hooks/voice.xml first (cheap, soft). If the operator later observes that the coaching consistently fails to hold, the seed authors a <block> element in the same file for the hard variant, with the same id namespace and harder enforcement. The seed never invents a third voice surface. The split between LLM-facing and operator-facing voice surfaces is structural.

A consulting firm’s seed agent could carry the same dual-voice split: hooks/voice.xml coaches the agent on deliverable-checklist enforcement; scripts/voice.xml prints terminal status to the consultant when deliverable.sh validate runs. Same schema, two surfaces (agent-facing and operator-facing), one ceremony.

The hooks-side and scripts-side surfaces share one schema. The LLM consumes the agent-facing surface; the operator consumes the operator-facing surface. The next sub-essay opens the organ that almost every plugin needs but that no plugin lets anyone else touch: the private data.json state.


Previous: Essay 7.2 — Skeleton: CLAUDE.md, Hooks, and Scripts — the universal organs governed by PLUGIN-LOCK. Next: Essay 7.4 — data.json — The Hidden State — per-plugin private state, script-mediated.
