
VERIFY — Independent Eyes

May 2026 • 9 min

Tier 2 → Tier 3 Architecture Seed Agent OPEVC Phases Verify

By Hadi Nayebi & Claude Opus 4.7


Essay 6.6 — The Markov Phasic Brain, Part 6 of 10.

Essay 6.5 closed with EXECUTE handing forward — the project files written, the execution notes filed in the working CLAUDE.md, the cycle’s deliverable on disk. VERIFY is the phase that decides whether the deliverable is right.

VERIFY is the cycle’s independent-eyes phase. Where EXECUTE built the work, VERIFY judges it against the plan — and the same agent that built the work is not allowed to fix what VERIFY finds. The cognitive failure VERIFY prevents is self-verification bias: the hand that built the code wants to see the code as correct, the same way a writer who proofreads their own draft misses errors a stranger would catch instantly. The prototype’s answer is a separate phase with a separate tool surface, a separate cognitive posture, and a roster of read-only auditor subagents whose entire job is to evaluate one slice of the cycle’s work and report back. If something fails, VERIFY’s only legal response is to route the cycle backward — to EXECUTE for a small fix, to PLAN for a design correction, to OBSERVE for a context reset. VERIFY cannot quietly amend the artifact it just judged.

The sources are the cycle itself. VERIFY reads the plan document the cycle was supposed to satisfy, reads the altered-list CLAUDE.md files for the execution notes and per-directory context, walks the commit graph for the cycle’s history, and inspects the project files EXECUTE produced. It dispatches its own family of subagents: a verify-observe-auditor that evaluates whether OBSERVE gathered enough context, a verify-plan-auditor that evaluates whether the plan was buildable, a verify-execute-auditor that evaluates implementation fidelity, a verify-git-historian that evaluates checkpoint cadence, and a verify-code-evolution-tracker that evaluates the change’s structural quality. The family is deliberately composed so that no single perspective dominates the verdict.

The write side is the tightest in the cycle, but it is not absent, and the common shorthand "VERIFY cannot edit files" oversimplifies it. VERIFY cannot edit project source: code, scripts, and configuration are off-limits at the tool boundary. Inside the brain the authority is scoped. VERIFY can append findings to CLAUDE.md files beneath its own footer anchor ---Ve---. Because ---Ve--- is the lowest anchor in the cascading-downward enforcement that all phase guards share, VERIFY’s CLAUDE.md authority bottoms out at its own section; there is no lower section it could cascade into. The other phases’ footers stay above the ---Ve--- marker and therefore stay read-only to VERIFY. VERIFY can refine the plan document while the plan is in its drafting state. It can update memory files. And it can ask the user to approve the plan via the registered [PLAN-APPROVAL] or [YAML-APPROVAL] prefix, a phase-gated authority no other phase carries. The scoping is what makes VERIFY structurally trustworthy: VERIFY can only report and route, never silently fix.

Pacing follows the cycle’s shared shape. The multiplier sentinel locks every tool at phase entry until the agent commits to a depth, the same prevent-silent-skipping rationale that gates OBSERVE, PLAN, and EXECUTE. Smaller numbers declare a deeper phase: a multiplier of 0.5 means each action awards half the base points, so twice as many actions are needed to fill the 100-point phase-transition bucket. Larger numbers (2 to 3) award proportionally more, so the bucket fills with fewer actions: a faster, surface-level pass. Point gates pace the read/write rhythm, and a direct-action budget biases the main session toward auditor-subagent dispatch.
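The multiplier arithmetic above can be sketched in a few lines. This is a minimal illustration, not the prototype's code: the 100-point bucket comes from the essay, while `base_points` (the per-action award) is a hypothetical value chosen only to make the numbers concrete, and every action is assumed to award the same amount.

```python
import math

BUCKET_POINTS = 100  # phase-transition bucket size, as stated in the essay


def actions_to_fill(multiplier: float, base_points: float = 10.0) -> int:
    """Count the equally weighted actions needed to fill the bucket.

    base_points is a hypothetical per-action award; the essay does not
    state the real value.
    """
    if multiplier <= 0:
        raise ValueError("multiplier must be positive")
    return math.ceil(BUCKET_POINTS / (base_points * multiplier))


# A smaller multiplier means a deeper phase (more actions before transition).
for m in (0.5, 1.0, 2.5):
    print(m, actions_to_fill(m))
```

Under these assumptions a multiplier of 0.5 needs twice as many actions as 1.0, and 2.5 needs far fewer, which is the depth-versus-speed trade the sentinel forces the agent to declare up front.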

The Bash whitelist is the tightest of any phase: scripts under tests/ and scripts/, read-only git commands, and a handful of plugin-published query scripts. No git add, no shell writes, no package managers.
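A whitelist of this shape could be checked roughly as follows. The sketch is an assumption throughout: the git subcommand set, the operator list, and the function name `verify_allows` are illustrative, the plugin-published query scripts are left out because the essay does not name them, and the real guard's matching rules are not given.

```python
import shlex

# Hypothetical allowlist entries; the prototype's actual patterns are not
# described in the essay.
GIT_READ_ONLY = {"log", "diff", "show", "status"}
SCRIPT_DIRS = ("tests/", "scripts/")
SHELL_OPERATORS = (">", "|", ";", "&", "`", "$(")


def verify_allows(command: str) -> bool:
    """Return True only for commands on the illustrative VERIFY whitelist."""
    # Conservative: reject any redirection or chaining, even inside quotes.
    if any(op in command for op in SHELL_OPERATORS):
        return False
    try:
        tokens = shlex.split(command)
    except ValueError:  # unbalanced quotes and similar
        return False
    if not tokens:
        return False
    head = tokens[0]
    if head == "git":
        return len(tokens) > 1 and tokens[1] in GIT_READ_ONLY
    # Scripts must live under tests/ or scripts/ with no path traversal.
    return head.startswith(SCRIPT_DIRS) and ".." not in head


print(verify_allows("git log --oneline"))
print(verify_allows("git add ."))
```

The design choice worth noting is default-deny: anything the function does not recognize is rejected, so package managers and shell writes fail without needing their own rules.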

The rest of this essay opens VERIFY’s coupled authorities, walks the scoped edit surfaces, names the auditor family, and traces the outcomes — clean pass, minor fix, deep reset, plan-state flip.

Coupled authorities

VERIFY carries coupled authorities the other phases don’t. It is the only phase that can edit the plan files in their respective drafting stages (the .md while plan_state is drafting, the .yaml while plan_state is yaml_drafting), and it is the only phase that can ask the user to approve them.

The approval mechanism is mechanical: VERIFY asks via AskUserQuestion with a [PLAN-APPROVAL] or [YAML-APPROVAL] prefix, and on a "yes" the phase calls plan.sh approve-md or plan.sh approve-yaml, flipping plan_state forward. The prefixes are phase-gated: they are rejected anywhere outside VERIFY.
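The gate described above might look something like this. It is a sketch under stated assumptions: the prefixes and the plan.sh subcommands come from the essay, but the function `approval_command`, the way phases are passed as strings, and the bare `plan.sh` invocation (no path, no extra arguments) are hypothetical. The function only returns the command to run; it does not execute anything.

```python
from typing import List, Optional

# Prefix-to-command mapping taken from the essay; the exact argv is assumed.
APPROVAL_COMMANDS = {
    "[PLAN-APPROVAL]": ["plan.sh", "approve-md"],
    "[YAML-APPROVAL]": ["plan.sh", "approve-yaml"],
}


def approval_command(phase: str, question: str, answer: str) -> Optional[List[str]]:
    """Return the command to run after the user's answer, or None."""
    prefix = next((p for p in APPROVAL_COMMANDS if question.startswith(p)), None)
    if prefix is None:
        return None  # not an approval question at all
    if phase != "VERIFY":
        # The prefixes are phase-gated: reject them anywhere else.
        raise PermissionError(f"{prefix} is only accepted in VERIFY")
    if answer.strip().lower() != "yes":
        return None  # no approval, so plan_state does not flip
    return APPROVAL_COMMANDS[prefix]


print(approval_command("VERIFY", "[PLAN-APPROVAL] Approve the plan?", "yes"))
```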

Both authorities flow from the same role: VERIFY is the cycle’s final guardrail, the phase that refines the artifact and then stands between it and the state flip that promotes it.

VERIFY’s scoped edit authority

A common shorthand says "VERIFY cannot edit files." It is half-right. The precise rule is that VERIFY cannot edit project source: code, scripts, configuration, anything outside the brain. Inside the brain, VERIFY’s write authority is scoped, not absent, and the scope is bounded to three surfaces.

The first is the plan files in their respective drafting stages, just covered: the .md while plan_state is drafting, the .yaml while plan_state is yaml_drafting, each scoped to the focused job’s declared plan_file. Edits to any other plan, or to a plan in any other state, are rejected at the tool boundary.
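The plan-file rule can be written as a small predicate. This is only a sketch: the state names drafting and yaml_drafting are from the essay, but the function `may_edit_plan` is hypothetical, and it assumes the .md and .yaml variants share a path stem with the declared plan_file, which the essay does not confirm.

```python
from pathlib import PurePosixPath

# Which file extension is editable in which plan_state (from the essay).
EDITABLE_SUFFIX = {"drafting": ".md", "yaml_drafting": ".yaml"}


def may_edit_plan(target: str, focused_plan_file: str, plan_state: str) -> bool:
    """True if VERIFY may edit `target` under the illustrative scope rule."""
    suffix = EDITABLE_SUFFIX.get(plan_state)
    if suffix is None:
        return False  # any other state: plan files are read-only
    target_path = PurePosixPath(target)
    focused = PurePosixPath(focused_plan_file)
    # ASSUMPTION: both variants share the focused plan's path stem.
    same_plan = target_path.with_suffix("") == focused.with_suffix("")
    return same_plan and target_path.suffix == suffix


print(may_edit_plan("plans/marker.md", "plans/marker.md", "drafting"))
print(may_edit_plan("plans/other.md", "plans/marker.md", "drafting"))
```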

The second is the altered-list CLAUDE.md files, the same scope EXECUTE just finished writing into, inherited by VERIFY at phase entry. The agent appends its pass/fail findings to those CLAUDE.md files, but the edit lands strictly below ---Ve---. Cross-section edits (attempts to amend the footer records of OBSERVE, PLAN, or EXECUTE) are rejected by the same shared section-check library every phase guard uses. VERIFY cannot rewrite history; it can only file its report under its own anchor.
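One way to picture the section check is as a comparison of the file before and after the edit. The sketch below is an assumption, not the shared library itself: it treats ---Ve--- as a line of its own and accepts an edit only if every line up to and including that anchor is byte-for-byte unchanged. The function name `edit_stays_below_anchor` is hypothetical.

```python
VERIFY_ANCHOR = "---Ve---"  # footer anchor named in the essay


def edit_stays_below_anchor(before: str, after: str, anchor: str = VERIFY_ANCHOR) -> bool:
    """True if the edit leaves everything above and including the anchor intact."""
    before_lines = before.splitlines()
    after_lines = after.splitlines()
    if anchor not in before_lines:
        return False  # no anchor means no section VERIFY is allowed to write in
    cut = before_lines.index(anchor) + 1
    return after_lines[:cut] == before_lines[:cut]


original = "# Working notes\nexecution notes from EXECUTE\n---Ve---\n"
appended = original + "PASS: round-trip check\n"
rewritten = original.replace("execution notes", "revised notes") + "PASS\n"

print(edit_stays_below_anchor(original, appended))   # edit below the anchor
print(edit_stays_below_anchor(original, rewritten))  # touches an earlier section
```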

The third is the memory files, the user-side persistence layer under .claude/projects/.../memory/. Memory edits land in any phase, including VERIFY, because the memory layer is the always-on persistence the seed agent uses for cross-cycle and cross-session learning; the phase-specific guards explicitly exempt memory paths from their scope checks.

The right shorthand, then: VERIFY’s edit authority is scoped. It can refine the plan, file its findings under its own footer, and update the memory layer. It cannot touch project source, cannot create new CLAUDE.md files, and cannot rewrite the prior phases' footer records. The scope is what makes VERIFY’s verdict structurally trustworthy — VERIFY cannot quietly fix what it found wrong; it can only report and route.

The auditor family

Auditor subagents are read-only researchers, each scoped narrowly to one slice of the cycle. The prototype ships an initial set, one per cycle slice, and the set grows as new perspectives prove worth auditing.

One re-walks the working memory the OBSERVE phase produced and asks whether it gave PLAN enough to work with. One judges the plan itself: were the items specific enough to implement without guessing, were the acceptance criteria testable, would another agent have reached the same implementation from these instructions, or did the plan leave gaps that forced EXECUTE to improvise?

One walks the cycle’s commit graph and asks whether the checkpoints tell a coherent story. One compares what was built against what the plan specified, flagging over-engineering, missed items, and silent deviations. One assesses the quality of the change itself: scope discipline, edit patterns, structural consistency.

A family of perspectives, deliberately composed so that no single one dominates the verdict. None of them may fix what they find; they only report.
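The report-only dispatch pattern can be sketched with ordinary thread-pool fan-out. Everything here is illustrative: the auditor names are from the essay, but the `Finding` record, the `dispatch` function, the snapshot keys, and the two stand-in auditor bodies are hypothetical, and real subagents would be separate LLM sessions rather than Python callables. The read-only snapshot is a loose analogy for the auditors having no write tools.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from types import MappingProxyType
from typing import Callable, Dict, List, Mapping, Tuple


@dataclass(frozen=True)
class Finding:
    auditor: str
    passed: bool
    note: str


Auditor = Callable[[Mapping[str, object]], Tuple[bool, str]]


def dispatch(auditors: Dict[str, Auditor], snapshot: Mapping[str, object]) -> List[Finding]:
    """Run every auditor against the same read-only snapshot and collect reports."""
    frozen = MappingProxyType(dict(snapshot))  # top-level writes raise TypeError
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(fn, frozen) for name, fn in auditors.items()}
        return [Finding(name, *future.result()) for name, future in futures.items()]


# Hypothetical stand-ins for two of the named auditors.
def plan_auditor(cycle):
    criteria = cycle["acceptance_criteria"]
    return (len(criteria) > 0, f"{len(criteria)} acceptance criteria found")


def git_historian(cycle):
    return (cycle["checkpoint_commits"] >= 1, "checkpoint cadence checked")


findings = dispatch(
    {"verify-plan-auditor": plan_auditor, "verify-git-historian": git_historian},
    {"acceptance_criteria": ["round-trip check"], "checkpoint_commits": 3},
)
failed = [f.auditor for f in findings if not f.passed]
print(failed or "all auditors passed")
```

Each auditor returns a finding and nothing else; deciding what to do with a failure stays with the main VERIFY session, which is the separation the essay argues for.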

The reason VERIFY is its own phase is the same reason a compiler is not the same process as the programmer. Self-verification is biased. The hand that built the code wants to see the code as correct. A separate phase, with separate tools, run under a separate cognitive posture and frequently delegated to subagents for independence, gives the verification an honest chance to catch what execution missed.

The outcomes

VERIFY produces a structured pass/fail report against the plan’s acceptance criteria. The report goes into the plan document and into the working CLAUDE.md.

The outcomes split into two kinds: routing outcomes (forward to CONDENSE; backward to a prior phase for a small fix or a deeper reset) and the plan-state-flip outcome, a state change on the plan itself, requested from the user and recorded on the job. The flip is not the end of the plan; it just promotes its stage. The plan’s actual ending is a separate signal, seal-plan, fired later, when the plan has nothing left to teach.

If everything passes, the orchestrator advances the job to CONDENSE.

If something minor fails (a typo, a missed import, a comment that wasn’t updated), VERIFY transitions backward to EXECUTE, which gets a chance to fix the small thing and re-checkpoint. The plan is not amended; only the implementation is corrected.

If something major fails (an acceptance criterion that the plan can’t meet, a discovery that the observation was incomplete), VERIFY transitions backward all the way to PLAN or OBSERVE, whichever the failure points at. The orchestrator records the rollback as part of the job’s history. The cycle restarts with the lesson learned.

If the plan is mature, VERIFY asks the user: AskUserQuestion with a [PLAN-APPROVAL] or [YAML-APPROVAL] prefix, gated to fire only from this phase. On a "yes," VERIFY calls plan.sh approve-md or plan.sh approve-yaml, plan_state flips forward, and the cycle advances to CONDENSE the same way a clean pass does. The state change is recorded on the plan, not on the cycle’s verdict; the plan stays alive, promoted to its next stage. Only seal-plan, fired later when the plan has nothing left to teach, actually retires it.
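The four outcomes reduce to a small routing table. The sketch below is a reading of the essay, not the orchestrator's code: the `Outcome` labels and the `Route` record are hypothetical names, and the table assumes a design-level failure goes to PLAN and a context-level failure goes to OBSERVE.

```python
from enum import Enum
from typing import NamedTuple


class Outcome(Enum):
    PASS = "pass"
    MINOR_FIX = "minor fix"
    DESIGN_FLAW = "design flaw"
    CONTEXT_GAP = "context gap"
    PLAN_APPROVED = "plan approved"


class Route(NamedTuple):
    next_phase: str
    backward: bool
    flips_plan_state: bool


ROUTES = {
    Outcome.PASS: Route("CONDENSE", backward=False, flips_plan_state=False),
    Outcome.MINOR_FIX: Route("EXECUTE", backward=True, flips_plan_state=False),
    Outcome.DESIGN_FLAW: Route("PLAN", backward=True, flips_plan_state=False),
    Outcome.CONTEXT_GAP: Route("OBSERVE", backward=True, flips_plan_state=False),
    # Approval advances like a clean pass, but also promotes the plan's stage.
    Outcome.PLAN_APPROVED: Route("CONDENSE", backward=False, flips_plan_state=True),
}


def route(outcome: Outcome) -> Route:
    """Look up where the cycle goes next for a given VERIFY outcome."""
    return ROUTES[outcome]


print(route(Outcome.MINOR_FIX))
```

Note what the table leaves out: classifying a failure as minor, design-level, or context-level is still the agent's judgment, so the lookup is only as honest as the label fed into it.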

Situational edges and the Markov property

The backward edges are situational rather than a fixed menu: VERIFY rolls back to whichever prior phase the failure actually points at. The approval-flip outcome is not a routing at all but a state change on the plan. It is this discipline of choosing where to go, edge by edge, that gives the phasic layer its name.

Forward transitions are automatic when the gate criteria are met; backward transitions are explicit, and the agent has to choose where to roll back to. The state of the cycle is fully captured in the orchestrator’s data file: current phase, cycle number, multiplier, point counter, and a few transition flags (pre-gmode stash, suppress-increment, forwarded). No hidden continuation.

Any phase can be re-entered, but only by rolling back along defined edges. This is the Markov property the title leans on: the cycle’s next move is a function of its present state, not of the path that got it there.
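The Markov claim can be made concrete with a transition function whose only input is the present state. This is a sketch under assumptions: the fields mirror the ones the essay lists for the orchestrator's data file, but the `CycleState` class, the `step` function, the reset of multiplier and points on every transition, and the restriction of backward edges to the three VERIFY edges are all illustrative choices, not the orchestrator's documented behavior.

```python
from dataclasses import dataclass, replace
from typing import Optional

PHASES = ["OBSERVE", "PLAN", "EXECUTE", "VERIFY", "CONDENSE"]
# Only VERIFY's backward edges are described in the essay.
BACKWARD_EDGES = {"VERIFY": {"EXECUTE", "PLAN", "OBSERVE"}}
BUCKET_POINTS = 100


@dataclass(frozen=True)
class CycleState:
    phase: str
    cycle: int
    multiplier: Optional[float]  # None models the unset sentinel
    points: float
    forwarded: bool = False  # one of the transition flags the essay names


def step(state: CycleState, rollback_to: Optional[str] = None) -> CycleState:
    """Compute the next state from the present state alone (no history)."""
    if rollback_to is not None:
        if rollback_to not in BACKWARD_EDGES.get(state.phase, set()):
            raise ValueError(f"no backward edge {state.phase} -> {rollback_to}")
        # ASSUMPTION: phase entry resets the sentinel and the point counter.
        return replace(state, phase=rollback_to, multiplier=None, points=0.0)
    index = PHASES.index(state.phase)
    if state.points >= BUCKET_POINTS and index < len(PHASES) - 1:
        return replace(state, phase=PHASES[index + 1], multiplier=None, points=0.0)
    return state  # gate not met: stay in the current phase


current = CycleState("VERIFY", cycle=3, multiplier=2.5, points=100.0)
print(step(current).phase)
print(step(current, rollback_to="PLAN").phase)
```

Because `step` never looks at how the cycle reached `current`, two jobs with identical data files behave identically from that point on, which is the property the essay's title refers to.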

Image pending: VERIFY’s three backward edges and one forward edge. Backward edges run from verify to execute ("minor fix"), to plan ("design flaw"), and to observe ("context gap"); the forward edge runs from verify to condense ("pass").

Image 6.6. Backward transitions are explicit choices, not automatic fallbacks.


A worked example

The multi-cycle plan-job from the prior sub-essays reaches the VERIFY phase of its third cycle. The agent has reverted and re-authored the marker schema in EXECUTE; the checkpoint commits are clean; the orchestrator advances the job to VERIFY.

The agent enters VERIFY. The multiplier sentinel is unset; every tool is locked. The agent forecasts a shorter phase than EXECUTE because the work is read-and-judge rather than write, and picks 2.5 on the multiplier. The lock lifts.

The agent dispatches the auditor family in parallel. The verify-execute-auditor reads the cycle’s commit graph and the altered-list diff. The verify-plan-auditor re-reads the .md plan and compares its acceptance criteria against the actual code change. The verify-observe-auditor walks the OBSERVE synthesis the cycle inherited from cycle 3’s observation. The verify-git-historian audits the checkpoint cadence. None of them is allowed to fix what they find.

The verify-plan-auditor surfaces a gap. The cycle’s acceptance criteria, written by PLAN, list a round-trip check for the new marker schema. The agent ran that check and it passed. But the auditor also notices that the .md plan never declared a criterion for backward compatibility with cycle-2 records: the code is already on disk, the marker schema now has the new shape, and the old records can no longer be parsed. The acceptance criteria the cycle inherited were under-specified. Execution did exactly what was asked; the asking was incomplete.

The agent has a decision to make. The judgment-call criterion PLAN named ("if revert breaks downstream cycle-2 work, route a [PENDING-JOB] for CONDENSE") partially covers this, but the under-specification is upstream of EXECUTE: PLAN missed it, not EXECUTE. The agent picks the major-fix path. It writes the auditor’s finding into the working CLAUDE.md, below ---Ve---; it edits the .md plan to add the backward-compat criterion (allowed, because plan_state is drafting and the file is the focused plan_file); it commits the verify checkpoint with --backward plan; the orchestrator routes the cycle back to PLAN.

The cycle does not lose its history. The rollback is recorded. PLAN inherits the sharpened criteria; the agent re-plans the backward-compat work alongside the marker schema work; the cycle continues with a tighter contract.

What you would customize

VERIFY is the cycle’s last word, and the architect has surfaces to bend.

The architect would tune the auditor roster. The prototype ships an initial set (observe-auditor, plan-auditor, execute-auditor, git-historian, code-evolution-tracker) calibrated to seed-agent work where the artifacts are code and plans. A legal-research seed would swap in citation-checkers and precedent-comparison auditors. A finance-analysis seed would swap in number-reconciliation and filing-cross-check auditors. The auditor family is the surface; the perspectives it composes are yours.

The architect would tune the approval-question wording. [PLAN-APPROVAL] and [YAML-APPROVAL] are the prototype’s gates because the prototype’s plans go through a two-stage .md-then-.yaml lifecycle. A simpler seed might collapse approval to a single prefix; a richer one might add [DRAFT-APPROVAL], [REVIEW-APPROVAL], and [RELEASE-APPROVAL] for staged sign-offs. The question-discipline registry takes the new prefixes; the verify-side handler maps them to the right plan_state transition.

The architect would tune the plan-edit scope rules. The current scope is tight: VERIFY can only edit the focused job’s plan_file and only when its state allows. A seed wanting looser refinement could widen the scope to any plan in drafting. A seed wanting tighter discipline could add per-section locks (acceptance criteria are append-only; goal is immutable once cycle 1 commits).

What the architect would not customize is the inability to edit project source. The principle is the floor: a verification phase that lets the hand that built the code also rewrite it is not verification, it is hand-washing.

A consulting practice could install the same separation. The engagement-review agent reads the deliverable, scores it against the proposal’s acceptance criteria, and asks the partner for sign-off via a structured question, but cannot edit the deliverable itself. Only the build-side agent can act on the review.

The honest design limit is worth naming: the auditor’s verdict is LLM-interpreted judgment, not mathematics. Backward-versus-forward routing depends on the agent honestly classifying severity; the user-approval gate depends on the user actually reading. The structural separation is the enforcement substrate, not a certainty guarantee, and gmode is the documented operator escape when the right move is to skip the gate. The phase makes self-validation harder; the human is still on the hook for what gets shipped.

The deepest payoff of VERIFY is the cognitive failure mode it prevents: the self-validating delivery. An agent ships work it just built and declares it correct because it has read the diff, but reading its own output takes the same cognitive posture as writing it. Same LLM, same prior, same blind spots. The structural separation (a phase with different tools, a different scope, frequently delegated to subagents with no execution history) gives the agent an honest chance to find its own work wrong. The friction is the pedagogy. The phase is the compartment.

VERIFY’s verdict closes the work-on-the-project arc. What CONDENSE does next is different in kind — work on the brain itself. Next.