Architecture
The principles and systems that power every workflow.
Why structured workflows beat vibe coding
"Vibe coding" — giving an AI agent a loose prompt and hoping it figures things out — works for throwaway scripts. It falls apart for production code. Here's why:
Vibe coding skips discovery
An ad-hoc prompt jumps straight to code. No blast radius analysis, no existing test inventory, no knowledge base lookup. The agent doesn't know what it doesn't know — and neither do you until the PR breaks something.
Workflows enforce discovery first
5-7 discovery phases run before a single line of code changes. The agent maps the blast radius, checks the knowledge base, inventories existing tests, and defines requirements — THEN codes.
Vibe coding produces no tests
Ask an AI to "fix this bug" and you get a code change. No unit tests, no E2E tests, no regression check. You're shipping untested code and calling it "AI-assisted development."
Workflows auto-generate all test layers
TEST_GEN auto-writes unit + E2E tests. BASELINE captures pre-change health. VERIFY runs the new tests. BLAST_RADIUS_RUN catches regressions. Four mandatory test phases — none skippable.
Vibe coding has no memory
Every prompt starts from zero. The agent doesn't know that this module has a quirk, that this API times out under load, or that the last three people who touched this file introduced the same regression.
Workflows learn and remember
RECALL reads KNOWLEDGE_BASE.md before discovery begins. LEARN writes back after every workflow. CODEBASE_INSIGHTS.md captures architecture patterns. The system gets smarter with every run.
Vibe coding is untraceable
When the PR is reviewed, there's no evidence trail. Why was this approach chosen? What alternatives were considered? What tests validate it? The reviewer has to trust the AI — or re-investigate everything themselves.
Workflows produce a document chain
16+ linked documents from INTAKE to CLOSE. Every decision traces back to evidence. The PR reviewer can follow DIAGNOSIS → FIX_PLAN → TEST_GEN → VERIFY and understand exactly why every change was made.
The framework doesn't slow down AI-assisted development — it makes it trustworthy. Same speed, but with evidence, tests, and traceability that production code demands.
1. Orchestrator Loop
The orchestrator implements a state machine loop that drives the agent through each phase until a terminal signal is emitted or the agent is blocked.
- Load RULES.md
- Read STATE_FILE — current phase, subphase, attempt counter
- Load phase prompt
- Execute phase — reads upstream docs, produces output
- Process signal
- Update STATE_FILE
- Route — advance, loopback, block, or complete
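The loop above can be sketched in shell. Everything here is illustrative: the `run_phase` and `next_phase` stubs stand in for the real agent invocation and phase ordering, and the state file holds only the `phase` key for brevity.

```shell
#!/bin/sh
# Sketch of the orchestrator's state-machine loop. run_phase and
# next_phase are stubs for the real agent call and phase order.
STATE_FILE=".workflow/STATE_FILE.md"
mkdir -p .workflow
printf 'phase: DIAGNOSIS\n' > "$STATE_FILE"

run_phase() {                       # stub: execute a phase, emit its signal
  case "$1" in
    DIAGNOSIS) echo PHASE_COMPLETE ;;
    *)         echo BUG_FIXED ;;
  esac
}

next_phase() {                      # stub: fixed phase order for the sketch
  case "$1" in
    DIAGNOSIS) echo FIX_PLAN ;;
    *)         echo CLOSE ;;
  esac
}

orchestrate() {
  phase=$(sed -n 's/^phase: //p' "$STATE_FILE")   # read current state
  while :; do
    signal=$(run_phase "$phase")                  # execute, capture signal
    case "$signal" in
      PHASE_COMPLETE)                             # advance: full state rewrite
        phase=$(next_phase "$phase")
        printf 'phase: %s\n' "$phase" > "$STATE_FILE" ;;
      BLOCKED_NEEDS_HUMAN) return 1 ;;            # block: hand off to a human
      BUG_FIXED|ENHANCEMENT_SHIPPED|FEATURE_SHIPPED)
        return 0 ;;                               # terminal: workflow done
    esac
  done
}

orchestrate && echo "workflow complete"
```

With these stubs the run prints `workflow complete` and leaves `phase: FIX_PLAN` in the state file, having advanced once and then hit a terminal signal.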
2. Ralph Loops
Idempotent retry pattern that makes every phase safe to re-execute.
Idempotency Check
If output doc exists and is complete, emit PHASE_COMPLETE without re-doing work.
Attempt Counter
On failure: increment counter, rewrite state, re-run. At attempt >= 3: emit BLOCKED_NEEDS_HUMAN.
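A minimal sketch of the idempotency check and the three-strike counter together. The output path, the `status: complete` marker, and the `do_phase_work` stub (which fails once, then succeeds) are assumptions for illustration; the real completeness checks are richer.

```shell
#!/bin/sh
# Sketch: skip completed work, retry failures, block at attempt 3.
OUT_DOC=".workflow/DIAGNOSIS.md"
mkdir -p .workflow
rm -f "$OUT_DOC"

phase_complete() {                  # Rule 6: check completion before work
  [ -f "$OUT_DOC" ] && grep -q '^status: complete$' "$OUT_DOC"
}

do_phase_work() {                   # stub: fails on attempt 1, succeeds on 2
  if [ "$1" -ge 2 ]; then
    printf 'status: complete\n' > "$OUT_DOC"
  fi
}

attempt=1
until phase_complete; do
  if [ "$attempt" -ge 3 ]; then     # Rule 5: blocker at 3 attempts
    echo "BLOCKED_NEEDS_HUMAN"
    break
  fi
  do_phase_work "$attempt"
  attempt=$((attempt + 1))          # increment, then state is rewritten
done
phase_complete && echo "PHASE_COMPLETE"
```

Re-running this script after success emits PHASE_COMPLETE immediately without redoing any work, which is what makes every phase safe to re-execute.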
Loopback Signals
When later phases discover earlier analysis was wrong: REDIAGNOSE, REDESIGN, REARCHITECT, or RESLICE.
State Rewrite
Always a full `cat > STATE_FILE << 'EOF'` replacement. Never `sed` or partial in-place edits.
```shell
# Example state rewrite — always atomic, never incremental
cat > .workflow/STATE_FILE.md << 'EOF'
phase: DIAGNOSIS
subphase: root_cause_analysis
attempt: 2
completed_phases:
- INTAKE: 2026-02-27T10:00:00Z
- REPRODUCE: 2026-02-27T10:05:00Z
- RECALL: 2026-02-27T10:06:00Z
- CODE_TRACE: 2026-02-27T10:15:00Z
auto_pr: true
EOF
```
3. LISA — Layered Information State Architecture
Information organized into three layers with different lifecycles and scopes.
State File
Current phase, subphase, attempt counter, completed phases with timestamps. Single file, rewritten in full on every transition.
Phase Documents
One document per phase. Each reads upstream docs and produces exactly one output. Once written, immutable.
Knowledge Stores
Three shared files that persist across all workflow runs and grow smarter over time:
CODEBASE_INSIGHTS.md
The enterprise context document for all workflow agents. Deeply investigated (not surface-scanned) from the survey — 10 sections purpose-built for what agents need: architecture with versions and build details, every module/component/hook listed by name, complete initialization chain in execution order, test config with path mappings and commands, dependency graph with blast-radius rules, override/resolution mechanism, and 8–12 real import examples. Quality-validated by the VALIDATE phase. Adapts to any tech stack. Read by RECALL (all workflows). Written by LEARN (appends fragile areas, component consumers, API quirks). Refreshed by /refresh-insights (re-surveys repo and regenerates without a full generator run).
KNOWLEDGE_BASE.md
Starts empty. Populated by LEARN phases with reusable patterns — diagnostic approaches that worked, architecture decisions, slice strategies, known anti-patterns. Read by RECALL, DIAGNOSIS, DESIGN, ARCHITECTURE.
TOOL_RETRO.md
Seeded by the generator with known-working commands from the survey (dev server, test runners, build checks). LEARN appends commands that failed and their workarounds. Read by RECALL so downstream phases never retry known-broken commands.
Compounding effect
Run 1 starts with survey-seeded insights. LEARN writes new discoveries. Run 2 reads survey + Run 1 findings and skips known areas. After 5-10 runs, the knowledge stores contain deep institutional knowledge — fragile modules, API quirks, test gaps — that no single engineer carries in their head. If your codebase changes significantly (new modules, updated deps), run /refresh-insights to re-survey without re-running the full generator.
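The LEARN-write / RECALL-read cycle can be sketched as a pair of helpers. The entry format and the sample finding are illustrative, not the phases' real output:

```shell
#!/bin/sh
# Sketch of the compounding knowledge cycle: LEARN appends, RECALL reads.
KB=".workflow/KNOWLEDGE_BASE.md"
mkdir -p .workflow
: > "$KB"                                        # starts empty

learn() {   # LEARN: append one reusable finding (Rule 16: real timestamps)
  printf -- '- [%s] %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$1" >> "$KB"
}

recall() {  # RECALL: surface prior findings on a topic before discovery
  grep -i -- "$1" "$KB" || echo "no prior findings for: $1"
}

learn "auth module: token refresh races under parallel E2E runs"
recall "token refresh"
```

Append-only writes plus a read at the start of every run are what give the stores their compounding effect: nothing learned in run 1 has to be rediscovered in run 2.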
4. Signal-Based Routing
Every phase emits exactly one signal. Four types: advance, block, loopback, terminal.
| Signal | Type | Meaning |
|---|---|---|
| PHASE_COMPLETE | advance | Phase finished, proceed to next |
| BLOCKED_NEEDS_HUMAN | block | Cannot proceed, human intervention required |
| SCOPE_ESCALATION | block | Discovered scope larger than expected |
| REDIAGNOSE | loopback | Root cause wrong, retry DIAGNOSIS (bug-fix) |
| REDESIGN | loopback | Design failed, retry DESIGN (feature-enhance) |
| REARCHITECT | loopback | Architecture invalid, retry (feature-build) |
| RESLICE | loopback | Slice plan needs restructuring (feature-build) |
| DEPENDENCY_BLOCKED | block | External dependency unavailable (feature-build) |
| BUG_FIXED | terminal | Bug-fix workflow complete |
| ENHANCEMENT_SHIPPED | terminal | Feature-enhance workflow complete |
| FEATURE_SHIPPED | terminal | Feature-build workflow complete |
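The table's classification amounts to one dispatch. The signal names are the framework's; the `route_signal` helper itself is a sketch, not its real routing code:

```shell
#!/bin/sh
# Classify a signal into its routing type: advance, block, loopback, terminal.
route_signal() {
  case "$1" in
    PHASE_COMPLETE)                                          echo advance  ;;
    BLOCKED_NEEDS_HUMAN|SCOPE_ESCALATION|DEPENDENCY_BLOCKED) echo block    ;;
    REDIAGNOSE|REDESIGN|REARCHITECT|RESLICE)                 echo loopback ;;
    BUG_FIXED|ENHANCEMENT_SHIPPED|FEATURE_SHIPPED)           echo terminal ;;
    *)                                                       echo unknown  ;;
  esac
}

route_signal RESLICE          # prints: loopback
route_signal FEATURE_SHIPPED  # prints: terminal
```

Because every phase emits exactly one signal and every signal maps to exactly one type, the orchestrator's routing decision is always deterministic.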
5. Document Chain
Upstream phases produce documents that downstream phases read, creating a traceable path from problem to solution.
6. Test Architecture
Tests categorized into three groups with distinct ownership and execution timing.
New Tests (Category A)
Written in TEST_GEN for new behavior. Run in IMPLEMENT and VERIFY.
Updated Tests (Category B)
Existing tests with changed assertions. Run in IMPLEMENT and VERIFY.
Existing Tests (Category C)
Unchanged regression tests. Run in BASELINE and BLAST_RADIUS.
| Phase | Purpose | Categories |
|---|---|---|
| BASELINE | Capture health before changes | C |
| IMPLEMENT | Red-green cycle | A, B |
| VERIFY | Confirm fix/feature end-to-end | A, B |
| BLAST_RADIUS | Check for regressions | C |
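The phase-to-category mapping in the table can be expressed as a small lookup. The helper name and output format are illustrative (A = new, B = updated, C = existing):

```shell
#!/bin/sh
# Sketch: which test categories each phase runs.
tests_for_phase() {
  case "$1" in
    BASELINE|BLAST_RADIUS) echo "C"   ;;  # regression suite only
    IMPLEMENT|VERIFY)      echo "A B" ;;  # new and updated tests
    *)                     echo ""    ;;  # non-test phases run nothing
  esac
}

tests_for_phase BASELINE   # prints: C
tests_for_phase VERIFY     # prints: A B
```

Note the symmetry: the phases that bracket the change (BASELINE, BLAST_RADIUS) run only untouched tests, so any new failure there is a regression by definition.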
7. Universal Rules
Rules loaded from RULES.md at the start of every orchestrator iteration. Development workflows use Rules 1-17. Story readiness workflows add Rules 17s and 18.
- Rule 1: Verify Every Action
- Rule 2: No Vague Language
- Rule 3: Read Human Guidance First
- Rule 4: Rewrite State, Don't Edit
- Rule 5: Blocker at 3 Attempts
- Rule 6: Check Completion Before Work
- Rule 7: Stay Scoped
- Rule 8: Mandatory Phase Order
- Rule 9: Both Test Layers Always
- Rule 10: Scope Defined by Requirements
- Rule 11: Recall Before Discovery
- Rule 12: App Lifecycle for E2E
- Rule 13: Feature Flag Gate
- Rule 13b: Three Test Categories
- Rule 14: Design Context Non-Blocking
- Rule 15: Dependency Decisions Explicit
- Rule 16: Real Timestamps
- Rule 17: Ticket Gate
- Rule 17s: Read All Linked Confluence Pages (Story)
- Rule 18: PRD Traceability (Story)
8. Integrations
External tools that enrich context and enforce quality gates. All designed to degrade gracefully.
Ticket Gate
Dedicated TICKET_GATE phase after INTAKE. Workflow-specific checklists (6/6/8 required items). For features: fetches linked stories to check dependency maturity, interface contracts, and transitive blockers. Three verdicts: READY, NEEDS_REVIEW, INSUFFICIENT.
Figma
Non-blocking. If design context is unavailable, the workflow proceeds with `design_available: false`.
Feature Flags
Feature-build workflow: all code behind flags (default OFF). Feature-enhance: flags conditional on category and risk.