OrchestKit v7.43.0 — 104 skills, 36 agents, 173 hooks · Claude Code 2.1.105+

Implement

Full-power feature implementation with parallel subagents. Use when implementing, building, or creating features.

Command medium
Invoke
/ork:implement

Implement Feature

Parallel subagent execution for feature implementation with scope control and reflection.

Quick Start

/ork:implement user authentication
/ork:implement --model=opus real-time notifications
/ork:implement dashboard analytics

Argument Resolution

FEATURE_DESC = "$ARGUMENTS"  # Full argument string, e.g., "user authentication"
# $ARGUMENTS[0] is the first token, $ARGUMENTS[1] second, etc. (CC 2.1.59)

# Model override detection (CC 2.1.72)
MODEL_OVERRIDE = None
for token in "$ARGUMENTS".split():
    if token.startswith("--model="):
        MODEL_OVERRIDE = token.split("=", 1)[1]  # "opus", "sonnet", "haiku"
        FEATURE_DESC = FEATURE_DESC.replace(token, "").strip()

Pass MODEL_OVERRIDE to all Agent() calls via model=MODEL_OVERRIDE when set. Accepts symbolic names (opus, sonnet, haiku) or full IDs (claude-opus-4-6) per CC 2.1.74.
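
The resolution above can be sketched as a plain Python helper (the `--model=` flag comes from this skill; the function name is illustrative):

```python
def resolve_arguments(arguments: str) -> tuple:
    """Split the raw argument string into (FEATURE_DESC, MODEL_OVERRIDE)."""
    model_override = None
    kept = []
    for token in arguments.split():
        if token.startswith("--model="):
            # Value after the first "=": "opus", "sonnet", "haiku", or a full model ID
            model_override = token.split("=", 1)[1]
        else:
            kept.append(token)
    return " ".join(kept), model_override
```

For example, `resolve_arguments("--model=opus real-time notifications")` yields `("real-time notifications", "opus")`, while a plain `"user authentication"` passes through with no override.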


Step -1: MCP Probe + Resume Check

Run BEFORE any other step. Detect available MCP servers and check for resumable state.

# Probe MCPs (parallel — all in ONE message):
ToolSearch(query="select:mcp__memory__search_nodes")
ToolSearch(query="select:mcp__context7__resolve-library-id")

Write(".claude/chain/capabilities.json", JSON.stringify({
  "memory": <true if found>,
  "context7": <true if found>,
  "timestamp": now()
}))

# Resume check:
Read(".claude/chain/state.json")
# If exists and skill == "implement":
#   Read last handoff (e.g., 04-architecture.json)
#   Skip to current_phase
#   "Resuming from Phase {N} — architecture decided in previous session"
# If not: write initial state
Write(".claude/chain/state.json", JSON.stringify({
  "skill": "implement", "feature": FEATURE_DESC,
  "current_phase": 1, "completed_phases": [],
  "capabilities": capabilities
}))

Load: Read("${CLAUDE_PLUGIN_ROOT}/skills/chain-patterns/references/checkpoint-resume.md")
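
A minimal sketch of the resume-or-init logic, using the state shape shown above (the helper name and file handling are illustrative, standing in for the Read/Write tool calls):

```python
import json
import os

def load_or_init_state(path: str, feature: str, capabilities: dict) -> dict:
    """Resume existing chain state for this skill, or write a fresh one."""
    if os.path.exists(path):
        with open(path) as f:
            state = json.load(f)
        if state.get("skill") == "implement":
            return state  # caller skips ahead to state["current_phase"]
    state = {
        "skill": "implement", "feature": feature,
        "current_phase": 1, "completed_phases": [],
        "capabilities": capabilities,
    }
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        json.dump(state, f)
    return state
```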


Step 0: Effort-Aware Phase Scaling (CC 2.1.76)

Read the /effort setting to scale implementation depth. The effort-aware context budgeting hook detects effort level automatically — adapt the phase plan accordingly:

| Effort Level | Phases Run | Agents | Token Budget |
|---|---|---|---|
| low | 1 (Discovery) → 5 (Implement) → 10 (Reflect) | 2 max | ~50K |
| medium | 1 → 2 → 5 → 7 (Scope Creep) → 10 | 3 max | ~150K |
| high (default) | All 10 phases | 4-7 | ~400K |

Override: Explicit user selection in Step 0 (e.g., "Plan first" or "Worktree") overrides /effort downscaling. If user requests full exploration, respect that regardless of effort level.
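
The scaling table can be expressed as a small lookup (dictionary keys and the `plan_for` helper are illustrative; token budgets are the approximations from the table):

```python
# Phase plans per effort level, mirroring the table above
EFFORT_PLANS = {
    "low":    {"phases": [1, 5, 10],         "max_agents": 2, "token_budget": 50_000},
    "medium": {"phases": [1, 2, 5, 7, 10],   "max_agents": 3, "token_budget": 150_000},
    "high":   {"phases": list(range(1, 11)), "max_agents": 7, "token_budget": 400_000},
}

def plan_for(effort: str, user_requested_full: bool = False) -> dict:
    """Pick the phase plan; explicit user choice overrides /effort downscaling."""
    if user_requested_full:
        return EFFORT_PLANS["high"]
    return EFFORT_PLANS.get(effort, EFFORT_PLANS["high"])  # high is the default
```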

Step 0a: Project Context Discovery

BEFORE any work, detect the project tier. This becomes the complexity ceiling for all patterns.

Scan codebase signals and classify into tiers 1-6 (Interview through Open Source). Each tier sets an architecture ceiling and determines which phases/agents to use.

Load tier details, workflow mapping, and orchestration mode: Read("${CLAUDE_SKILL_DIR}/references/tier-classification.md")

Worktree Isolation (CC 2.1.49)

For features touching 5+ files, offer worktree isolation to prevent conflicts with the main working tree:

AskUserQuestion(questions=[{
  "question": "Isolate this feature in a git worktree?",
  "header": "Isolation",
  "options": [
    {"label": "Yes — worktree (Recommended)", "description": "Creates isolated branch via EnterWorktree, merges back on completion", "markdown": "```\nWorktree Isolation\n──────────────────\nmain ─────────────────────────────▶\n  \\                              /\n   └─ feat-{slug} (worktree) ───┘\n      ├── Isolated directory\n      ├── Own branch + index\n      └── Auto-merge on completion\n\nSafe: main stays untouched until done\n```"},
    {"label": "No — work in-place", "description": "Edit files directly in current branch", "markdown": "```\nIn-Place Editing\n────────────────\nmain ──[edit]──[edit]──[edit]───▶\n       ▲       ▲       ▲\n       │       │       │\n     direct modifications\n\nFast: no branch overhead\nRisk: changes visible immediately\n```"},
    {"label": "Plan first", "description": "Research and design in plan mode before writing code", "markdown": "```\nPlan Mode Flow\n──────────────\n  1. EnterPlanMode($ARGUMENTS)\n  2. Read existing code\n  3. Research patterns\n  4. Design approach\n  5. ExitPlanMode → plan\n  6. User approves plan\n  7. Execute implementation\n\n  Best for: Large features,\n  unfamiliar codebases,\n  architectural decisions\n```"}
  ],
  "multiSelect": false
}])

If 'Plan first' selected:

# 1. Enter read-only plan mode
EnterPlanMode("Research and design: $ARGUMENTS")

# 2. Research phase — Read/Grep/Glob ONLY, no Write/Edit
#    - Read existing code in the target area
#    - Grep for related patterns, imports, dependencies
#    - Check tests, configs, and integration points
#    - If context7 available: query library docs

# 3. Design the plan — produce:
#    - File map: which files to create/modify
#    - Architecture decisions with rationale
#    - Task breakdown with acceptance criteria
#    - Risk assessment and edge cases

# 4. Exit plan mode — returns plan to user for approval
ExitPlanMode()

# 5. User reviews plan. If approved → continue to Phase 1 (Discovery)
#    with the plan as input. If rejected → revise or stop.

If worktree selected:

  1. Call EnterWorktree(name: "feat-{slug}") to create isolated branch
  2. All agents work in the worktree directory
  3. On completion, merge back: git checkout {original-branch} && git merge feat-{slug}
  4. If merge conflicts arise, present diff to user via AskUserQuestion

Load worktree details: Read("${CLAUDE_SKILL_DIR}/references/worktree-isolation-mode.md")


Task Management (MANDATORY)

BEFORE doing ANYTHING else, create tasks to track progress:

# 1. Create main task IMMEDIATELY
TaskCreate(
  subject="Implement: {feature}",
  description="Feature implementation with parallel subagents",
  activeForm="Implementing {feature}"
)

# 2. Create subtasks for each phase
TaskCreate(subject="Research best practices and docs", activeForm="Researching best practices")  # id=2
TaskCreate(subject="Micro-plan: scope, files, criteria", activeForm="Micro-planning")            # id=3
TaskCreate(subject="Architecture design (parallel agents)", activeForm="Designing architecture") # id=4
TaskCreate(subject="Implement and write tests", activeForm="Implementing code")                  # id=5
TaskCreate(subject="Integration verification", activeForm="Verifying integration")               # id=6
TaskCreate(subject="Scope creep check", activeForm="Checking scope creep")                       # id=7
TaskCreate(subject="E2E verification", activeForm="Running E2E verification")                    # id=8
TaskCreate(subject="Document and reflect", activeForm="Documenting decisions")                   # id=9

# 3. Set dependencies for sequential phases
TaskUpdate(taskId="3", addBlockedBy=["2"])  # Plan needs research
TaskUpdate(taskId="4", addBlockedBy=["3"])  # Architecture needs plan
TaskUpdate(taskId="5", addBlockedBy=["4"])  # Implementation needs architecture
TaskUpdate(taskId="6", addBlockedBy=["5"])  # Integration needs implementation
TaskUpdate(taskId="7", addBlockedBy=["6"])  # Scope creep needs integration
TaskUpdate(taskId="8", addBlockedBy=["7"])  # E2E needs scope check
TaskUpdate(taskId="9", addBlockedBy=["8"])  # Docs need E2E

# 4. Before starting each task, verify it's unblocked
task = TaskGet(taskId="2")  # Verify blockedBy is empty

# 5. Update status as you progress
TaskUpdate(taskId="2", status="in_progress")  # When starting
TaskUpdate(taskId="2", status="completed")    # When done — repeat for each subtask
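
The unblocked check in step 4 amounts to verifying that every dependency is completed. A sketch with plain dicts standing in for TaskGet results (structure illustrative):

```python
def unblocked(task_id: str, tasks: dict) -> bool:
    """A task may start only once every dependency in blocked_by is completed."""
    return all(tasks[dep]["status"] == "completed"
               for dep in tasks[task_id]["blocked_by"])

# Dependency chain mirroring tasks 2 -> 3 -> 4 above
TASKS = {
    "2": {"status": "completed", "blocked_by": []},
    "3": {"status": "pending",   "blocked_by": ["2"]},
    "4": {"status": "pending",   "blocked_by": ["3"]},
}
```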

Workflow (10 Phases)

| Phase | Activities | Agents |
|---|---|---|
| 1. Discovery | Research best practices, Context7 docs, break into tasks | |
| 2. Micro-Planning | Detailed plan per task (load ${CLAUDE_SKILL_DIR}/references/micro-planning-guide.md) | |
| 3. Worktree | Isolate in git worktree for 5+ file features (load ${CLAUDE_SKILL_DIR}/references/worktree-workflow.md) | |
| 4. Architecture | 4 parallel background agents | workflow-architect, backend-system-architect, frontend-ui-developer, llm-integrator |
| 5. Implementation + Tests | Parallel agents, single-pass artifacts with mandatory tests | backend-system-architect, frontend-ui-developer, llm-integrator, test-generator |
| 6. Integration Verification | Code review + real-service integration tests | backend, frontend, code-quality-reviewer, security-auditor |
| 7. Scope Creep | Compare planned vs actual (load ${CLAUDE_SKILL_DIR}/references/scope-creep-detection.md) | workflow-architect |
| 8. E2E Verification | Browser + API E2E testing (load ${CLAUDE_SKILL_DIR}/references/e2e-verification.md) | |
| 9. Documentation | Save decisions to memory graph | |
| 10. Reflection | Lessons learned, estimation accuracy | workflow-architect |

Load agent prompts: Read("${CLAUDE_SKILL_DIR}/references/agent-phases.md")

For Agent Teams mode: Read("${CLAUDE_SKILL_DIR}/references/agent-teams-phases.md")

Phase Handoffs (CC 2.1.71)

Write handoff JSON after major phases. See chain-patterns skill for schema.

| After Phase | Handoff File | Key Outputs |
|---|---|---|
| 1. Discovery | 01-discovery.json | Best practices, library docs, task breakdown |
| 2. Micro-Plan | 02-plan.json | File map, acceptance criteria per task |
| 4. Architecture | 04-architecture.json | Decisions, patterns chosen, agent results |
| 5. Implementation | 05-implementation.json | Files created/modified, test results |
| 7. Scope Creep | 07-scope.json | Planned vs actual, PR split recommendation |

Progressive Output (CC 2.1.76+)

Output results incrementally after each phase — don't batch everything until the end.

Focus mode (CC 2.1.101): In focus mode (Ctrl+O), the user only sees your final message. Include a self-contained summary with all key results — don't assume they saw incremental outputs.

| After Phase | Show User |
|---|---|
| 1. Discovery | Key findings, library recommendations, task breakdown |
| 4. Architecture | Each agent's design decisions as they return |
| 5. Implementation | Files created/modified per agent, test results |
| 7. Scope Creep | Planned vs actual delta, PR split recommendation |

When agents run with run_in_background=true, output each agent's findings as soon as it returns — don't wait for all agents to finish. This gives users ~60% faster perceived feedback and enables early intervention if an agent's approach diverges from the plan.
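
The "report as each agent returns" pattern is the classic completion-order stream. A sketch using Python's standard library, with `run_agent` standing in for a background agent (names and delays are illustrative):

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

def run_agent(name: str, delay: float) -> str:
    time.sleep(delay)  # stand-in for a background agent's run time
    return f"{name}: done"

def stream_results(agents: dict) -> list:
    """Surface each agent's result as soon as it returns, not after all finish."""
    reported = []
    with ThreadPoolExecutor() as pool:
        futures = {pool.submit(run_agent, name, delay): name
                   for name, delay in agents.items()}
        for future in as_completed(futures):
            reported.append(future.result())  # show to the user immediately
    return reported
```

Because `as_completed` yields futures in finish order, a fast frontend agent is reported before a slow backend one, which is exactly the early-intervention window the paragraph describes.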

Monitor Tool for Background Streaming (CC 2.1.98)

Use Monitor to stream real-time events from background build/test scripts instead of polling output files:

# Start a long-running build in background
Bash(command="npm run build 2>&1", run_in_background=true)

# Stream its output line-by-line as notifications
Monitor(pid=build_task_id)
# Each stdout line arrives as a notification — no polling needed

# For background agents with test suites:
Agent(subagent_type="test-generator", run_in_background=true, ...)
# Monitor agent progress via task notifications (CC 2.1.98 partial progress)

Partial results (CC 2.1.98): Background agents that fail now report partial progress to the parent. If a worktree-isolated agent crashes mid-implementation, synthesize its partial output instead of re-spawning:

# After collecting agent results:
for agent_result in agent_results:
    if "[PARTIAL RESULT]" in agent_result.output:
        # Agent crashed mid-work — salvage what it produced
        partial_files = Bash(command="git diff --name-only", cwd=agent_result.worktree)
        if partial_files:
            # Merge partial work — commit what's usable, flag incomplete items
            TaskUpdate(taskId=agent_task_id, status="completed",
                       description=f"Partial: {len(partial_files)} files from crashed agent")
        # Do NOT re-spawn — partial progress > wasted tokens re-doing work
    elif agent_result.status == "BLOCKED":
        # Agent hit a genuine blocker — escalate to user
        TaskUpdate(taskId=agent_task_id, status="in_progress",
                   description=f"BLOCKED: {agent_result.concerns[0]}")

Worktree-Isolated Implementation (CC 2.1.50)

Phase 5 agents SHOULD use isolation: "worktree" to prevent file conflicts:

Agent(subagent_type="backend-system-architect",
  prompt="Implement backend: {feature}. Architecture: {from 04-architecture.json}",
  isolation="worktree", run_in_background=true)
Agent(subagent_type="frontend-ui-developer",
  prompt="Implement frontend: {feature}...",
  isolation="worktree", run_in_background=true)
Agent(subagent_type="test-generator",
  prompt="Generate tests: {feature}...",
  isolation="worktree", run_in_background=true)

Post-Deploy Monitoring (CC 2.1.71)

After final PR, schedule health monitoring:

# Guard: Skip cron in headless/CI (CLAUDE_CODE_DISABLE_CRON)
# if env CLAUDE_CODE_DISABLE_CRON is set, run a single check instead
CronCreate(
  schedule="0 */6 * * *",
  prompt="Health check for {feature} in PR #{pr}:
    gh pr checks {pr} --repo {repo}.
    If healthy 24h → CronDelete. If errors → alert."
)
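
The headless guard above can be sketched as a plain function; the env var name comes from this doc, while `create_cron` and `run_once` are illustrative callbacks standing in for CronCreate and a one-shot check:

```python
import os

def schedule_health_check(create_cron, run_once) -> str:
    """Schedule recurring checks unless cron is disabled (headless/CI)."""
    if os.environ.get("CLAUDE_CODE_DISABLE_CRON"):
        run_once()                 # single immediate check instead of a schedule
        return "once"
    create_cron("0 */6 * * *")     # every 6 hours, as in the example above
    return "cron"
```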

context7 with Detection

if capabilities.context7:
  mcp__context7__resolve-library-id({ libraryName: "next-auth" })
  mcp__context7__query-docs({ libraryId: "...", query: "..." })
else:
  WebFetch("https://docs.example.com/api")  # T1 fallback

Issue Tracking

If working on a GitHub issue, run the Start Work ceremony from issue-progress-tracking and post progress comments after major phases.

Feedback Loop

Maintain checkpoints after each task. Load triggers: Read("${CLAUDE_SKILL_DIR}/references/feedback-loop.md")


Test Requirements Matrix

Phase 5 test-generator MUST produce tests matching the change type. Each change type maps to specific required tests and testing rules.

Load test matrix, real-service detection, and phase 9 gate: Read("${CLAUDE_SKILL_DIR}/references/test-requirements-matrix.md")


Key Principles

  • Verification gate — before claiming ANY task done, apply the 5-step gate: Read("${CLAUDE_PLUGIN_ROOT}/skills/shared/rules/verification-gate.md"). "Should work now" is not evidence.
  • Agent status protocol — all subagents report DONE / DONE_WITH_CONCERNS / BLOCKED / NEEDS_CONTEXT per Read("${CLAUDE_PLUGIN_ROOT}/agents/shared/status-protocol.md")
  • Tests are NOT optional — each task includes its tests, matched to change type (see matrix above)
  • Parallel when independent — use run_in_background: true, launch all agents in ONE message
  • Output limits (CC 2.1.77+): Opus 4.6 defaults to 64k output tokens (128k upper bound). Generate complete artifacts in a single pass when possible; chunk across turns if output exceeds the limit
  • Micro-plan before implementing — scope boundaries, file list, acceptance criteria
  • Detect scope creep (phase 7) — score 0-10, split PR if significant
  • Real services when available — if docker-compose/testcontainers exist, use them in Phase 6
  • Reflect and capture lessons (phase 10) — persist to memory graph
  • Clean up agents — use TeamDelete() after completion; press Ctrl+F twice as manual fallback. Note: /clear (CC 2.1.72+) preserves background agents
  • Exit worktrees — call ExitWorktree(action: "keep") in Phase 10 if worktree was entered in Step 0; never leave orphaned worktrees

Next Steps (suggest to user after implementation)

/ork:verify {FEATURE}              # Grade the implementation
/ork:cover {FEATURE}               # Generate test suite
/ork:commit                        # Commit changes
/loop 10m npm test                 # Watch tests while iterating
/loop 30m /ork:verify {FEATURE}    # Periodic quality gate

Agent Coordination

Context Passing

All spawned agents receive: changed files list, project tier, architectural constraints, and decisions from prior phases (discovery, plan). Pass via the agent prompt, not just "implement X".

SendMessage (Active Coordination)

When backend and frontend agents need to align on API contracts:

SendMessage(to="frontend-ui-developer", message="API endpoint is POST /api/auth with {token, refreshToken} response shape")
SendMessage(to="test-generator", message="Backend uses JWT — mock auth middleware in test fixtures")

Skill Chain

After implementation completes, chain to verification:

TaskCreate(subject="Verify implementation", activeForm="Verifying changes", addBlockedBy=[impl_task_id])
# Then: /ork:verify {feature}
Related skills:

  • ork:explore: Explore codebase before implementing
  • ork:verify: Verify implementations work correctly
  • ork:issue-progress-tracking: Auto-updates GitHub issues with commit progress

References

Load on demand with Read("${CLAUDE_SKILL_DIR}/references/<file>"):

| File | Content |
|---|---|
| agent-phases.md | Agent prompts and spawn templates |
| agent-teams-phases.md | Agent Teams mode phases |
| interview-mode.md | Interview/take-home constraints |
| orchestration-modes.md | Task tool vs Agent Teams selection |
| feedback-loop.md | Checkpoint triggers and actions |
| cc-enhancements.md | CC version-specific features |
| agent-teams-full-stack.md | Full-stack pipeline for teams |
| team-worktree-setup.md | Team worktree configuration |
| micro-planning-guide.md | Detailed micro-planning guide |
| scope-creep-detection.md | Planned vs actual comparison |
| worktree-workflow.md | Git worktree workflow |
| e2e-verification.md | Browser + API E2E testing guide |
| worktree-isolation-mode.md | Worktree isolation details |
| tier-classification.md | Tier classification, workflow mapping, orchestration mode |
| test-requirements-matrix.md | Test matrix by change type, real-service detection, phase 9 gate |

Rules (5)

Subagents must only modify files within their assigned scope — prevent cross-agent conflicts — HIGH

Agent Scope Containment

Each subagent spawned in Phase 4-5 must receive an explicit file scope boundary in its prompt. Agents must NOT modify files outside their assigned scope. This prevents parallel agents from overwriting each other's work.

Problem

Claude spawns parallel agents (backend, frontend, test-generator) without defining which files each agent owns. Two agents edit the same file — the last one to finish silently overwrites the first. This is especially dangerous without worktree isolation.

Scope Assignment

Define non-overlapping scopes in Phase 2 (Micro-Planning):

scopes = {
    "backend-system-architect": {
        "owns": ["src/api/", "src/services/", "src/models/"],
        "reads": ["src/types/", "src/config/"],
        "forbidden": ["src/components/", "src/pages/", "tests/"]
    },
    "frontend-ui-developer": {
        "owns": ["src/components/", "src/pages/", "src/hooks/"],
        "reads": ["src/types/", "src/api/client.ts"],
        "forbidden": ["src/api/routes/", "src/services/", "src/models/"]
    },
    "test-generator": {
        "owns": ["tests/", "__tests__/", "*.test.*", "*.spec.*"],
        "reads": ["src/"],
        "forbidden": []  # Can read everything, writes only to test dirs
    }
}
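
Enforcing these boundaries is a path check against the agent's `owns` and `forbidden` lists. A sketch (the `scope_allows_write` helper is illustrative, with directory entries matched as prefixes and `*` entries as globs; the SCOPES subset mirrors the dict above):

```python
from fnmatch import fnmatch

def scope_allows_write(agent: str, path: str, scopes: dict) -> bool:
    """True only if path falls under an 'owns' entry and no 'forbidden' entry."""
    def matches(entry: str) -> bool:
        return fnmatch(path, entry) if "*" in entry else path.startswith(entry)
    scope = scopes[agent]
    if any(matches(entry) for entry in scope["forbidden"]):
        return False
    return any(matches(entry) for entry in scope["owns"])

SCOPES = {
    "backend-system-architect": {
        "owns": ["src/api/", "src/services/", "src/models/"],
        "forbidden": ["src/components/", "src/pages/", "tests/"],
    },
    "test-generator": {
        "owns": ["tests/", "*.test.*"],
        "forbidden": [],
    },
}
```

Note that read-only areas like src/types/ are neither owned nor forbidden, so writes there are rejected by default.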

Incorrect — no scope boundaries in agent prompts:

# Phase 5: Agents with overlapping scope
Agent(subagent_type="backend-system-architect",
  prompt="Implement the user auth feature",
  run_in_background=True)
Agent(subagent_type="frontend-ui-developer",
  prompt="Implement the user auth feature",
  run_in_background=True)
# Both agents edit src/types/user.ts — last write wins, first is lost
# Both create src/utils/validation.ts — silent overwrite

Correct — explicit scope in every agent prompt:

# Phase 5: Agents with strict scope boundaries
Agent(subagent_type="backend-system-architect",
  prompt="""Implement backend for user auth.
  YOUR SCOPE (only modify these): src/api/, src/services/, src/models/
  SHARED TYPES (read-only): src/types/
  DO NOT TOUCH: src/components/, src/pages/, tests/
  If you need a shared type, create it in src/types/auth.types.ts""",
  run_in_background=True)

Agent(subagent_type="frontend-ui-developer",
  prompt="""Implement frontend for user auth.
  YOUR SCOPE (only modify these): src/components/, src/pages/, src/hooks/
  SHARED TYPES (read-only): src/types/
  DO NOT TOUCH: src/api/routes/, src/services/, src/models/
  Import API client from src/api/client.ts — do not modify it""",
  run_in_background=True)

Shared File Protocol

For files both agents need (e.g., shared types):

  1. Assign ONE agent as the owner of each shared file
  2. Other agents may only read it
  3. If both need to add types, use separate files: auth.types.ts (backend), auth-ui.types.ts (frontend)

Key Rules

  • Include explicit scope boundaries in every subagent prompt
  • Log scope assignments in 02-plan.json handoff for traceability
  • After all agents complete, check for conflicting writes to the same file
  • Prefer worktree isolation (isolation: "worktree") to eliminate conflicts entirely
  • If an agent violates scope, discard its out-of-scope changes and re-run
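
The post-completion conflict check from the rules above reduces to inverting the per-agent write lists. A sketch (function name and input shape are illustrative):

```python
from collections import defaultdict

def conflicting_writes(writes_by_agent: dict) -> dict:
    """Map each file written by more than one agent to the agents that touched it."""
    writers = defaultdict(list)
    for agent, files in writes_by_agent.items():
        for path in files:
            writers[path].append(agent)
    return {path: agents for path, agents in writers.items() if len(agents) > 1}
```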

Commit after each logical milestone — never batch all commits to session end — HIGH

Commit After Milestone

Commit working code after each logical unit of work completes. A "logical unit" is a phase that produces working, buildable output. Never accumulate changes across 3+ phases without committing.

Problem

Claude batches all commits to the end of a session. If the session dies mid-implementation (rate limit, timeout, network), all work is lost. The implement skill runs 10 phases — losing phases 1-7 because the commit was planned for phase 10 is catastrophic.

Commit Points

| After Phase | Commit? | Why |
|---|---|---|
| 2. Micro-Plan | Yes | Plan files are valuable context for resume |
| 4. Architecture | Yes | Architecture decisions should survive crashes |
| 5. Implementation | Yes | The bulk of new code — highest risk of loss |
| 6. Integration Verified | Yes | Tests pass, safe checkpoint |
| 8. E2E Verified | Yes | Full verification complete |
| 10. Reflection | Yes | Final commit with docs and lessons |

Incorrect — one commit after all phases:

# Phase 1-2: Discovery + Planning (no commit)
# Phase 4: Architecture decided (no commit)
# Phase 5: 15 files implemented (no commit)
# Phase 6: Integration tests pass (no commit)
# Phase 8: E2E tests pass (no commit)
# --- rate limit hits here ---
# All work lost. No commits exist.

Correct — commit at each milestone:

# After Phase 2:
git add .claude/chain/02-plan.json src/docs/plan.md
git commit -m "plan: user auth micro-plan and task breakdown"

# After Phase 5:
git add src/auth/ tests/auth/
git commit -m "feat: implement user auth endpoints and tests"

# After Phase 6:
git add tests/integration/
git commit -m "test: integration verification for user auth"

# After Phase 8:
git add tests/e2e/
git commit -m "test: e2e verification for user auth flow"

# After Phase 10:
git add docs/ .claude/chain/
git commit -m "docs: user auth reflection and lessons learned"

Commit Message Format

<type>: <what was completed>

Phase: <N> (<phase-name>)
Tier: <detected-tier>

Co-Authored-By: Claude <noreply@anthropic.com>
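
A sketch of filling that template in code (the helper name and the example tier value are illustrative):

```python
def milestone_commit_message(ctype: str, summary: str, phase: int,
                             phase_name: str, tier: str) -> str:
    """Fill in the commit message template above."""
    return (f"{ctype}: {summary}\n\n"
            f"Phase: {phase} ({phase_name})\n"
            f"Tier: {tier}\n\n"
            "Co-Authored-By: Claude <noreply@anthropic.com>")
```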

Key Rules

  • Never go more than 2 phases without a commit
  • Commit even if tests are not yet written — partial progress beats total loss
  • Use specific git add <files> instead of git add -A to avoid committing artifacts
  • If a phase fails, commit the passing phases first, then address the failure
  • Include phase number in commit messages for traceability during resume
  • Handoff JSON files (.claude/chain/*.json) should be committed alongside code

Block completion if new code has zero test coverage — tests are mandatory for every implementation — HIGH

Test Coverage Requirement

Every task in Phase 5 must produce both implementation code AND matching tests. The test-generator agent is not optional — it runs in parallel with implementation agents and its output is required before Phase 6.

Problem

Claude often treats tests as a "nice to have" and moves to integration/documentation phases without verifying that test-generator actually produced output. The Phase 9 gate catches this too late.

Per-Task Enforcement

Each Phase 5 task must include a test verification step:

# After each implementation agent completes:
task_output = TaskOutput(task_id)

# Check: does the output include test files?
Grep(pattern="describe\\(|it\\(|test\\(|def test_", glob="**/*.test.*")
Grep(pattern="def test_|class Test", glob="**/test_*.py")

# If 0 matches for new code paths → BLOCK Phase 6 entry

Incorrect — proceed without tests:

# Phase 5: Implementation
Agent(subagent_type="backend-system-architect",
  prompt="Implement user auth endpoints", run_in_background=True)
Agent(subagent_type="frontend-ui-developer",
  prompt="Implement login form", run_in_background=True)
# No test-generator agent spawned
# Phase 6: "Let's verify integration..." ← no tests exist to run

Correct — tests are parallel and mandatory:

# Phase 5: Implementation + Tests (parallel)
Agent(subagent_type="backend-system-architect",
  prompt="Implement user auth endpoints", run_in_background=True)
Agent(subagent_type="frontend-ui-developer",
  prompt="Implement login form", run_in_background=True)
Agent(subagent_type="test-generator",
  prompt="Generate tests for user auth: unit tests for endpoints,
  integration tests for auth flow, component tests for login form.
  Change types: API endpoint + UI component (see Test Requirements Matrix)",
  run_in_background=True)

# GATE: Verify before Phase 6
test_file_count = 0
for agent in [backend, frontend, test_gen]:
    output = TaskOutput(agent.task_id)
    test_file_count += count_test_files(output)  # hypothetical helper: count *.test.* / test_*.py paths in output

if test_file_count == 0:
    # DO NOT proceed — return to Phase 5
    Agent(subagent_type="test-generator",
      prompt="BLOCKED: No tests found. Generate tests for: {files_created}")

Change Type to Test Mapping

Always reference the Test Requirements Matrix from SKILL.md when spawning test-generator:

| Change | Minimum Tests |
|---|---|
| API endpoint | 1 unit + 1 integration |
| DB migration | 1 migration test |
| UI component | 1 unit + 1 snapshot |
| Business logic | 2 unit tests |
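
The mapping can be checked mechanically before advancing. A sketch (the change-type keys and test-kind names are illustrative encodings of the matrix):

```python
# Minimum tests per change type, mirroring the matrix above
MIN_TESTS = {
    "api_endpoint":   {"unit": 1, "integration": 1},
    "db_migration":   {"migration": 1},
    "ui_component":   {"unit": 1, "snapshot": 1},
    "business_logic": {"unit": 2},
}

def missing_tests(change_type: str, produced: dict) -> dict:
    """Return the test kinds (and counts) still owed before Phase 6."""
    required = MIN_TESTS[change_type]
    return {kind: count - produced.get(kind, 0)
            for kind, count in required.items()
            if produced.get(kind, 0) < count}
```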

Key Rules

  • Spawn test-generator in the same message as implementation agents
  • Verify test output before advancing to Phase 6
  • If test-generator fails, re-run it — do not skip
  • Include test file paths in the 05-implementation.json handoff

Match implementation tier to assessed complexity — never over-engineer a simple task — HIGH

Tier Validation

The tier detected in Step 0 sets a hard ceiling on architecture complexity. Every phase must respect this ceiling — especially Phase 4 (Architecture) and Phase 5 (Implementation).

Tier Ceilings

| Tier | Max Patterns | Max Files | Forbidden |
|---|---|---|---|
| 1. Interview | Flat files, simple routes | 8-15 | DI containers, message queues, microservices |
| 2. Hackathon | Single file if possible | 5-10 | Abstract factories, hexagonal layers |
| 3. MVP | MVC monolith | 20-40 | CQRS, event sourcing, k8s manifests |
| 4-5. Growth/Enterprise | Full patterns allowed | No limit | None |

Problem

Claude defaults to enterprise-grade patterns regardless of project size. A take-home interview gets hexagonal architecture with ports/adapters when a flat Express app with 3 routes would score higher.

Incorrect — Tier 1 interview with enterprise architecture:

# Detected: Tier 1 (Interview, README says "take-home, 4-hour limit")
# Phase 4 agent prompt:
Agent(subagent_type="backend-system-architect",
  prompt="Design hexagonal architecture with DI container, repository pattern,
  CQRS for read/write separation, and event sourcing for the todo API")
# Result: 35 files, 4 abstraction layers for a CRUD app

Correct — Tier 1 interview with appropriate simplicity:

# Detected: Tier 1 (Interview, README says "take-home, 4-hour limit")
# Phase 4 agent prompt:
Agent(subagent_type="backend-system-architect",
  prompt="Design a simple flat-file Express app for the todo API.
  Tier: 1 (Interview). Max 10 files. No DI, no abstractions beyond MVC.
  Focus: working code, clear tests, clean README.")
# Result: 8 files, runs out of the box, easy to review

Validation check — add to Phase 4 handoff:

# In 04-architecture.json handoff:
{
  "tier": 1,
  "patterns_used": ["flat-routes", "single-db-file"],
  "tier_ceiling_respected": true,
  "justification": "Interview project — simplicity scores higher than abstraction"
}
# If patterns_used includes anything above the tier ceiling, STOP and simplify
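
That final check can be sketched as a set intersection against the ceiling table (the pattern name slugs are illustrative encodings of the Forbidden column):

```python
# Forbidden pattern names per tier, from the ceiling table
TIER_FORBIDDEN = {
    1: {"di-container", "message-queue", "microservices"},
    2: {"abstract-factory", "hexagonal"},
    3: {"cqrs", "event-sourcing", "k8s"},
    4: set(),
    5: set(),
}

def tier_ceiling_respected(tier: int, patterns_used: list) -> bool:
    """False if any chosen pattern exceeds what the detected tier allows."""
    return not (set(patterns_used) & TIER_FORBIDDEN[tier])
```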

Key Rules

  • Always pass the detected tier to every subagent prompt explicitly
  • If a subagent produces output exceeding the tier ceiling, reject and re-run with stricter constraints
  • When in doubt, under-engineer — simpler code is easier to review and extend
  • Tier upgrades require explicit user confirmation via AskUserQuestion

Always ExitWorktree after implementation — never leave orphaned worktrees — HIGH

Worktree Cleanup

When the implement skill uses EnterWorktree for isolation, it MUST call ExitWorktree before completing — regardless of success or failure. Orphaned worktrees block future git operations and leak disk space.

Problem

Claude enters a worktree for isolation but forgets to exit when the implementation finishes, errors out, or gets interrupted. The worktree directory and its lock file persist, causing git worktree add failures in the next session.

Incorrect — enter worktree, finish, but never exit:

# Phase 3: Enter worktree
EnterWorktree(name="feat-user-auth")

# Phase 5: Implementation agents run in worktree
Agent(subagent_type="backend-system-architect", prompt="Implement auth...")

# Phase 9: Documentation
# "Done! Here's what was implemented..."
# ← MISSING: ExitWorktree never called
# Next session: "fatal: 'feat-user-auth' is already checked out"

Correct — exit worktree in all code paths:

# Phase 3: Enter worktree
EnterWorktree(name="feat-user-auth")

# Phase 5-8: Implementation, verification, testing...
# (all work happens in worktree)

# Phase 9: Before documentation, merge and exit
Bash("git add -A && git commit -m 'feat: user auth implementation'")
ExitWorktree()  # Merges back to original branch and removes worktree

# Phase 10: Reflection (back in main working tree)

Correct — exit worktree even on failure:

# Phase 3: Enter worktree
EnterWorktree(name="feat-user-auth")
worktree_active = True

# Phase 5: Implementation fails
try:
    Agent(subagent_type="backend-system-architect", prompt="...")
except Exception:
    # STILL clean up the worktree
    if worktree_active:
        Bash("git stash")  # Save partial work
        ExitWorktree()
        worktree_active = False
    raise

# Normal exit path
if worktree_active:
    ExitWorktree()

Verification

After ExitWorktree, confirm cleanup:

# Should NOT list the feature worktree
git worktree list
# Expected: only the main worktree
# /path/to/repo  abc1234 [main]

Key Rules

  • Every EnterWorktree must have a matching ExitWorktree
  • Call ExitWorktree BEFORE Phase 10 (Reflection) so reflection runs in the main tree
  • On error or early exit, still call ExitWorktree to prevent orphaning
  • If resuming a session that has an active worktree, check git worktree list first
  • Never assume the worktree will be cleaned up by a future session
  • Include worktree status in handoff JSON: "worktree_active": false after cleanup

References (16)

Agent Phases

Agent Phases Reference

128K Output Token Strategy

With Opus 4.6's 128K output tokens, each agent produces complete artifacts in a single pass. This reduces implementation from 17 agents across 4 phases to 14 agents across 3 phases.

| Metric | Before (64K) | After (128K) | Agent Teams Mode |
|---|---|---|---|
| Phase 4 agents | 5 | 5 (unchanged) | 4 teammates + lead |
| Phase 5 agents | 8 | 5 | Same 4 teammates (persist) |
| Phase 6 agents | 4 | 4 (unchanged) | 1 (code-reviewer verdict) + lead tests |
| Total agents | 17 | 14 | 4 teammates (reused across phases) |
| Full API + models | 2 passes | 1 pass | 1 pass (same) |
| Component + tests | 2 passes | 1 pass | 1 pass (same) |
| Complete feature | 4-6 passes | 2-3 passes | 1-2 passes (overlapping) |
| Communication | Lead relays | Lead relays | Peer-to-peer messaging |
| Token cost | Baseline | ~Same | ~2.5x (full sessions) |

Key principle: Prefer one comprehensive response over multiple incremental ones. Only split when scope genuinely exceeds 128K tokens.

Agent Teams advantage: Teammates persist across phases 4→5→6, so context is preserved. No re-explaining architecture to implementation agents — they already know it because they designed it.


Phase 4: Architecture Design (5 Agents)

All 5 agents launch in ONE message with run_in_background=true.

Agent 1: Workflow Architect

Agent(
  subagent_type="workflow-architect",
  model=MODEL_OVERRIDE,  # None inherits default; "opus" for large features (CC 2.1.72)
  prompt="""# Cache-optimized: stable content first (CC 2.1.73)
  ARCHITECTURE PLANNING — SINGLE-PASS OUTPUT

  Produce a COMPLETE implementation roadmap in one response:

  1. COMPONENT BREAKDOWN
     - Frontend components needed (with file paths)
     - Backend services/endpoints (with route paths)
     - Database schema changes (with table/column names)
     - AI/ML integrations (if any)

  2. DEPENDENCY GRAPH
     - What must be built first?
     - What can be parallelized?
     - Integration points between frontend/backend

  3. RISK ASSESSMENT
     - Technical challenges with mitigations
     - Performance concerns with benchmarks
     - Security considerations with OWASP mapping

  4. TASK BREAKDOWN
     - Concrete tasks for each agent
     - Estimated tool calls per task
     - Acceptance criteria per task

  Output: Complete implementation roadmap with task dependencies.
  Use full 128K output capacity — don't truncate or summarize.

  Feature: $ARGUMENTS""",
  run_in_background=true
)

Agent 2: Backend Architect

Agent(
  subagent_type="backend-system-architect",
  model=MODEL_OVERRIDE,
  prompt="""# Cache-optimized: stable content first (CC 2.1.73)
  COMPLETE BACKEND ARCHITECTURE — SINGLE PASS

  Standards: FastAPI, Pydantic v2, async/await, SQLAlchemy 2.0

  Produce ALL of the following in one response:
  1. API endpoint design (routes, methods, status codes, rate limits)
  2. Pydantic v2 request/response schemas with Field constraints
  3. SQLAlchemy 2.0 async model definitions with relationships
  4. Service layer patterns (repository + unit of work)
  5. Error handling (RFC 9457 Problem Details)
  6. Database migration strategy (tables, indexes, constraints)
  7. Testing strategy (unit + integration test outline)

  Include file paths for every artifact.
  Output: Complete backend implementation spec ready for coding.

  Feature: $ARGUMENTS""",
  run_in_background=true
)

Agent 3: Frontend Developer

Agent(
  subagent_type="frontend-ui-developer",
  model=MODEL_OVERRIDE,
  prompt="""# Cache-optimized: stable content first (CC 2.1.73)
  COMPLETE FRONTEND ARCHITECTURE — SINGLE PASS

  Standards: React 19, TypeScript strict, Zod, TanStack Query

  Produce ALL of the following in one response:
  1. Component hierarchy with file paths
  2. Zod schemas for ALL API responses
  3. State management approach (Zustand slices or React 19 hooks)
  4. TanStack Query configuration (keys, stale time, prefetching)
  5. Form handling with React Hook Form + Zod
  6. Loading states (skeleton components, not spinners)
  7. Error boundaries and fallback UI
  8. Accessibility requirements (WCAG 2.1 AA)

  Include Tailwind class specifications for key components.
  Output: Complete frontend implementation spec ready for coding.

  Feature: $ARGUMENTS""",
  run_in_background=true
)

Agent 4: LLM Integrator

Agent(
  subagent_type="llm-integrator",
  model=MODEL_OVERRIDE,
  prompt="""# Cache-optimized: stable content first (CC 2.1.73)
  AI/ML INTEGRATION ANALYSIS — SINGLE PASS

  Evaluate and design AI integration in one response:
  1. Does this feature need LLM? (justify yes/no)
  2. Provider selection (Anthropic/OpenAI/Ollama) with rationale
  3. Prompt template design (versioned, with Langfuse tracking)
  4. Function calling / tool definitions (if needed)
  5. Streaming strategy (SSE endpoint design)
  6. Caching strategy (prompt caching + semantic caching)
  7. Cost estimation (tokens per request, monthly projection)
  8. Fallback chain configuration

  Output: Complete AI integration spec or "No AI needed" with justification.

  Feature: $ARGUMENTS""",
  run_in_background=true
)

Phase 4 — Teams Mode

In Agent Teams mode, 4 teammates form a team (implement-{feature-slug}) instead of independent Task spawns. The workflow-architect role is handled by the lead or omitted for simpler features. Teammates message architecture decisions to each other in real-time.

See Agent Teams Full-Stack Pipeline for spawn prompts.


Phase 5: Implementation (5 Agents)

128K consolidation: Backend is 1 agent (was 2), frontend is 1 agent (was 3 incl. styling). Each produces complete working code in a single pass.

All 5 agents launch in ONE message with run_in_background=true.

Agent 1: Backend — Complete Implementation

Agent(
  subagent_type="backend-system-architect",
  prompt="""# Cache-optimized: stable content first (CC 2.1.73)
  IMPLEMENT COMPLETE BACKEND — SINGLE PASS (128K output)

  Generate ALL backend code in ONE response:

  1. API ROUTES (backend/app/api/v1/routes/)
     - All endpoints with full implementation
     - Dependency injection
     - Rate limiting decorators

  2. SCHEMAS (backend/app/schemas/)
     - Pydantic v2 request/response models
     - Field constraints and validators

  3. MODELS (backend/app/db/models/)
     - SQLAlchemy 2.0 async models
     - Relationships, constraints, indexes

  4. SERVICES (backend/app/services/)
     - Business logic with repository pattern
     - Error handling (RFC 9457)

  5. TESTS (backend/tests/)
     - Unit tests for services
     - Integration tests for endpoints
     - Fixtures and factories

  Write REAL code to disk using Write/Edit tools.
  Every file must be complete and runnable.
  Do NOT split across responses — use full 128K output.

  Feature: $ARGUMENTS
  Architecture: [paste Phase 4 backend spec]""",
  run_in_background=true
)

Agent 2: Frontend — Complete Implementation

Agent(
  subagent_type="frontend-ui-developer",
  prompt="""# Cache-optimized: stable content first (CC 2.1.73)
  IMPLEMENT COMPLETE FRONTEND — SINGLE PASS (128K output)

  Generate ALL frontend code in ONE response:

  1. COMPONENTS (frontend/src/features/[feature]/components/)
     - React 19 components with TypeScript strict
     - useOptimistic for mutations
     - Skeleton loading states
     - Motion animation presets from @/lib/animations

  2. API LAYER (frontend/src/features/[feature]/api/)
     - Zod schemas for all API responses
     - TanStack Query hooks with prefetching
     - MSW handlers for testing

  3. STATE (frontend/src/features/[feature]/store/)
     - Zustand slices or React 19 state hooks
     - Optimistic update reducers

  4. STYLING
     - Tailwind classes using @theme tokens
     - Responsive breakpoints (mobile-first)
     - Dark mode variants
     - All component states (hover, focus, disabled, loading)

  5. TESTS (frontend/src/features/[feature]/__tests__/)
     - Component tests with MSW
     - Hook tests
     - Zod schema tests

  Write REAL code to disk. Every file must be complete.
  Include styling inline — no separate styling agent needed.
  Do NOT split across responses — use full 128K output.

  Feature: $ARGUMENTS
  Architecture: [paste Phase 4 frontend spec]""",
  run_in_background=true
)

Agent 3: AI Integration (if needed)

Agent(
  subagent_type="llm-integrator",
  prompt="""# Cache-optimized: stable content first (CC 2.1.73)
  IMPLEMENT AI INTEGRATION — SINGLE PASS (128K output)

  Generate ALL AI integration code in ONE response:

  1. Provider setup and configuration
  2. Prompt templates (versioned)
  3. Function calling / tool definitions
  4. Streaming SSE endpoint
  5. Prompt caching configuration
  6. Fallback chain implementation
  7. Langfuse tracing integration
  8. Tests with VCR.py cassettes

  Write REAL code to disk. Skip if AI spec says "No AI needed".

  Feature: $ARGUMENTS
  Architecture: [paste Phase 4 AI spec]""",
  run_in_background=true
)

Agent 4: Test Suite — Complete Coverage

Agent(
  subagent_type="test-generator",
  prompt="""# Cache-optimized: stable content first (CC 2.1.73)
  GENERATE COMPLETE TEST SUITE — SINGLE PASS (128K output)

  IMPORTANT: Match test types to change type using the Test Requirements Matrix:
  - API endpoint → Unit + Integration + Contract (rules: integration-api, verification-contract, mocking-msw)
  - DB schema    → Migration + Integration (rules: integration-database, data-seeding-cleanup)
  - UI component → Unit + Snapshot + A11y (rules: unit-aaa-pattern, integration-component, a11y-testing)
  - Business logic → Unit + Property-based (rules: unit-aaa-pattern, pytest-execution, verification-techniques)
  - LLM/AI      → Unit + Eval (rules: llm-evaluation, llm-mocking)
  - Full-stack   → All of the above

  Follow the testing-unit/testing-e2e/testing-integration skill rules for each test type.

  Generate ALL tests in ONE response:

  1. UNIT TESTS
     - Python: pytest with factories (not raw dicts), AAA pattern
     - TypeScript: Vitest with meaningful assertions
     - Cover edge cases: empty input, errors, timeouts, rate limits

  2. INTEGRATION TESTS
     - API endpoint tests with TestClient
     - Database tests with fixtures
     - VCR.py cassettes for external HTTP calls
     - If docker-compose/testcontainers detected: test against REAL services

  3. CONTRACT / PROPERTY TESTS (if applicable)
     - Contract tests for API boundaries (verification-contract)
     - Property-based tests for business logic (verification-techniques)

  4. FIXTURES & FACTORIES
     - conftest.py with shared fixtures
     - Factory classes for test data
     - MSW handlers for frontend API mocking

  5. COVERAGE ANALYSIS
     - Run: poetry run pytest --cov=app --cov-report=term-missing
     - Run: npm test -- --coverage
     - Target: 80% minimum

  Write REAL test files to disk.
  Run tests after writing to verify they pass.
  Do NOT split across responses — use full 128K output.

  Feature: $ARGUMENTS""",
  run_in_background=true
)

Phase 5 — Teams Mode

In Agent Teams mode, the same 4 teammates from Phase 4 continue into implementation. Key difference: backend-architect messages the API contract to frontend-dev as soon as it's defined (not after full implementation), enabling overlapping work. Optionally, each teammate gets a dedicated worktree. See Team Worktree Setup.


Phase 6: Integration Verification (4 Agents)

Real-Service Detection

Before running integration tests, check for infrastructure:

# PARALLEL — detect real service testing capability
Glob(pattern="**/docker-compose*.yml")
Glob(pattern="**/testcontainers*")
Grep(pattern="testcontainers|docker-compose", glob="requirements*.txt")
Grep(pattern="testcontainers|docker-compose", glob="package.json")

If detected, run integration tests against real services (not just mocks). Reference testing-integration rules: integration-database, integration-api, data-seeding-cleanup.
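The detection above can be sketched as a plain filesystem check; the patterns are the same ones used in the Glob/Grep probes.

```python
from pathlib import Path

def real_services_available(repo: Path) -> bool:
    """Sketch of the real-service detection probes above.

    Returns True when docker-compose files or testcontainers usage suggest
    integration tests can run against real services instead of mocks.
    """
    if any(repo.glob("**/docker-compose*.yml")):
        return True
    if any(repo.glob("**/testcontainers*")):
        return True
    # Same manifests the Grep probes inspect
    for manifest in [repo / "package.json", *repo.glob("requirements*.txt")]:
        if manifest.exists() and "testcontainers" in manifest.read_text():
            return True
    return False
```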

Validation Commands

Backend:

poetry run alembic upgrade head --sql  # offline "dry-run": emit SQL without applying
poetry run ruff check app/
poetry run ty check app/
poetry run pytest tests/unit/ -v --cov=app
# If docker-compose detected:
docker-compose -f docker-compose.test.yml up -d
poetry run pytest tests/integration/ -v
docker-compose -f docker-compose.test.yml down

Frontend:

npm run typecheck
npm run lint
npm run build
npm test -- --coverage

Agent 1: Backend Integration

Agent(
  subagent_type="backend-system-architect",
  prompt="""BACKEND INTEGRATION VERIFICATION

  Verify all backend code works together:
  1. Run alembic migrations (dry-run)
  2. Run ruff lint and ty type checks
  3. Run full test suite with coverage
  4. Verify API endpoints respond correctly
  5. Fix any integration issues found

  This is verification, not new implementation.""",
  run_in_background=true
)

Agent 2: Frontend Integration

Agent(
  subagent_type="frontend-ui-developer",
  prompt="""FRONTEND INTEGRATION VERIFICATION

  Verify all frontend code works together:
  1. Run TypeScript type checking (tsc --noEmit)
  2. Run linting (biome/eslint)
  3. Run build (vite build)
  4. Run test suite with coverage
  5. Fix any integration issues found

  This is verification, not new implementation.""",
  run_in_background=true
)

Agent 3: Code Quality Review

Agent(
  subagent_type="code-quality-reviewer",
  prompt="""FULL QUALITY REVIEW — SINGLE PASS (128K output)

  Review ALL new code in one comprehensive report:
  1. Run all automated checks (lint, type, test, audit)
  2. Verify React 19 patterns (useOptimistic, Zod, assertNever)
  3. Check security (OWASP, secrets, input validation)
  4. Verify test coverage meets 80% threshold
  5. Check architectural compliance

  Produce structured review with APPROVE/REJECT decision.""",
  run_in_background=true
)

Agent 4: Security Audit

Agent(
  subagent_type="security-auditor",
  prompt="""SECURITY AUDIT — SINGLE PASS (128K output)

  Audit ALL new code in one comprehensive report:
  1. Run bandit/semgrep on Python code
  2. Run npm audit on JavaScript dependencies
  3. Run pip-audit on Python dependencies
  4. Grep for secrets (API keys, passwords, tokens)
  5. OWASP Top 10 verification
  6. Input validation coverage

  Produce structured security report with severity ratings.""",
  run_in_background=true
)

Security Checks

  • No hardcoded secrets
  • SQL injection prevention
  • XSS prevention
  • Proper input validation
  • npm audit / pip-audit

Phase 6 — Teams Mode

In Agent Teams mode, the code-reviewer has been reviewing continuously during Phase 5. Integration validation is lighter: the lead merges worktrees, runs integration tests, and collects the code-reviewer's final APPROVE/REJECT verdict. After Phase 6, the lead tears down the team (shutdown_request to all teammates + TeamDelete + worktree cleanup).


Phase 7: Scope Creep Detection

Launch workflow-architect to compare planned vs actual files/features. Score 0-10:

| Score | Level | Action |
|---|---|---|
| 0-2 | Minimal | Proceed to reflection |
| 3-5 | Moderate | Document and justify unplanned changes |
| 6-8 | Significant | Review with user, potentially split PR |
| 9-10 | Major | Stop and reassess |

See Scope Creep Detection for the full agent prompt.
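The score-to-action mapping can be sketched as a small helper; the action labels are illustrative shorthand for the tiers above.

```python
def scope_creep_action(score: int) -> str:
    """Map a 0-10 scope-creep score to the action tiers above.

    The string labels are hypothetical; the skill only prescribes the tiers.
    """
    if score <= 2:
        return "proceed"
    if score <= 5:
        return "document-and-justify"
    if score <= 8:
        return "review-with-user"
    return "stop-and-reassess"
```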


Phase 8: E2E Verification

If UI changes were made, verify with agent-browser:

agent-browser open http://localhost:5173
agent-browser wait --load networkidle
agent-browser snapshot -i
agent-browser screenshot /tmp/feature.png
agent-browser close

Skip this phase for backend-only or library implementations.


Phase 9: Documentation

Save implementation decisions to the knowledge graph for future reference:

mcp__memory__create_entities(entities=[{
  "name": "impl-{feature}-{date}",
  "entityType": "ImplementationDecision",
  "observations": ["chose X over Y because...", "pattern: ..."]
}])
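A sketch of building the entity name used above; the slug and date formats are assumptions, since the skill only prescribes the `impl-{feature}-{date}` pattern.

```python
from datetime import date

def impl_entity_name(feature: str, on: date) -> str:
    """Build the 'impl-{feature}-{date}' entity name used above.

    Slugging (lowercase, spaces to hyphens) and ISO dates are assumptions.
    """
    slug = "-".join(feature.lower().split())
    return f"impl-{slug}-{on.isoformat()}"
```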

Phase 10: Post-Implementation Reflection & Cleanup

Worktree Cleanup (CC 2.1.72)

If worktree isolation was used in Step 0, exit it before committing:

# Exit worktree — keep branch for PR creation
ExitWorktree(action="keep")
# Verify no orphaned worktrees remain
# Run: git worktree list

Every EnterWorktree must have a matching ExitWorktree. If the session crashes before cleanup, the next session should detect and clean up orphaned worktrees via git worktree list + git worktree remove.
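That orphan check can be sketched as a parse of `git worktree list --porcelain` output; any path other than the main worktree is a cleanup candidate.

```python
def worktree_paths(porcelain: str) -> list[str]:
    """Parse `git worktree list --porcelain` output into worktree paths.

    Sketch of the resume-time orphan check described above; only the
    `worktree <path>` lines are read, everything else is ignored.
    """
    return [
        line.split(" ", 1)[1]
        for line in porcelain.splitlines()
        if line.startswith("worktree ")
    ]

# Illustrative porcelain output for a main tree plus one leftover worktree
sample = (
    "worktree /path/to/repo\nHEAD abc1234\nbranch refs/heads/main\n\n"
    "worktree /path/to/repo-backend\nHEAD def5678\nbranch refs/heads/feat/x/backend\n"
)
orphans = [p for p in worktree_paths(sample) if p != "/path/to/repo"]
```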

Reflection

Launch workflow-architect to evaluate:

  • What went well / what to improve
  • Estimation accuracy (actual vs planned time)
  • Reusable patterns to extract
  • Technical debt created
  • Knowledge gaps discovered

Store lessons in memory for future implementations.

Agent Teams Full Stack

Agent Teams: Full-Stack Feature Pipeline

Team formation template for Pipeline 2 — Full-Stack Feature using CC Agent Teams.

Agents: 4 teammates + lead
Topology: Mesh — backend hands off API contract to frontend, test-engineer works incrementally
Lead mode: Delegate (coordination only, no code)


Team Formation

Team Name Pattern

implement-{feature-slug}

Example: implement-user-auth, implement-dashboard-analytics
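The naming pattern can be sketched as a slug helper; the exact slug rules (lowercase, non-alphanumerics collapsed to hyphens) are an assumption consistent with the examples above.

```python
import re

def team_name(feature: str) -> str:
    """Derive 'implement-{feature-slug}' from a free-form feature description.

    Slug rules are assumptions inferred from the examples above.
    """
    slug = re.sub(r"[^a-z0-9]+", "-", feature.lower()).strip("-")
    return f"implement-{slug}"
```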

Teammate Spawn Prompts

1. backend-architect (backend-system-architect)

You are the backend-architect specialist on this team.

## Your Role
Design and implement the complete backend: API routes, service layer, database models,
schemas, and backend tests. You own the API contract.

## Your Task
Implement the backend for: {feature description}

1. Define API endpoints (routes, methods, schemas, status codes)
2. Create Pydantic v2 request/response models
3. Implement service layer with repository pattern
4. Create SQLAlchemy 2.0 async models + migrations
5. Write backend unit and integration tests
6. Handle errors with RFC 9457 Problem Details

## Coordination Protocol
- AS SOON AS your API contract is defined (routes + request/response types),
  message frontend-dev with the contract. Don't wait for full implementation.
- When database schema is ready, update the shared task list.
- If you change the API contract after sharing it, message frontend-dev immediately.
- If blocked, message the lead with what you need.

## Quality Requirements
- All code must pass ruff + type checking
- Include tests for every endpoint (happy path + error cases)
- Document API changes in OpenAPI format

2. frontend-dev (frontend-ui-developer)

You are the frontend-dev specialist on this team.

## Your Role
Implement the complete frontend: React components, state management, API integration,
styling, and frontend tests. You consume the API contract from backend-architect.

## Your Task
Implement the frontend for: {feature description}

1. Wait for API contract from backend-architect (types + routes)
2. Create Zod schemas matching the API contract
3. Build React 19 components with TypeScript strict
4. Implement TanStack Query hooks for data fetching
5. Add form handling with React Hook Form + Zod
6. Style with Tailwind (mobile-first, dark mode)
7. Write component and hook tests with MSW

## Coordination Protocol
- WAIT for backend-architect to message you with the API contract before building
  API integration. You CAN start on UI layout and component structure immediately.
- When component interfaces (exports, props) are stable, message test-engineer
  so they can write integration tests.
- If the API contract changes, adapt and message test-engineer about the update.
- If blocked, message the lead with what you need.

## Quality Requirements
- TypeScript strict mode, no `any` types
- Skeleton loading states (not spinners)
- WCAG 2.1 AA accessibility
- All components tested with MSW mocking

3. test-engineer (test-generator)

You are the test-engineer specialist on this team.

## Your Role
Write comprehensive tests incrementally as contracts stabilize. Don't wait for
full implementation — test as soon as interfaces are defined.

## Your Task
Build the test suite for: {feature description}

1. Start writing test fixtures and factories immediately
2. When backend-architect shares API contract, write API integration tests
3. When frontend-dev shares component interfaces, write component tests
4. Add E2E test scenarios covering the full user flow
5. Run all tests and report coverage

## Coordination Protocol
- You do NOT need to wait for anyone. Start with fixtures, factories, and test plans.
- Monitor the shared task list for contract updates from backend-architect and frontend-dev.
- When tests uncover issues, message the responsible teammate directly:
  - API issues → message backend-architect
  - UI issues → message frontend-dev
- Update the shared task list with coverage metrics as tests pass.

## Quality Requirements
- 80% minimum coverage target
- Use factories (not raw dicts) for test data
- MSW handlers for frontend API mocking
- VCR.py cassettes for external HTTP calls
- Every edge case: empty input, errors, timeouts, rate limits

4. code-reviewer (code-quality-reviewer)

You are the code-reviewer specialist on this team.

## Your Role
Review code as it lands. Don't wait for completion — review incrementally.
Flag issues directly to the author. Require plan approval before making changes.

## Your Task
Review all code for: {feature description}

1. Monitor files as they're written by backend-architect, frontend-dev, and test-engineer
2. Run automated checks: lint, typecheck, security scan
3. Verify architectural compliance (clean architecture, separation of concerns)
4. Check for OWASP Top 10 vulnerabilities
5. Verify test quality (meaningful assertions, not just coverage)

## Coordination Protocol
- Review continuously — don't wait for teammates to finish.
- When you find issues, message the responsible teammate directly with:
  - File path and line number
  - What's wrong and why
  - Suggested fix
- For blocking issues (security vulnerabilities, architectural violations),
  also message the lead.
- Update the shared task list with review status per teammate.

## Quality Requirements
- Zero critical/high security findings
- TypeScript strict compliance
- No hardcoded secrets or credentials
- Consistent error handling patterns
- Produce final APPROVE/REJECT decision for the lead

Coordination Messaging Templates

Backend → Frontend: API Contract Handoff

Subject: API contract ready for {feature}

Here are the endpoint definitions:

## Endpoints
- POST /api/v1/{resource} — Create
  Request: { field1: string, field2: number }
  Response: { id: string, ...fields, created_at: string }
  Status: 201

- GET /api/v1/{resource}/:id — Read
  Response: { id: string, ...fields }
  Status: 200

- PUT /api/v1/{resource}/:id — Update
  Request: { field1?: string, field2?: number }
  Response: { id: string, ...fields, updated_at: string }
  Status: 200

## TypeScript Types (for your Zod schemas)
[paste Pydantic models converted to TS interfaces]

## Error Format
RFC 9457: { type, title, status, detail, instance }

You can start building API integration now.
I'll message you if anything changes.
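The RFC 9457 error format referenced in the handoff can be sketched as a plain dataclass; the field names are the five members RFC 9457 defines, while the example values are hypothetical.

```python
from dataclasses import dataclass, asdict

@dataclass
class ProblemDetails:
    """The five RFC 9457 Problem Details members named in the contract above."""
    type: str
    title: str
    status: int
    detail: str
    instance: str

# Illustrative instance; the URI and paths are hypothetical.
not_found = ProblemDetails(
    type="https://example.com/errors/not-found",
    title="Not Found",
    status=404,
    detail="Resource does not exist",
    instance="/api/v1/resource/123",
)
```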

Frontend → Test Engineer: Component Interface Handoff

Subject: Component interfaces ready for {feature}

## Exported Components
- <FeatureList /> — props: { items: Item[], onSelect: (id: string) => void }
- <FeatureDetail /> — props: { id: string }
- <FeatureForm /> — props: { onSubmit: (data: FormData) => Promise<void> }

## Query Hooks
- useFeatures() → { data: Item[], isLoading, error }
- useFeature(id) → { data: Item, isLoading, error }
- useCreateFeature() → { mutate, isPending }

## MSW Handlers
Located at: src/features/{feature}/__tests__/handlers.ts

You can start writing component and integration tests now.

Any → Lead: Blocked Notification

Subject: BLOCKED — {brief description}

I'm blocked on: {what's blocking}
Waiting for: {who/what}
Impact: {what can't proceed}
Suggested resolution: {what would unblock}

Per-Teammate Worktree Setup

See Team Worktree Setup for detailed instructions.

Quick summary:

# Lead creates branches and worktrees
git branch feat/{feature}/backend
git branch feat/{feature}/frontend
git branch feat/{feature}/tests

git worktree add ../{project}-backend feat/{feature}/backend
git worktree add ../{project}-frontend feat/{feature}/frontend
git worktree add ../{project}-tests feat/{feature}/tests

# Assignment
# backend-architect → ../{project}-backend/
# frontend-dev      → ../{project}-frontend/
# test-engineer     → ../{project}-tests/
# code-reviewer     → main worktree (read-only, reviews all)

When to skip worktrees: Small features (< 5 files), or when teammates work on non-overlapping directories.


Lead Synthesis Protocol

After all teammates complete (or when all tasks are done):

  1. Merge worktrees (if used):

    git checkout feat/{feature}
    git merge --squash feat/{feature}/backend
    git commit -m "feat({feature}): backend implementation"
    git merge --squash feat/{feature}/frontend
    git commit -m "feat({feature}): frontend implementation"
    git merge --squash feat/{feature}/tests
    git commit -m "test({feature}): complete test suite"
  2. Resolve conflicts — typically in shared types/interfaces

  3. Run integration tests from the merged branch:

    npm test
    npm run typecheck
    npm run lint
  4. Collect code-reviewer verdict — APPROVE or REJECT with findings

  5. Shut down team:

    SendMessage(type="shutdown_request", recipient="backend-architect")
    SendMessage(type="shutdown_request", recipient="frontend-dev")
    SendMessage(type="shutdown_request", recipient="test-engineer")
    SendMessage(type="shutdown_request", recipient="code-reviewer")
    TeamDelete()
    
    # Worktree cleanup (CC 2.1.72)
    ExitWorktree(action="keep")  # Keep branch for PR

Cost Comparison

| Metric | Task Tool (5 sequential) | Agent Teams (4 mesh) |
|---|---|---|
| Expected tokens | ~500K | ~1.2M |
| Wall-clock time | Sequential phases | Overlapping (30-40% faster) |
| API contract handoff | Lead relays | Peer-to-peer (immediate) |
| Cross-agent rework | ~15% (wrong API shapes) | < 5% (contract shared early) |
| Quality gate | After all complete | Continuous (reviewer on team) |

When Teams is worth the cost:

  • Frontend and backend need to agree on API shape
  • Feature has > 5 files across both stacks
  • Complexity score >= 3.0

When Task tool is cheaper and sufficient:

  • Backend-only or frontend-only scope
  • Independent tasks (audit, test generation)
  • Simple CRUD with clear schema

When to Use

  • Use Agent Teams for cross-cutting full-stack features where API contract coordination matters
  • Use Task Tool for simpler features where agents work independently
  • Complexity threshold: Average score >= 3.0 across 7 dimensions (use /ork:quality-gates)
  • Override: Set ORCHESTKIT_PREFER_TEAMS=1 to always use Agent Teams
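The decision rule above can be sketched as a small predicate; the function name and score-list shape are illustrative, while the 3.0 threshold, the 7 dimensions, and the environment override come from the rules above.

```python
import os

def prefer_teams(dimension_scores: list[float]) -> bool:
    """Decision rule above: average of the 7 dimension scores >= 3.0,
    or the ORCHESTKIT_PREFER_TEAMS=1 override. Helper name is illustrative."""
    if os.environ.get("ORCHESTKIT_PREFER_TEAMS") == "1":
        return True
    assert len(dimension_scores) == 7, "quality-gates scores 7 dimensions"
    return sum(dimension_scores) / len(dimension_scores) >= 3.0
```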

Agent Teams Phases

Agent Teams Phase Alternatives

This reference consolidates Agent Teams mode instructions for Phases 4, 5, 6, and 6b of the implement workflow.

Phase 4 — Agent Teams Architecture Design

In Agent Teams mode, form a team instead of spawning 5 independent Tasks. Teammates message architecture decisions to each other in real-time:

TeamCreate(team_name="implement-{feature-slug}", description="Architecture for {feature}")

# Spawn 4 teammates (5th role — UX — is lead-managed or optional)
Agent(subagent_type="backend-system-architect", name="backend-architect",
     team_name="implement-{feature-slug}", model=MODEL_OVERRIDE,
     prompt="Design backend architecture. Message frontend-dev when API contract ready.")

Agent(subagent_type="frontend-ui-developer", name="frontend-dev",
     team_name="implement-{feature-slug}", model=MODEL_OVERRIDE,
     prompt="Design frontend architecture. Wait for API contract from backend-architect.")

Agent(subagent_type="test-generator", name="test-engineer",
     team_name="implement-{feature-slug}", model=MODEL_OVERRIDE,
     prompt="Plan test strategy. Start fixtures immediately, tests as contracts stabilize.")

Agent(subagent_type="code-quality-reviewer", name="code-reviewer",
     team_name="implement-{feature-slug}", model=MODEL_OVERRIDE,
     prompt="Review architecture decisions as they're shared. Flag issues to author directly.")

See Agent Teams Full-Stack Pipeline for complete spawn prompts and messaging templates.

Fallback: If team formation fails, fall back to 5 independent Task spawns (standard Phase 4).


Phase 5 — Agent Teams Implementation

In Agent Teams mode, teammates are already formed from Phase 4. They transition from architecture to implementation and message contracts to each other:

  • backend-architect implements the API and messages frontend-dev with the contract (types + routes) as soon as endpoints are defined — not after full implementation.
  • frontend-dev starts building UI layout immediately, then integrates API hooks once the contract arrives.
  • test-engineer writes tests incrementally as contracts stabilize. Reports failing tests directly to the responsible teammate.
  • code-reviewer reviews code as it lands. Flags issues to the author directly.

Optionally set up per-teammate worktrees to prevent file conflicts:

# Lead sets up worktrees (for features with > 5 files)
Bash("git worktree add ../{project}-backend feat/{feature}/backend")
Bash("git worktree add ../{project}-frontend feat/{feature}/frontend")
Bash("git worktree add ../{project}-tests feat/{feature}/tests")

# Include worktree path in teammate messages
SendMessage(type="message", recipient="backend-architect",
    content="Work in ../{project}-backend/. Commit to feat/{feature}/backend.")

See Team Worktree Setup for complete worktree guide.

Fallback: If teammate coordination breaks down, shut down the team and fall back to 5 independent Task spawns (standard Phase 5).


Phase 6 — Agent Teams Integration

In Agent Teams mode, the code-reviewer teammate has already been reviewing code during implementation (Phase 5). Integration verification is lighter:

  • code-reviewer produces final APPROVE/REJECT verdict based on cumulative review.
  • Lead runs integration tests across the merged codebase (or merged worktrees).
  • No need for separate security-auditor spawn — code-reviewer covers security checks. For high-risk features, spawn a security-auditor teammate in Phase 4.
# Lead runs integration after merging worktrees
Bash("npm test && npm run typecheck && npm run lint")

# Collect code-reviewer verdict
SendMessage(type="message", recipient="code-reviewer",
    content="All code merged. Please provide final APPROVE/REJECT verdict.")

Fallback: If code-reviewer verdict is unclear, fall back to 4 independent Task spawns (standard Phase 6).


Phase 6b — Team Teardown (Agent Teams Only)

After Phase 6 completes in Agent Teams mode, tear down the team:

1. Merge Worktrees (if used)

git checkout feat/{feature}
git merge --squash feat/{feature}/backend && git commit -m "feat({feature}): backend"
git merge --squash feat/{feature}/frontend && git commit -m "feat({feature}): frontend"
git merge --squash feat/{feature}/tests && git commit -m "test({feature}): test suite"

2. Shut Down Teammates

SendMessage(type="shutdown_request", recipient="backend-architect",
    content="Implementation complete, shutting down team.")
SendMessage(type="shutdown_request", recipient="frontend-dev",
    content="Implementation complete, shutting down team.")
SendMessage(type="shutdown_request", recipient="test-engineer",
    content="Implementation complete, shutting down team.")
SendMessage(type="shutdown_request", recipient="code-reviewer",
    content="Implementation complete, shutting down team.")

3. Clean Up

TeamDelete()  # Remove team and shared task list

# Worktree cleanup (CC 2.1.72)
ExitWorktree(action="keep")  # Keep branch for PR

Phases 7-10 (Scope Creep, E2E Verification, Documentation, Reflection) are the same in both modes — the team is already disbanded.

Agent Teams Security Audit

Agent Teams: Security Audit Pipeline

Team formation template for Pipeline 4 — Security Audit using CC Agent Teams.

Agents: 3 (all read-only, no file conflicts)
Topology: Mesh — auditors share findings with each other
Lead mode: Delegate (coordination only)


Team Formation

Team Name Pattern

security-audit-{timestamp}

Teammate Spawn Prompts

1. security-auditor (OWASP + Dependencies)

You are the security-auditor specialist on this team.

## Your Role
Scan codebase for vulnerabilities, audit dependencies, and verify OWASP Top 10 compliance.
Focus on: dependency CVEs, hardcoded secrets, injection patterns, auth weaknesses.

## Your Task
Run a security audit on the hooks subsystem (src/hooks/). Focus on:
1. Dependency vulnerabilities (npm audit)
2. Secret/credential patterns in source
3. Injection risks (eval, exec, command injection)
4. Input validation on hook inputs
5. OWASP Top 10 applicability

## Coordination Protocol
- When you find critical/high findings, message security-layer-auditor to verify
  which defense layer is affected
- When you find LLM-related issues, message ai-safety-auditor for cross-reference
- Update the shared task list when you complete each scan area
- If blocked, message the lead

## Output
Return findings as structured JSON with severity, location, and remediation.

2. security-layer-auditor (Defense-in-Depth)

You are the security-layer-auditor specialist on this team.

## Your Role
Verify defense-in-depth implementation across 8 security layers (edge to storage).
Map every finding to a specific layer and assess coverage gaps.

## Your Task
Audit the hooks subsystem (src/hooks/) across all applicable security layers:
1. Layer 2 (Input): How are hook inputs validated?
2. Layer 3 (Authorization): How are tool permissions enforced?
3. Layer 4 (Data Access): How is file system access controlled?
4. Layer 5 (LLM): How is prompt content handled in hooks?
5. Layer 7 (Storage): How are lock files and coordination data stored?

## Coordination Protocol
- When security-auditor shares findings, map them to specific layers
- Validate whether existing controls contain the identified threats
- Share layer gap analysis with ai-safety-auditor for LLM-specific layers
- Update the shared task list when you complete each layer

## Output
Return an 8-layer audit matrix with status (pass/fail/partial) per layer.

3. ai-safety-auditor (LLM Security)

You are the ai-safety-auditor specialist on this team.

## Your Role
Audit LLM integration security. Focus on prompt injection, tool poisoning,
excessive agency, and OWASP LLM Top 10 compliance.

## Your Task
Audit the hooks subsystem (src/hooks/) for AI safety:
1. Prompt injection risks in context-injection hooks
2. Tool poisoning vectors in MCP integration
3. Excessive agency in automated hook actions
4. Data leakage through hook outputs
5. OWASP LLM Top 10 applicability

## Coordination Protocol
- Cross-reference with security-auditor findings for injection risks
- Cross-reference with security-layer-auditor for Layer 5/6 gaps
- If one of your findings contradicts another auditor's, flag the disagreement
- Update the shared task list when you complete each assessment area

## Output
Return OWASP LLM Top 10 compliance matrix plus specific findings.

Lead Synthesis Protocol

After all teammates complete:

  1. Collect all three audit reports
  2. Cross-reference findings — same issue found by multiple auditors = higher confidence
  3. Highlight disagreements — auditors may rate severity differently
  4. Deduplicate — merge equivalent findings
  5. Produce unified report with:
    • Combined findings sorted by severity
    • Layer coverage matrix
    • OWASP compliance summary
    • Prioritized remediation plan
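Steps 2-4 of the synthesis can be sketched in Python. This is a minimal illustration, not the skill's actual schema: the finding fields (`location`, `issue`, `severity`) and the confidence labels are assumptions chosen for the example.

```python
from collections import defaultdict

# Highest severity first; labels are an assumption, not a fixed schema.
SEVERITY_RANK = ["critical", "high", "medium", "low"]

def synthesize(reports):
    """Steps 2-5 above: cross-reference, flag disagreements, dedupe, sort."""
    merged = defaultdict(list)
    for auditor, findings in reports.items():
        for f in findings:
            # Dedup key: same location + same issue = equivalent finding
            merged[(f["location"], f["issue"])].append(f["severity"])
    unified = [{
        "location": loc,
        "issue": issue,
        # Found by multiple auditors = higher confidence
        "confidence": "corroborated" if len(sevs) > 1 else "single-source",
        # Auditors may rate severity differently — surface it, don't hide it
        "disagreement": len(set(sevs)) > 1,
        "severity": min(sevs, key=SEVERITY_RANK.index),  # keep the worst rating
    } for (loc, issue), sevs in merged.items()]
    unified.sort(key=lambda f: SEVERITY_RANK.index(f["severity"]))
    return unified
```

A corroborated finding with mixed ratings keeps the worst severity and carries a `disagreement` flag so the lead can adjudicate rather than average it away.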

Cost Comparison Baseline

| Metric | Task Tool (3 sequential) | Agent Teams (3 mesh) |
|--------|--------------------------|----------------------|
| Expected tokens | ~150K | ~400K |
| Wall-clock time | Sequential (3x) | Parallel (1x) |
| Cross-reference | Manual by lead | Peer-to-peer |
| Finding quality | Independent | Corroborated |

Track actual values to validate.


When to Use

  • Use Agent Teams when auditors need to cross-reference findings in real-time
  • Use Task Tool for quick, independent audits (single agent sufficient)
  • Complexity threshold: Average score >= 3.0 across 7 dimensions

CC Enhancements

CC 2.1.30+ Enhancements

Task Metrics

Task tool results now include token_count, tool_uses, and duration_ms. Use for scope monitoring:

## Phase 5 Metrics (Implementation)
| Agent | Tokens | Tools | Duration |
|-------|--------|-------|----------|
| backend-system-architect #1 | 680 | 15 | 25s |
| backend-system-architect #2 | 540 | 12 | 20s |
| frontend-ui-developer #1 | 720 | 18 | 30s |

**Scope Check:** If token_count > 80% of budget, flag scope creep
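The scope check reduces to a one-line predicate. The 80% threshold comes from the text above; the function name is illustrative:

```python
def scope_flag(token_count, budget):
    """Flag scope creep when an agent has burned more than 80% of its token budget."""
    return token_count > 0.8 * budget
```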

Tool Usage Guidance (CC 2.1.31)

Use the right tools for each operation:

| Task | Use | Avoid |
|------|-----|-------|
| Find files by pattern | Glob("**/*.ts") | bash find |
| Search code | Grep(pattern="...", glob="*.ts") | bash grep |
| Read specific file | Read(file_path="/abs/path") | bash cat |
| Edit/modify code | Edit(file_path=...) | bash sed/awk |
| Parse file contents | Read with limit/offset | bash head/tail |
| Git operations | Bash git ... | (git needs bash) |
| Run tests/build | Bash npm/poetry ... | (CLIs need bash) |

Session Resume Hints (CC 2.1.31)

Before ending implementation sessions, capture context:

/ork:remember Implementation of {feature}:
  Completed: phases 1-6
  Remaining: verification, docs
  Key decisions: [list]
  Blockers: [if any]

Resume later with full context preserved.

E2E Verification

E2E Verification Guide

Concrete steps for Phase 8 end-to-end verification.

Browser Testing (UI features)

# Use agent-browser CLI for visual verification
Bash("agent-browser open http://localhost:3000/{route}")
Bash("agent-browser snapshot")  # Capture DOM state
Bash("agent-browser screenshot /tmp/e2e-{feature}.png")
Read("/tmp/e2e-{feature}.png")  # Visual inspection

API Testing (Backend features)

# Verify endpoints return expected responses
curl -s http://localhost:8000/api/{endpoint} | jq .

# Run integration test suite against running server
pytest tests/integration/ -v --tb=short

# If docker-compose exists, test against real services
docker-compose -f docker-compose.test.yml up -d
pytest tests/integration/ -v
docker-compose -f docker-compose.test.yml down

Full-Stack Verification

  1. Start backend: verify API responses with curl/httpie
  2. Start frontend: verify pages render with agent-browser
  3. Test critical user flows end-to-end
  4. Verify error states (invalid input, network failure, auth failure)

What to Check

| Aspect | How |
|--------|-----|
| Happy path | Complete the primary user flow |
| Error handling | Submit invalid data, check error messages |
| Auth boundaries | Access protected routes without auth |
| Data persistence | Create → Read → Update → Delete cycle |
| Performance | Page load under 3s, API response under 500ms |

When to Skip

  • Tier 1-2 (Interview/Hackathon): Skip browser E2E, manual verification sufficient
  • No UI changes: Skip browser testing, API tests only
  • Config-only changes: Skip E2E entirely

Feedback Loop

Continuous Feedback Loop

Maintain a feedback loop throughout implementation.

After Each Task Completion

Quick checkpoint:

  • What was completed
  • Tests pass/fail
  • Actual vs estimated time
  • Blockers encountered
  • Scope deviations

Update task status with TaskUpdate(taskId, status="completed").

Feedback Triggers

| Trigger | Action |
|---------|--------|
| Task takes 2x estimated time | Pause, reassess scope |
| Test keeps failing | Consider design issue, not just implementation |
| Scope creep detected | Stop, discuss with user |
| Blocker found | Create blocking task, switch to parallel work |
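The triggers above can be sketched as a single dispatch function. The precedence (blockers first) and the three-run threshold for "keeps failing" are assumptions for the sketch, not part of the skill:

```python
def feedback_action(elapsed_min, estimate_min, failed_runs, creep_detected, blocked):
    """Map the feedback triggers above to actions. Blockers are checked first."""
    if blocked:
        return "create blocking task, switch to parallel work"
    if elapsed_min >= 2 * estimate_min:
        return "pause, reassess scope"
    if failed_runs >= 3:  # assumption: 3+ failed runs counts as "keeps failing"
        return "consider design issue, not just implementation"
    if creep_detected:
        return "stop, discuss with user"
    return "continue"
```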

Interview Mode

Interview / Take-Home Mode

When project tier is detected as Interview (STEP 0), apply these constraints:

Constraints

| Constraint | Value |
|------------|-------|
| Max files | 8-15 |
| Max LOC | 200-600 |
| Architecture | Flat (no layers) |
| Skip phases | 2 (Micro-Planning), 3 (Worktree), 7 (Scope Creep), 8 (E2E Browser), 10 (Reflection) |
| Agents | Max 2 (1 backend + 1 frontend, or 1 full-stack) |
| CI/Observability | Skip entirely |

README Template

Include a "What I Would Change for Production" section:

  • Database: would add migrations, connection pooling
  • Auth: would add OAuth/JWT instead of basic auth
  • Testing: would add integration + e2e tests
  • Monitoring: would add structured logging, health checks

This section demonstrates production awareness without over-engineering the take-home. Reviewers value this signal.

Micro Planning Guide

Micro-Planning Guide

Create detailed task-level plans before writing code to prevent scope creep and improve estimates.

What to Include

| Section | Purpose |
|---------|---------|
| Scope (IN) | Explicit list of what will change |
| Out of Scope | What NOT to touch (prevents creep) |
| Files to Touch | Exact files, change type, description |
| Acceptance Criteria | How to know it's done |
| Estimated Time | Realistic time budget |

Planning Process

Step 1: Define Scope Boundaries

### IN Scope
- Add User model with email, password_hash
- Add /register endpoint
- Add validation for email format

### OUT of Scope
- Password reset (separate task)
- OAuth providers (future task)
- Email verification (future task)

Step 2: List Files Explicitly

### Files to Touch
| File | Action | Description |
|------|--------|-------------|
| models/user.py | CREATE | User SQLAlchemy model |
| api/auth.py | CREATE | Register endpoint |
| tests/test_auth.py | CREATE | Registration tests |
| alembic/versions/xxx.py | CREATE | Migration |

Step 3: Set Acceptance Criteria

### Acceptance Criteria
- [ ] POST /register creates user
- [ ] Duplicate email returns 409
- [ ] Invalid email returns 422
- [ ] Password is hashed (not plaintext)
- [ ] Tests pass
- [ ] Types check

Time-Boxing Techniques

| Task Size | Time Box | Break Point |
|-----------|----------|-------------|
| Small (1-3 files) | 30 min | 45 min |
| Medium (4-8 files) | 2 hours | 3 hours |
| Large (9+ files) | 4 hours | Split task |

At Break Point

  1. Stop and assess progress
  2. If not 50%+ done, re-estimate
  3. If blocked, create blocker task
  4. Consider splitting remaining work

When to Break Down Further

Split the task if:

  • More than 8 files to modify
  • Estimate exceeds 4 hours
  • Multiple unrelated changes
  • Requires learning new technology
  • Has uncertain requirements
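The split criteria above are independent — any single one is enough. A minimal sketch (parameter names are illustrative):

```python
def should_split(files, estimate_hours, unrelated_changes=False,
                 new_technology=False, uncertain_requirements=False):
    """Any single condition from the list above is enough to split the task."""
    return (files > 8 or estimate_hours > 4 or unrelated_changes
            or new_technology or uncertain_requirements)
```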

Anti-Patterns

| Anti-Pattern | Fix |
|--------------|-----|
| Vague scope: "Add auth" | Specific: "Add /register endpoint" |
| No out-of-scope section | Always list what's excluded |
| Missing time estimate | Always estimate, even if rough |
| No acceptance criteria | Define "done" before starting |

Orchestration Modes

Orchestration Mode Selection

Decision Logic

# Agent Teams is GA since CC 2.1.33 (Issue #362)
import os
force_task_tool = os.environ.get("ORCHESTKIT_FORCE_TASK_TOOL") == "1"

if force_task_tool:
    mode = "task_tool"
else:
    # Teams available by default — use it for non-trivial work
    mode = "agent_teams" if avg_complexity >= 2.5 else "task_tool"

Comparison Table

| Aspect | Task Tool (star) | Agent Teams (mesh) |
|--------|------------------|--------------------|
| Communication | All agents report to lead only | Teammates message each other |
| API contract | Lead relays between agents | Backend messages frontend directly |
| Cost | ~500K tokens (full-stack) | ~1.2M tokens (full-stack) |
| Wall-clock | Sequential phases | Overlapping (30-40% faster) |
| Quality review | After all agents complete | Continuous (reviewer on team) |
| Best for | Independent tasks, low complexity | Cross-cutting features, high complexity |

Fallback

If Agent Teams mode encounters issues (teammate failures, messaging problems), fall back to Task tool mode for remaining phases. The approaches are compatible — work done in Teams mode transfers to Task tool continuation.

Scope Creep Detection

Scope Creep Detection

Identify when implementation exceeds original scope and take corrective action.

Warning Signs

| Indicator | Example |
|-----------|---------|
| "While I'm here..." | Refactoring unrelated code |
| Premature optimization | Adding caching before measuring |
| Gold-plating | Extra UI polish not requested |
| Future-proofing | "We might need this later" |
| Rabbit holes | Deep debugging unrelated issues |

Detection Checklist

Files Changed vs Planned

[ ] List files in original micro-plan
[ ] List files actually modified (git diff --name-only)
[ ] Flag any file not in original plan
[ ] Each unplanned file needs justification

Features Added vs Planned

[ ] Compare implemented features to acceptance criteria
[ ] Identify features not in original scope
[ ] Mark as: necessary dependency / nice-to-have / out-of-scope

Time Spent vs Estimated

[ ] Original estimate: ___ hours
[ ] Actual time: ___ hours
[ ] If >1.5x estimate, identify cause

Quick Audit Command

# Compare planned vs actual files
git diff --name-only main...HEAD | sort > /tmp/actual.txt
# Compare against micro-plan's "Files to Touch" section
diff /tmp/planned.txt /tmp/actual.txt
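The same comparison in Python, for when the micro-plan is already parsed: pass the plan's file list and the output of `git diff --name-only main...HEAD`. The function name is illustrative:

```python
def unplanned_files(planned, actual):
    """Files actually modified that were not in the micro-plan's 'Files to Touch'."""
    return sorted(set(actual) - set(planned))
```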

Scope Creep Score

| Score | Level | Action |
|-------|-------|--------|
| 0-2 | Minimal | Proceed normally |
| 3-5 | Moderate | Document, justify each addition |
| 6-8 | Significant | Discuss with user, consider splitting |
| 9-10 | Major | Stop, split into separate PR |
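The score-to-action bands translate directly to thresholds (a sketch of the table above; the function name is illustrative):

```python
def creep_action(score):
    """Translate a 0-10 scope-creep score into the action from the table above."""
    if score <= 2:
        return "proceed normally"
    if score <= 5:
        return "document, justify each addition"
    if score <= 8:
        return "discuss with user, consider splitting"
    return "stop, split into separate PR"
```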

Recovery Strategies

If Score 3-5 (Moderate)

  1. Document unplanned changes in PR description
  2. Add "bonus" label to extra features
  3. Ensure tests cover additions

If Score 6-8 (Significant)

  1. Revert unplanned changes to separate branch
  2. Create follow-up issue for extras
  3. Submit minimal PR matching original scope

If Score 9-10 (Major)

  1. Stop implementation
  2. Split into multiple PRs
  3. Re-scope with user before continuing

Prevention Tips

  • Review micro-plan before starting each file
  • Time-box exploration (15 min max)
  • Ask "Is this in scope?" before each change
  • Use TODO comments for out-of-scope ideas

Team Worktree Setup

Team Worktree Setup

Per-teammate git worktree management for Agent Teams. Extends the general Worktree Workflow with team-specific patterns.


Branch Naming Convention

feat/{feature}/{role}

Examples:

  • feat/user-auth/backend
  • feat/user-auth/frontend
  • feat/user-auth/tests
  • feat/dashboard/backend
  • feat/dashboard/frontend

All branches are created from the feature branch (not main):

# Start from the feature branch
git checkout feat/{feature}

# Create role branches
git branch feat/{feature}/backend
git branch feat/{feature}/frontend
git branch feat/{feature}/tests

Worktree Setup Commands

The lead creates worktrees before spawning teammates:

# Create worktrees — one per implementing teammate
git worktree add ../{project}-backend feat/{feature}/backend
git worktree add ../{project}-frontend feat/{feature}/frontend
git worktree add ../{project}-tests feat/{feature}/tests

# Verify
git worktree list

Directory layout after setup:

../
├── {project}/              ← Main worktree (lead + code-reviewer)
├── {project}-backend/      ← backend-architect works here
├── {project}-frontend/     ← frontend-dev works here
└── {project}-tests/        ← test-engineer works here

Teammate Assignment

Include the worktree path in each teammate's spawn prompt:

| Teammate | Worktree | Working Directory |
|----------|----------|-------------------|
| backend-architect | ../{project}-backend/ | Full project access, writes to backend dirs |
| frontend-dev | ../{project}-frontend/ | Full project access, writes to frontend dirs |
| test-engineer | ../{project}-tests/ | Full project access, writes to test dirs |
| code-reviewer | Main worktree | Read-only, reviews across all worktrees |

Spawn prompt addition:

## Your Working Directory
Work EXCLUSIVELY in: /path/to/{project}-backend/
Do NOT modify files in other worktrees.
Commit your changes to the feat/{feature}/backend branch.

Merge Strategy

After all teammates complete, the lead merges each role branch:

# Switch to feature branch
git checkout feat/{feature}

# Merge each role as a single commit
git merge --squash feat/{feature}/backend
git commit -m "feat({feature}): backend implementation"

git merge --squash feat/{feature}/frontend
git commit -m "feat({feature}): frontend implementation"

git merge --squash feat/{feature}/tests
git commit -m "test({feature}): complete test suite"

Handling Merge Conflicts

Conflicts typically occur in shared files:

  • Type definitions — backend and frontend may define overlapping types
  • Package files — both may add dependencies
  • Config files — shared configuration

Resolution priority:

  1. Backend types are authoritative (they own the API contract)
  2. For package conflicts, combine both additions
  3. For config conflicts, merge manually

Cleanup

After successful merge and verification:

# Remove worktrees
git worktree remove ../{project}-backend
git worktree remove ../{project}-frontend
git worktree remove ../{project}-tests

# Delete role branches
git branch -d feat/{feature}/backend
git branch -d feat/{feature}/frontend
git branch -d feat/{feature}/tests

# Verify cleanup
git worktree list
git branch --list "feat/{feature}/*"

When to Skip Worktrees

Not every Agent Teams session needs worktrees. Skip when:

| Condition | Skip Worktrees? | Reason |
|-----------|-----------------|--------|
| Read-only roles only (audit, review) | Yes | No file writes = no conflicts |
| Small feature (< 5 files) | Yes | File overlap unlikely |
| Teammates work in non-overlapping directories | Yes | Natural isolation |
| Single-stack scope (backend-only or frontend-only) | Yes | One writer, others are reviewers |
| Research/debugging task | Yes | Exploration, not implementation |

When skipping worktrees, teammates work in the same directory. The lead should assign clear file ownership in spawn prompts to prevent conflicts:

## File Ownership
You own: src/api/, src/models/, src/services/
Do NOT modify: src/components/, src/features/, src/hooks/

Config Sharing (CC 2.1.63+)

Project configs and auto-memory are automatically shared across worktrees (CC 2.1.63+). No manual setup needed:

  • .claude/settings.json and CLAUDE.md available in every worktree
  • Auto-memory persists — teammates inherit learned patterns
  • Plugins are discovered from any worktree

Worktree + Agent Teams Checklist

Before spawning teammates:

  • Feature branch exists (feat/{feature})
  • Role branches created from feature branch
  • Worktrees added for each implementing teammate
  • Each teammate's spawn prompt includes worktree path
  • Code-reviewer assigned to main worktree (read-only)

After all teammates complete:

  • All role branches have commits
  • Squash merge each role into feature branch
  • Merge conflicts resolved
  • Integration tests pass on merged branch
  • Worktrees removed
  • Role branches deleted

Test Requirements Matrix

Test Requirements Matrix

Phase 5 test-generator MUST produce tests matching the change type.

Required Tests by Change Type

| Change Type | Required Tests | Testing Rules |
|-------------|----------------|---------------|
| API endpoint | Unit + Integration + Contract | integration-api, verification-contract, mocking-msw |
| DB schema/migration | Migration + Integration | integration-database, data-seeding-cleanup |
| UI component | Unit + Snapshot + A11y | unit-aaa-pattern, integration-component, a11y-testing, e2e-playwright |
| Business logic | Unit + Property-based | unit-aaa-pattern, pytest-execution, verification-techniques |
| LLM/AI feature | Unit + Eval | llm-evaluation, llm-mocking |
| Full-stack feature | All of the above | All matching rules |

Real-Service Detection (Phase 6)

Before running integration tests, detect infrastructure:

# Auto-detect real service testing capability (PARALLEL)
Glob(pattern="**/docker-compose*.yml")
Glob(pattern="**/testcontainers*")
Grep(pattern="testcontainers|docker-compose", glob="requirements*.txt")
Grep(pattern="testcontainers|docker-compose", glob="package.json")

If detected: run integration tests against real services, not just mocks. Reference testing-integration rules: integration-database, integration-api, data-seeding-cleanup.

Phase 9 Gate

Do NOT proceed to Phase 9 (Documentation) if test-generator produced 0 tests. Return to Phase 5 and generate tests for the implemented code.

Test Coverage Expectations

| Tier | Minimum Coverage | Notes |
|------|------------------|-------|
| 1. Interview | Happy path only | Focus on correctness, not coverage |
| 2. Hackathon | None required | Tests are bonus |
| 3. MVP | Unit + 1 integration | Cover critical paths |
| 4-5. Growth/Enterprise | Unit + integration + e2e | Full matrix above applies |
| 6. Open Source | Exhaustive | Every public API must have tests |

Test Runner Detection

# Detect test framework (PARALLEL)
Glob(pattern="**/jest.config*")
Glob(pattern="**/vitest.config*")
Glob(pattern="**/pytest.ini")
Glob(pattern="**/pyproject.toml")
Grep(pattern="\"test\":", glob="package.json")

Use the detected runner for all generated tests. Do not introduce a new test framework unless the project has none.
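A sketch of the mapping from detected config files to a runner. The precedence order is an assumption, and `pyproject.toml` only really implies pytest when it carries a `[tool.pytest.ini_options]` section — check before trusting that match:

```python
def detect_runner(config_paths, package_json_has_test=False):
    """Pick a test runner from detected config files (first match wins)."""
    markers = [
        ("jest.config", "jest"),
        ("vitest.config", "vitest"),
        ("pytest.ini", "pytest"),
        ("pyproject.toml", "pytest"),  # assumption: only valid with a pytest section
    ]
    for marker, runner in markers:
        if any(marker in path for path in config_paths):
            return runner
    if package_json_has_test:
        return "npm test"
    return None  # no framework detected — do not introduce a new one
```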

Tier Classification

Tier Classification & Workflow Mapping

Project complexity tiers determine architecture ceilings and workflow phases.

Auto-Detection Signals

Scan codebase for: README keywords (take-home, interview), .github/workflows/, Dockerfile, terraform/, k8s/, CONTRIBUTING.md.

Tier Classification

| Signal | Tier | Architecture Ceiling |
|--------|------|----------------------|
| README says "take-home", time limit | 1. Interview (load ${CLAUDE_SKILL_DIR}/references/interview-mode.md) | Flat files, 8-15 files |
| < 10 files, no CI | 2. Hackathon | Single file if possible |
| .github/workflows/, managed DB | 3. MVP | MVC monolith |
| Module boundaries, Redis, queues | 4. Growth | Modular monolith, DI |
| K8s/Terraform, monorepo | 5. Enterprise | Hexagonal/DDD |
| CONTRIBUTING.md, LICENSE | 6. Open Source | Minimal API, exhaustive tests |

If confidence is low, use AskUserQuestion to ask the user. Pass detected tier to ALL downstream agents — see scope-appropriate-architecture.

Tier → Workflow Mapping

| Tier | Phases | Max Agents |
|------|--------|------------|
| 1. Interview | 1, 5 only | 2 |
| 2. Hackathon | 5 only | 1 |
| 3. MVP | 1-6, 9 | 3-4 |
| 4-5. Growth/Enterprise | All 10 | 5-8 |
| 6. Open Source | 1-7, 9-10 | 3-4 |
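The tier-to-workflow mapping above can be held as a lookup table. This is a sketch: the tier slugs are illustrative, and where the table gives an agent range (e.g. 3-4) the dict stores the upper bound as the ceiling:

```python
# Tier → workflow mapping; agent counts are the ceilings from the table above.
TIER_WORKFLOW = {
    "interview":   {"phases": [1, 5],                       "max_agents": 2},
    "hackathon":   {"phases": [5],                          "max_agents": 1},
    "mvp":         {"phases": [1, 2, 3, 4, 5, 6, 9],        "max_agents": 4},
    "growth":      {"phases": list(range(1, 11)),           "max_agents": 8},
    "enterprise":  {"phases": list(range(1, 11)),           "max_agents": 8},
    "open_source": {"phases": [1, 2, 3, 4, 5, 6, 7, 9, 10], "max_agents": 4},
}

def workflow_for(tier):
    """Look up the phase list and agent ceiling for a detected tier."""
    return TIER_WORKFLOW[tier]
```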

Use AskUserQuestion to verify scope (full-stack / backend-only / frontend-only / prototype) and constraints.

Orchestration Mode

  • Agent Teams (mesh) when complexity >= 2.5 (GA since CC 2.1.33)
  • Task tool (star) otherwise; ORCHESTKIT_FORCE_TASK_TOOL=1 to override
  • Load orchestration modes: Read("${CLAUDE_SKILL_DIR}/references/orchestration-modes.md")

Tier Override

When auto-detection is ambiguous (e.g., a monorepo with no CI yet), prefer the lower tier to avoid over-engineering. The user can always escalate.

Manual override example:

AskUserQuestion(questions=[{
  "question": "Detected signals for both MVP and Growth. Which tier fits best?",
  "options": [
    {"label": "3. MVP", "description": "MVC monolith, 3-4 agents"},
    {"label": "4. Growth", "description": "Modular monolith with DI, up to 8 agents"}
  ]
}])

Worktree Isolation Mode

Worktree Isolation Mode

When to Use

  • Feature touches 5+ files across multiple directories
  • Multiple developers working on same branch
  • Risky refactoring that may need rollback
  • Agent Teams mode with parallel agents editing overlapping files

Workflow

1. Enter Worktree

# CC 2.1.49: Native worktree support
EnterWorktree(name="feat-{feature-slug}")

This creates:

  • New branch feat-{feature-slug} from HEAD
  • Working directory at .claude/worktrees/feat-{feature-slug}/
  • Session CWD switches to the worktree automatically

2. Implement in Isolation

All implementation phases (4-8) run in the worktree. Benefits:

  • Main branch stays clean — no partial changes
  • Multiple agents can work without stepping on each other
  • Easy rollback: just delete the worktree branch

3. Merge Back

After Phase 8 (E2E Verification) passes:

# Return to original branch
git checkout {original-branch}

# Merge the feature
git merge feat-{feature-slug}

# Clean up worktree (prompted on session exit)

4. Conflict Resolution

If merge conflicts arise:

  1. Show conflicting files to user
  2. Present diff with AskUserQuestion for resolution choices
  3. Apply user's chosen resolution
  4. Re-run Phase 6 verification on merged result

Context Gate Integration

When running in a worktree, the context-gate SubagentStart hook raises concurrency limits:

  • MAX_CONCURRENT_BACKGROUND: 6 → 10 (worktree isolation reduces contention)
  • MAX_AGENTS_PER_RESPONSE: 8 → 12

This is safe because worktree agents operate on an isolated file tree.
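The limit adjustment is a simple conditional (values from the section above; the function name is illustrative):

```python
def concurrency_limits(in_worktree):
    """Context-gate limits; worktree isolation raises them as described above."""
    return {
        "MAX_CONCURRENT_BACKGROUND": 10 if in_worktree else 6,
        "MAX_AGENTS_PER_RESPONSE": 12 if in_worktree else 8,
    }
```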

Config Sharing (CC 2.1.63+)

Project configs and auto-memory are automatically shared across git worktrees. No manual copying needed:

  • .claude/settings.json — shared across all worktrees
  • .claude/memory/ — auto-memory persists across worktrees
  • CLAUDE.md — project instructions available in every worktree
  • Plugin configs — plugins discovered from any worktree

CLI Alternative

Users can also start worktrees manually:

claude --worktree    # or -w

This creates the worktree before the session starts, equivalent to EnterWorktree but at CLI level.

Limitations

  • Cannot nest worktrees (worktree inside worktree)
  • Session exit prompts to keep or remove the worktree
  • Some git operations (rebase, bisect) may behave differently in worktrees

Worktree Workflow

Git Worktree Workflow

Isolate feature work in dedicated worktrees for clean development and easy rollback.

When to Use Worktrees

| Scenario | Worktree? | Reason |
|----------|-----------|--------|
| Large feature (5+ files) | YES | Isolation prevents pollution |
| Experimental/risky changes | YES | Easy to discard entirely |
| Parallel feature development | YES | Work on multiple features |
| Hotfix while mid-feature | YES | Don't stash incomplete work |
| Quick bug fix (1-2 files) | No | Overhead not worth it |

Setup Commands

# Create worktree with new branch
git worktree add ../project-feature feature/feature-name

# Create worktree from existing branch
git worktree add ../project-feature existing-branch

# List all worktrees
git worktree list

# Navigate to worktree
cd ../project-feature

Workflow

# 1. Create worktree
git worktree add ../myapp-auth feature/user-auth

# 2. Work in isolation
cd ../myapp-auth
# ... make changes, commit normally ...

# 3. Merge back (from main worktree)
cd ../myapp
git checkout main
git merge feature/user-auth

# 4. Cleanup
git worktree remove ../myapp-auth
git branch -d feature/user-auth

Merge Strategies

| Strategy | When to Use |
|----------|-------------|
| Merge commit | Default, preserves history |
| Squash merge | Many small commits, clean history wanted |
| Rebase first | Linear history preferred |

# Squash merge (single commit)
git merge --squash feature/user-auth
git commit -m "feat: Add user authentication"

# Rebase then merge (linear)
cd ../myapp-auth
git rebase main
cd ../myapp
git merge feature/user-auth

Cleanup with Uncommitted Changes

# Check for uncommitted changes
cd ../myapp-auth
git status

# If changes exist, either:
# Option A: Commit them
git add . && git commit -m "WIP: save progress"

# Option B: Stash them
git stash push -m "feature-auth-wip"

# Option C: Discard (CAREFUL!)
git checkout -- .

# Then remove worktree
cd ../myapp
git worktree remove ../myapp-auth

Best Practices

  1. Naming: Use ../project-featurename pattern
  2. Short-lived: Merge within 1-3 days
  3. One feature per worktree: Don't mix concerns
  4. Regular sync: Rebase from main frequently
  5. Clean before remove: Always check git status

Checklists (1)

Implementation Review

Implementation Review Checklist

Use this checklist before marking implementation as complete.

Scope Verification

  • All acceptance criteria from micro-plan are met
  • No unplanned files were modified
  • No features were added beyond original scope
  • If scope changed, it was documented and justified

Code Quality

  • All tests pass
  • Type checking passes (mypy/tsc)
  • Linting passes (no warnings)
  • No TODO/FIXME left behind (or tracked in issues)

Testing Coverage

  • Unit tests for new functions/methods
  • Integration tests for API endpoints
  • Edge cases covered
  • Error paths tested

Documentation

  • Code comments for complex logic
  • API documentation updated (if endpoints added)
  • README updated (if setup changed)

Scope Creep Score

  • Score 0-2: Proceed
  • Score 3-5: Document additions in PR
  • Score 6+: Split into separate PR

Final Checks

  • PR description matches implementation
  • Commit messages are clear
  • No sensitive data committed
  • Works in development environment

Sign-off

Reviewer: _______________
Date: _______________
Scope Creep Score: ___/10
Ready to merge: [ ] Yes [ ] No - needs: _______________