Fix Issue
Fixes GitHub issues with parallel analysis. Use to debug errors, resolve regressions, fix bugs, or triage issues.
Systematic issue resolution with hypothesis-based root cause analysis, similar issue detection, and prevention recommendations.
Quick Start
/ork:fix-issue 123
/ork:fix-issue 456
Opus 4.6: Root cause analysis uses native adaptive thinking. Dynamic token budgets scale with context window for thorough investigation.
STEP 0: Verify User Intent
BEFORE creating tasks, clarify fix approach using AskUserQuestion. See rules/evidence-gathering.md for the full prompt template and workflow adjustments per approach (Proper fix, Quick fix, Investigate first, Hotfix).
STEP 0b: Select Orchestration Mode
Choose Agent Teams (mesh) or Task tool (star). See references/agent-selection.md for the selection criteria, cost comparison, and task creation patterns.
Workflow Overview
| Phase | Activities | Output |
|---|---|---|
| 1. Understand Issue | Read GitHub issue details | Problem statement |
| 2. Similar Issue Detection | Search for related past issues | Related issues list |
| 3. Hypothesis Formation | Form hypotheses with confidence scores | Ranked hypotheses |
| 4. Root Cause Analysis | 5 parallel agents investigate | Confirmed root cause |
| 5. Fix Design | Design approach based on RCA | Fix specification |
| 6. Implementation | Apply fix with tests | Working code |
| 7. Validation | Verify fix resolves issue | Evidence |
| 8. Prevention | How to prevent recurrence | Prevention plan |
| 9. Runbook | Create/update runbook entry | Runbook |
| 10. Lessons Learned | Capture knowledge | Persisted learnings |
| 11. Commit and PR | Create PR with fix | Merged PR |
Full phase details: See references/fix-phases.md for bash commands, templates, and procedures for each phase.
Critical Constraints
- Feature branch MANDATORY -- NEVER commit directly to main or dev
- Regression test MANDATORY -- write failing test BEFORE implementing fix
- Prevention required -- at least one of: automated test, validation rule, or process check
- Make minimal, focused changes; DO NOT over-engineer
CC 2.1.49 Enhancements
See references/cc-enhancements.md for session resume, task metrics, tool guidance, worktree isolation, and adaptive thinking.
Rules Quick Reference
| Rule | Impact | What It Covers |
|---|---|---|
| evidence-gathering | HIGH | User intent verification, confidence scale, key decisions |
| rca-five-whys | HIGH | 5 Whys iterative causal analysis |
| rca-fishbone | MEDIUM | Ishikawa diagram, multi-factor analysis |
| rca-fault-tree | MEDIUM | Fault tree analysis, AND/OR gates, critical systems |
Related Skills
- ork:commit - Commit issue fixes
- debug-investigator - Debug complex issues
- ork:issue-progress-tracking - Auto-updates from commits
- ork:remember - Store lessons learned
References
- Fix Phases
- Agent Selection
- Similar Issue Search
- Hypothesis-Based RCA
- Agent Teams RCA
- Prevention Patterns
- CC Enhancements
Version: 2.1.0 (February 2026)
Rules (4)
Evidence Gathering — HIGH
Evidence Gathering Patterns
Verify User Intent (STEP 0)
BEFORE creating tasks, clarify fix approach with AskUserQuestion:
AskUserQuestion(
questions=[{
"question": "What approach for this fix?",
"header": "Approach",
"options": [
{"label": "Proper fix (Recommended)", "description": "Full RCA, tests, prevention recommendations"},
{"label": "Quick fix", "description": "Minimal fix to resolve the immediate issue"},
{"label": "Investigate first", "description": "Understand the issue before deciding on approach"},
{"label": "Hotfix", "description": "Emergency patch, minimal testing"}
],
"multiSelect": false
}]
)
Based on answer, adjust workflow:
- Proper fix: All 11 phases, parallel agents for RCA
- Quick fix: Skip phases 8-10 (prevention, runbook, lessons)
- Investigate first: Only phases 1-4 (understand, search, hypotheses, analyze)
- Hotfix: Minimal phases, skip similar issue search
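The adjustments above can be read as a lookup from the chosen approach to the phases that run. The sketch below is illustrative only: `PHASES_BY_APPROACH` and `phases_for` are hypothetical names, and the Hotfix entry is an assumption, since the text only says that Hotfix skips similar issue search and uses "minimal phases".

```python
# Phase numbers follow the 11-phase workflow table in this skill.
ALL_PHASES = list(range(1, 12))

# Assumption: "Hotfix" is modelled as "everything except phase 2" because the
# text only states that it skips similar issue search; the real set may be smaller.
PHASES_BY_APPROACH = {
    "Proper fix": ALL_PHASES,
    "Quick fix": [p for p in ALL_PHASES if p not in (8, 9, 10)],
    "Investigate first": [1, 2, 3, 4],
    "Hotfix": [p for p in ALL_PHASES if p != 2],
}


def phases_for(approach: str) -> list[int]:
    """Return the phase numbers to run for the chosen approach label."""
    # The option label may carry a "(Recommended)" suffix; strip it first.
    label = approach.replace(" (Recommended)", "")
    return PHASES_BY_APPROACH[label]
```

For example, `phases_for("Quick fix")` yields phases 1-7 plus 11.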
Hypothesis Confidence Scale
| Confidence | Meaning |
|---|---|
| 90-100% | Near certain |
| 70-89% | Highly likely |
| 50-69% | Probable |
| 30-49% | Possible |
| 0-29% | Unlikely |
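As a small illustration, the scale can be applied mechanically. `confidence_label` is a hypothetical helper, not part of the skill; it only encodes the bands from the table above.

```python
def confidence_label(percent: float) -> str:
    """Map a hypothesis confidence (0-100) to the label used in the scale."""
    if not 0 <= percent <= 100:
        raise ValueError("confidence must be between 0 and 100")
    if percent >= 90:
        return "Near certain"
    if percent >= 70:
        return "Highly likely"
    if percent >= 50:
        return "Probable"
    if percent >= 30:
        return "Possible"
    return "Unlikely"
```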
Key Decisions
| Decision | Choice | Rationale |
|---|---|---|
| Feature branch | MANDATORY | Never commit to main/dev directly |
| Regression test | MANDATORY | Fix without test is incomplete |
| Hypothesis confidence | 0-100% scale | Quantifies certainty |
| Similar issue search | Before hypothesis | Leverage past solutions |
| Prevention analysis | Mandatory phase | Break recurring issue cycle |
| Runbook generation | Template-based | Consistent documentation |
Map all failure paths with fault tree analysis to prevent recurring system failures — MEDIUM
Fault Tree Analysis (FTA)
Top-down, deductive analysis mapping all paths to a failure using boolean logic (AND/OR gates). Best for critical systems and exhaustive failure analysis.
FTA Symbols
| Symbol | Meaning |
|---|---|
| TOP | Top event — the failure being analyzed |
| AND | All inputs must occur for output |
| OR | Any input causes output |
| Basic Event | Root cause (leaf node) |
| Undeveloped | Needs further analysis |
Example: Authentication Failure
                USER CANNOT
                AUTHENTICATE
                     |
                   [OR]
        +------------+------------+
        |            |            |
     Invalid   Auth Service    Account
   Credentials     Down        Locked
        |            |
      [OR]         [OR]
    +---+---+    +---+---+
    |   |   |    |   |   |
  Wrong Expired Token  DB  Redis External
  Pass  Token  Invalid Down Down  Auth
Building a Fault Tree
- Define top event — the failure to analyze
- Ask "what causes this?" — list immediate causes
- Classify as AND/OR — do ALL causes need to happen, or ANY one?
- Decompose each cause — repeat until reaching basic events
- Identify minimal cut sets — smallest combinations that cause failure
- Prioritize by probability — most likely paths first
Minimal Cut Sets
The smallest set of basic events that together cause the top event:
Top: User Cannot Authenticate (OR gate)
Cut Set 1: {Wrong Password} - single point of failure
Cut Set 2: {Expired Token} - single point of failure
Cut Set 3: {DB Down} - single point of failure
Cut Set 4: {Account Locked} - single point of failure
Single-event cut sets indicate no redundancy; add defense-in-depth.
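To make the cut-set step concrete, here is a minimal sketch that derives minimal cut sets from a tree of AND/OR gates. The nested-tuple representation and the function names are assumptions made for illustration. Because every gate in the authentication example is an OR gate, each of its seven leaves comes out as its own single-event cut set, so the four listed above are a subset.

```python
from itertools import product


def cut_sets(node):
    """Return the cut sets of a fault tree node as a set of frozensets."""
    if isinstance(node, str):  # basic event (leaf)
        return {frozenset([node])}
    gate, children = node
    child_sets = [cut_sets(child) for child in children]
    if gate == "OR":  # any child alone triggers the output
        return set().union(*child_sets)
    if gate == "AND":  # one cut set from every child is needed
        return {frozenset().union(*combo) for combo in product(*child_sets)}
    raise ValueError(f"unknown gate: {gate}")


def minimal_cut_sets(node):
    """Drop any cut set that is a strict superset of another one."""
    sets = cut_sets(node)
    return {s for s in sets if not any(other < s for other in sets)}


# The authentication example: a node is either a leaf name or (gate, children).
auth_tree = ("OR", [
    ("OR", ["Wrong Password", "Expired Token", "Token Invalid"]),
    ("OR", ["DB Down", "Redis Down", "External Auth"]),
    "Account Locked",
])
```

An AND gate is where redundancy shows up: `minimal_cut_sets(("AND", ["A", "B"]))` returns one two-event cut set, meaning both events must occur before the top event does.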
When to Use FTA
| Scenario | Use FTA? |
|---|---|
| Safety-critical system failure | Yes |
| Need exhaustive failure path mapping | Yes |
| Complex multi-component failure | Yes |
| Simple linear bug | No — use 5 Whys |
| Multiple contributing factors | Maybe — Fishbone first |
| Regulatory compliance analysis | Yes |
| Post-incident for serious outages | Yes |
Incorrect — stopping at high-level causes without decomposition:
 USER CANNOT AUTHENTICATE
             |
           [OR]
        +----+----+
        |         |
  Auth Service Account
      Down     Locked
Correct - decompose to basic events with AND/OR gates:
                USER CANNOT
                AUTHENTICATE
                     |
                   [OR]
        +------------+------------+
        |            |            |
     Invalid   Auth Service    Account
   Credentials     Down        Locked
        |            |
      [OR]         [OR]
    +---+---+    +---+---+
    |   |   |    |   |   |
  Wrong Expired Token  DB  Redis External
  Pass  Token  Invalid Down Down  Auth
Minimal Cut Sets identified:
{Wrong Password}, {Expired Token}, {DB Down}, {Account Locked}
→ All single-event cuts = no redundancy, needs defense-in-depth
Key Rules
- Start from the top event (failure) and work downward
- Every gate must be classified as AND (all required) or OR (any sufficient)
- Decompose until reaching basic events (actionable root causes)
- Identify minimal cut sets to find the most vulnerable paths
- Single-event cut sets indicate missing redundancy
- Use for critical systems where exhaustive analysis is justified
Analyze multi-factor problems with fishbone diagrams to avoid single-cause fixation — MEDIUM
Fishbone Diagram (Ishikawa)
Visualize multiple potential causes organized by category. Best for problems with several contributing factors.
Software-Specific Categories
                   +-------------+
         Code -----+             |
                   |             |
Infrastructure ----+             +---- BUG/INCIDENT
                   |             |
  Dependencies ----+             |
                   |             |
  Configuration ---+             |
                   |             |
       Process ----+             |
                   |             |
       People -----+             |
                   +-------------+
Example: API Latency Spike
| Category | Potential Causes |
|---|---|
| Code | N+1 query, missing index, sync blocking call |
| Infrastructure | DB connection pool exhausted, network saturation, insufficient RAM |
| Dependencies | External API slow, Redis timeout, CDN cache miss |
| Configuration | Wrong pool size, missing timeout, debug logging on |
| Process | No load testing, no perf regression CI |
| People | Unfamiliarity with query optimizer, missing review |
Fishbone Process
- Define the problem clearly (the fish head)
- Identify major categories (the bones) — use software categories above
- Brainstorm causes for each category
- Analyze relationships between causes across categories
- Prioritize most likely root causes by evidence
- Verify with data, metrics, or targeted testing
- Take action on confirmed causes
When to Use Fishbone
| Scenario | Use Fishbone? |
|---|---|
| Multiple things went wrong | Yes |
| Problem has one clear cause | No — use 5 Whys |
| Team brainstorming session | Yes |
| Safety-critical failure analysis | No — use Fault Tree |
| Recurring issue with no clear pattern | Yes |
Incorrect — jumping to one cause without category analysis:
### API Latency Spike Analysis
**Root Cause:** N+1 query in user endpoint
**Fix:** Add query optimization
Correct - fishbone analysis across all categories:
### API Latency Spike — Fishbone Analysis
**Code:**
- N+1 query in user endpoint (CONFIRMED via query log)
- Sync blocking call to external API
**Infrastructure:**
- DB connection pool exhausted (CONFIRMED: 0 available connections)
- Network saturation (ruled out: 20% utilization)
**Dependencies:**
- Redis timeout increased (ruled out: within SLA)
**Configuration:**
- Connection pool size too small (CONFIRMED: 10 max, need 50)
**Process:**
- No load testing in CI (process gap)
**Root Causes (cross-category):**
1. N+1 query (Code) + small pool (Config) = exhaustion
2. Missing load tests (Process) = undetected before prod
**Actions:**
- Fix N+1 query immediately
- Increase pool size 10 → 50
- Add load tests to CI
Key Rules
- Use software-specific categories (Code, Infrastructure, Dependencies, Configuration, Process, People)
- Brainstorm causes per category before analyzing relationships
- Look for cross-category interactions (e.g., code + config)
- Prioritize by evidence, not by assumption
- Verify top candidates with data or experiments before committing to a fix
Apply the 5 Whys technique to reach root causes instead of fixing symptoms — HIGH
5 Whys Technique
Iteratively ask "why" to drill down from symptom to root cause. Simple, fast, and effective for linear causal chains.
Process
Problem Statement: [Clear description of the issue]
        |
        v
Why #1: [First level cause]
        |
        v
Why #2: [Deeper cause]
        |
        v
Why #3: [Even deeper]
        |
        v
Why #4: [Getting to root]
        |
        v
Why #5: [Root cause identified]
        |
        v
Action: [Fix that addresses root cause]
Example: Production Outage
**Problem:** Website was down for 2 hours
**Why 1:** The application server ran out of memory and crashed.
**Why 2:** A memory leak in the image processing service accumulated over time.
**Why 3:** The service wasn't releasing image buffers after processing.
**Why 4:** The cleanup code had a bug introduced in last week's release.
**Why 5:** We don't have automated memory leak detection in our test suite.
**Root Cause:** Missing automated memory leak testing
**Action:** Add memory profiling to CI pipeline, add cleanup tests
Best Practices
| Do | Don't |
|---|---|
| Base answers on evidence | Guess or assume |
| Stay focused on one causal chain | Branch too early |
| Keep asking until actionable | Stop at symptoms |
| Involve people closest to issue | Assign blame |
| Document your reasoning | Skip steps |
When 5 Whys Falls Short
- Multiple contributing factors — use Fishbone diagram instead
- Complex system interactions — use Fault Tree Analysis
- Organizational/process issues — needs broader systemic analysis
- Concurrent failures — 5 Whys assumes linear causation
Incorrect — stopping at symptom without root cause:
**Problem:** Website was down for 2 hours
**Why 1:** The application server crashed.
**Action:** Restart the server
Correct - drilling down to root cause with 5 Whys:
**Problem:** Website was down for 2 hours
**Why 1:** The application server ran out of memory and crashed.
Evidence: Out-of-memory error in logs
**Why 2:** A memory leak in the image processing service accumulated over time.
Evidence: Memory usage increased 2GB/hour in metrics
**Why 3:** The service wasn't releasing image buffers after processing.
Evidence: Code review shows missing .dispose() calls
**Why 4:** The cleanup code had a bug introduced in last week's release.
Evidence: Git blame + diff shows removal of cleanup in PR #234
**Why 5:** We don't have automated memory leak detection in our test suite.
Evidence: No memory profiling in CI pipeline
**Root Cause:** Missing automated memory leak testing
**Actions:**
- Add memory profiling to CI pipeline
- Add cleanup tests for image processing
- Revert PR #234's cleanup removal
Key Rules
- Always start with a clear, specific problem statement
- Each "why" must be supported by evidence (logs, metrics, code)
- Stop when you reach an actionable root cause (not always exactly 5)
- The fix should address the root cause, not the symptom
- Document the full chain for knowledge sharing
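The evidence rule can be checked mechanically before a chain is accepted. The sketch below is one possible shape, not the skill's own data model: `Why` and `unsupported_steps` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Why:
    cause: str
    evidence: str = ""  # log line, metric, or code reference backing the cause


def unsupported_steps(chain: list[Why]) -> list[int]:
    """Return 1-based positions of 'why' steps that have no evidence attached."""
    return [i for i, step in enumerate(chain, start=1) if not step.evidence.strip()]


chain = [
    Why("Application server ran out of memory", "Out-of-memory error in logs"),
    Why("Memory leak in image processing service"),  # no evidence yet
    Why("Image buffers not released", "Missing .dispose() calls in code review"),
]
```

Here `unsupported_steps(chain)` flags step 2, which signals that the chain still rests on an assumption at that level.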
References (7)
Agent Selection
Agent Selection & Orchestration Mode
Orchestration Mode Selection
Choose Agent Teams (mesh -- RCA agents share hypotheses) or Task tool (star -- all report to lead):
- CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1 -> Agent Teams mode
- Agent Teams unavailable -> Task tool mode (default)
- Otherwise: Complex cross-cutting bugs (backend + frontend + tests involved) -> recommend Agent Teams; Focused bugs (single domain) -> Task tool
| Aspect | Task Tool | Agent Teams |
|---|---|---|
| Hypothesis sharing | Lead relays between agents | Investigators share hypotheses in real-time |
| Conflicting evidence | Lead resolves | Investigators debate directly |
| Cost | ~250K tokens | ~600K tokens |
| Best for | Single-domain bugs | Cross-cutting bugs with multiple hypotheses |
Fallback: If Agent Teams encounters issues, fall back to Task tool for remaining investigation.
RCA Agent Roster (Phase 4)
Launch ALL 5 agents in parallel with run_in_background=True and max_turns=25:
| # | Agent | Role |
|---|---|---|
| 1 | debug-investigator | Root cause tracing |
| 2 | debug-investigator | Impact analysis |
| 3 | backend-system-architect | Backend fix design |
| 4 | frontend-ui-developer | Frontend fix design |
| 5 | test-generator | Test requirements |
Each agent outputs structured JSON with findings and SUMMARY line.
Task Management (CC 2.1.16)
# Create main fix task
TaskCreate(
    subject="Fix issue #{number}",
    description="Systematic issue resolution with hypothesis-based RCA",
    activeForm="Fixing issue #{number}"
)
# Create subtasks for 11-phase process
phases = ["Understand issue", "Search similar issues", "Form hypotheses",
          "Analyze root cause", "Design fix", "Implement fix", "Validate fix",
          "Generate prevention", "Create runbook", "Capture lessons", "Commit and PR"]
for phase in phases:
    TaskCreate(subject=phase, activeForm=f"{phase}ing")
Agent Teams RCA
Agent Teams RCA Workflow
In Agent Teams mode, form an investigation team where RCA agents share hypotheses and evidence in real-time:
TeamCreate(team_name="fix-issue-{number}", description="RCA for issue #{number}")
Task(subagent_type="debug-investigator", name="root-cause-tracer",
team_name="fix-issue-{number}",
prompt="""Trace the root cause for issue #{number}: {issue description}
Hypotheses: {hypothesis list from Phase 3}
Test each hypothesis. When you find evidence supporting or refuting a hypothesis,
message impact-analyst and the relevant domain expert (backend-expert or frontend-expert).
If you find conflicting evidence, share it with ALL teammates for debate.""")
Task(subagent_type="debug-investigator", name="impact-analyst",
team_name="fix-issue-{number}",
prompt="""Analyze the impact and blast radius for issue #{number}.
When root-cause-tracer shares evidence, assess how many code paths are affected.
Message test-planner with affected paths so they can plan regression tests.
If the impact is larger than expected, message the lead immediately.""")
Task(subagent_type="backend-system-architect", name="backend-expert",
team_name="fix-issue-{number}",
prompt="""Investigate backend aspects of issue #{number}.
When root-cause-tracer shares backend-related hypotheses, design the fix approach.
Message frontend-expert if the fix affects API contracts.
Share fix design with test-planner for test requirements.""")
Task(subagent_type="frontend-ui-developer", name="frontend-expert",
team_name="fix-issue-{number}",
prompt="""Investigate frontend aspects of issue #{number}.
When root-cause-tracer shares frontend-related hypotheses, design the fix approach.
If backend-expert changes API contracts, adapt the frontend fix accordingly.
Share component changes with test-planner.""")
Task(subagent_type="test-generator", name="test-planner",
team_name="fix-issue-{number}",
prompt="""Plan regression tests for issue #{number}.
When root-cause-tracer confirms the root cause, write a failing test that reproduces it.
When backend-expert or frontend-expert share fix designs, plan verification tests.
Start with the regression test BEFORE the fix is applied (TDD approach).""")
Team teardown after fix is implemented and validated:
SendMessage(type="shutdown_request", recipient="root-cause-tracer", content="Fix validated")
SendMessage(type="shutdown_request", recipient="impact-analyst", content="Fix validated")
SendMessage(type="shutdown_request", recipient="backend-expert", content="Fix validated")
SendMessage(type="shutdown_request", recipient="frontend-expert", content="Fix validated")
SendMessage(type="shutdown_request", recipient="test-planner", content="Fix validated")
TeamDelete()
Fallback: If team formation fails, use standard Phase 4 Task spawns.
CC Enhancements
CC 2.1.27+ Enhancements for Fix Issue
Session Resume with PR Context
When you create a PR for the fix, the session is automatically linked:
# Later: Resume with full PR context
claude --from-pr 789
Task Metrics (CC 2.1.30)
Track RCA efficiency across the 5 parallel agents:
## Phase 4 Metrics (Root Cause Analysis)
| Agent | Tokens | Tools | Duration |
|-------|--------|-------|----------|
| debug-investigator #1 | 520 | 12 | 18s |
| debug-investigator #2 | 480 | 10 | 15s |
| backend-system-architect | 390 | 8 | 12s |
**Root cause found in:** 45s total
Tool Guidance (CC 2.1.31)
When investigating root cause:
| Task | Use | Avoid |
|---|---|---|
| Read logs/files | Read(file_path=...) | bash cat |
| Search for errors | Grep(pattern="ERROR") | bash grep |
| Find affected files | Glob(pattern="**/*.py") | bash find |
| Check git history | Bash git log/diff | (git needs bash) |
Session Resume Hints (CC 2.1.31)
Before ending fix sessions, capture investigation context:
/ork:remember Issue #$ARGUMENTS RCA findings:
Root cause: [one line]
Confirmed by: [key evidence]
Fix status: [implemented/pending]
Prevention: [recommendation]
Resume later:
claude # Shows resume hint
/ork:memory search "issue $ARGUMENTS" # Loads your findings
Fix Phases
Fix Issue: 11-Phase Workflow
Detailed procedures for each phase of the fix-issue workflow.
Phase 1: Understand the Issue
gh issue view $ARGUMENTS --json title,body,labels,assignees,comments
gh pr list --search "issue:$ARGUMENTS"
gh issue view $ARGUMENTS --comments
Start Work ceremony (from issue-progress-tracking): move issue to in-progress, comment on issue, ensure branch is named issue/N-description.
Phase 2: Similar Issue Detection
See Similar Issue Search for patterns.
gh issue list --search "[key error message]" --state all
mcp__memory__search_nodes(query="issue [error type] fix")
| Similar Issue | Similarity | Status | Relevant? |
|---|---|---|---|
| #101 | 85% | Closed | Yes |
Determine: Regression? Variant? New issue?
Phase 3: Hypothesis Formation
See Hypothesis-Based RCA for confidence scoring.
## Hypothesis 1: [Brief name]
**Confidence:** [0-100]%
**Description:** [What might cause the issue]
**Test:** [How to verify]
| Confidence | Meaning |
|---|---|
| 90-100% | Near certain |
| 70-89% | Highly likely |
| 50-69% | Probable |
| 30-49% | Possible |
| 0-29% | Unlikely |
Phase 4: Root Cause Analysis (5 Agents)
Launch ALL 5 agents in parallel with run_in_background=True and max_turns=25:
- debug-investigator: Root cause tracing
- debug-investigator: Impact analysis
- backend-system-architect: Backend fix design
- frontend-ui-developer: Frontend fix design
- test-generator: Test requirements
Each agent outputs structured JSON with findings and SUMMARY line.
Agent Teams Alternative
See agent-teams-rca.md for Agent Teams root cause analysis workflow.
Phase 5: Fix Design
## Fix Design for Issue #$ARGUMENTS
### Root Cause (Confirmed)
[Description]
### Proposed Fix
[Approach]
### Files to Modify
| File | Change | Reason |
|------|--------|--------|
| [file] | MODIFY | [why] |
### Risks
- [Risk 1]
### Rollback Plan
[How to revert]
Phase 6: Implementation
CRITICAL: Feature Branch Required
NEVER commit directly to main or dev. Always create a feature branch:
# Determine base branch
BASE_BRANCH=$(git remote show origin | grep 'HEAD branch' | cut -d: -f2 | tr -d ' ')
# Create feature branch (MANDATORY)
git checkout $BASE_BRANCH && git pull origin $BASE_BRANCH
git checkout -b issue/$ARGUMENTS-fix
CRITICAL: Regression Test Required
A fix without a test is incomplete. Add test BEFORE implementing fix:
# 1. Write test that reproduces the bug (should FAIL)
# 2. Implement the fix
# 3. Verify test now PASSES
Guidelines:
- Make minimal, focused changes
- Add proper error handling
- Add regression test FIRST (MANDATORY)
- DO NOT over-engineer
- DO NOT commit directly to protected branches
Phase 7: Validation
# Backend
poetry run ruff format --check app/
poetry run pytest tests/unit/ -v --tb=short
# Frontend
npm run lint && npm run typecheck && npm run test
Phase 8: Prevention Recommendations
CRITICAL: Prevention must include at least one of:
- Automated test - CI catches similar issues (PREFERRED)
- Validation rule - Schema/lint rule prevents bad state
- Process check - Review checklist item
See Prevention Patterns for full template.
| Category | Examples | Effectiveness |
|---|---|---|
| Automated test | Unit/integration test in CI | HIGH - catches before merge |
| Validation rule | Schema check, lint rule | HIGH - catches on save/commit |
| Architecture | Better error boundaries | MEDIUM |
| Process | Review checklist item | LOW - human-dependent |
Phase 9: Runbook Generation
# Runbook: [Issue Type]
## Symptoms
- [Observable symptom]
## Diagnosis Steps
1. Check [X] by running: `[command]`
## Resolution Steps
1. [Step 1]
## Prevention
- [How to prevent]
Store in memory for future reference.
Phase 10: Lessons Learned
mcp__memory__create_entities(entities=[{
"name": "lessons-issue-$ARGUMENTS",
"entityType": "LessonsLearned",
"observations": [
"root_cause: [brief]",
"key_learning: [most important]",
"prevention: [recommendation]"
]
}])
Phase 11: Commit and PR
git add .
git commit -m "fix(#$ARGUMENTS): [Brief description]
Root cause: [one line]
Prevention: [recommendation]"
git push -u origin issue/$ARGUMENTS-fix
gh pr create --base dev --title "fix(#$ARGUMENTS): [description]"
Hypothesis RCA
Hypothesis-Based Root Cause Analysis
Scientific method for identifying root causes with quantified confidence.
The Scientific Method for RCA
1. Observe symptoms
2. Form hypotheses
3. Gather evidence
4. Test hypotheses
5. Confirm or reject
6. Repeat until root cause found
Hypothesis Template
## Hypothesis: [Brief name]
**Confidence:** [0-100]%
**Description:**
[What might be causing the issue]
**Evidence For:**
- [Supporting evidence 1]
- [Supporting evidence 2]
**Evidence Against:**
- [Contradicting evidence 1]
**Test Plan:**
1. [Step to verify/refute]
2. [Expected outcome if true]
Confidence Score Guidelines
| Score | Meaning | Evidence Required |
|---|---|---|
| 90-100% | Near certain | Reproduction + multiple strong evidence |
| 70-89% | Highly likely | Clear evidence, logical chain |
| 50-69% | Probable | Some evidence, plausible mechanism |
| 30-49% | Possible | Limited evidence, needs investigation |
| 0-29% | Unlikely | Weak evidence, backup hypothesis |
Evidence Classification
| Type | Weight | Examples |
|---|---|---|
| Reproduction | +30% | Consistent reproduction steps |
| Code trace | +20% | Stack trace to specific line |
| Timing correlation | +15% | Issue appeared after deployment X |
| Log evidence | +15% | Error messages match hypothesis |
| Similar patterns | +10% | Same error in related code |
| User report | +5% | Consistent user descriptions |
Contradicting Evidence
| Evidence | Weight |
|---|---|
| Hypothesis disproven by test | -40% |
| Works in same conditions | -25% |
| Unrelated timing | -15% |
| No supporting logs | -10% |
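One way to apply these weights is to add them to the current score and clamp the result to the 0-100 range. That additive, clamped rule is an assumption: the tables give the weights but do not say how they combine. The dictionary keys and `updated_confidence` are hypothetical names.

```python
# Weights copied from the two tables above (percentage points).
SUPPORTING = {
    "reproduction": 30,
    "code_trace": 20,
    "timing_correlation": 15,
    "log_evidence": 15,
    "similar_patterns": 10,
    "user_report": 5,
}
CONTRADICTING = {
    "disproven_by_test": -40,
    "works_in_same_conditions": -25,
    "unrelated_timing": -15,
    "no_supporting_logs": -10,
}
WEIGHTS = {**SUPPORTING, **CONTRADICTING}


def updated_confidence(initial: int, evidence: list[str]) -> int:
    """Assumed rule: sum the weights of new evidence, then clamp to 0-100."""
    total = initial + sum(WEIGHTS[item] for item in evidence)
    return max(0, min(100, total))
```

Under this assumption, a hypothesis at 65% that gains a code trace moves to 85%, and one at 40% that turns out to work under the same conditions drops to 15%.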
Multiple Hypothesis Comparison
| Hypothesis | Initial | After Test | Status |
|------------|---------|------------|--------|
| Race condition | 65% | 85% | INVESTIGATING |
| Null reference | 40% | 15% | REJECTED |
| Cache stale | 30% | 30% | ON HOLD |
Best Practices
- Start with 3+ hypotheses - Avoid tunnel vision
- Test highest confidence first - Efficient investigation
- Update scores after each test - Track progress
- Document rejected hypotheses - Prevent repeated investigation
- Look for evidence against - Avoid confirmation bias
Prevention Patterns
Strategies to prevent issue recurrence by category.
Code-Level Prevention
| Issue Type | Prevention Pattern |
|---|---|
| Null/undefined | Optional chaining, nullish coalescing |
| Type errors | Strict TypeScript, runtime validation |
| Input validation | Zod schemas at boundaries |
| Error handling | Result types, explicit error states |
| Race conditions | Locks, atomic operations, idempotency |
| Memory leaks | Cleanup in useEffect, WeakRef |
// Before: Vulnerable
const name = user.profile.name;
// After: Defensive
const name = user?.profile?.name ?? 'Unknown';
Architecture-Level Prevention
| Issue Type | Prevention Pattern |
|---|---|
| Cascading failures | Circuit breakers |
| Network instability | Retry with backoff |
| Data inconsistency | Transactions, saga pattern |
| Timeout issues | Request deadlines, cancellation |
| Resource exhaustion | Rate limiting, pooling |
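The circuit breaker row can be sketched as a small decorator. This is a minimal illustration under stated assumptions, not the implementation behind any particular library: it reuses the `failure_threshold` and `recovery_timeout` parameter names that appear in this skill's example, and `CircuitOpenError` is a hypothetical exception introduced here.

```python
import functools
import time


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the circuit is open."""


def circuit_breaker(failure_threshold: int = 5, recovery_timeout: float = 30.0):
    """Decorator sketch: stop calling a failing async function for a while."""
    def decorator(func):
        failures = 0
        opened_at = None  # monotonic timestamp of when the circuit opened

        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            nonlocal failures, opened_at
            if opened_at is not None:
                if time.monotonic() - opened_at < recovery_timeout:
                    raise CircuitOpenError(f"{func.__name__} circuit is open")
                opened_at = None  # timeout elapsed: allow one trial call
            try:
                result = await func(*args, **kwargs)
            except Exception:
                failures += 1
                if failures >= failure_threshold:
                    opened_at = time.monotonic()  # open (or re-open) the circuit
                raise
            failures = 0  # any success closes the circuit again
            return result

        return wrapper

    return decorator
```

While the circuit is open, callers fail fast with `CircuitOpenError` instead of waiting on a dependency that is already known to be failing, which is what limits cascading failures.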
# Circuit breaker example
@circuit_breaker(failure_threshold=5, recovery_timeout=30)
async def external_api_call():
    ...
Process-Level Prevention
| Issue Type | Prevention Pattern |
|---|---|
| Logic errors | Mandatory PR review |
| Missing tests | Coverage requirements (>80%) |
| Regression | Required regression test before fix |
| Knowledge gaps | ADR for decisions |
| Onboarding issues | Runbook documentation |
Tooling-Level Prevention
| Issue Type | Prevention Pattern |
|---|---|
| Style issues | ESLint/Ruff rules |
| Type errors | Pre-commit type check |
| Security vulnerabilities | Dependency scanning in CI |
| Format inconsistency | Auto-format on save |
| Secrets in code | Pre-commit secret detection |
# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: type-check
        name: TypeScript check
        entry: npx tsc --noEmit
        language: system
        # Run against the whole project rather than the staged file list
        pass_filenames: false
Prevention Priority Matrix
| Effort | Impact | Priority |
|---|---|---|
| Low | High | Immediate |
| Low | Low | Backlog |
| High | High | Sprint planning |
| High | Low | Skip |
Similar Issue Search
Find related past issues to leverage previous solutions and detect regressions.
GitHub Issue Search Patterns
# Search by error message
gh issue list --search "TypeError: Cannot read property" --state all
# Search by component/file
gh issue list --search "UserService" --state all --json number,title,state
# Search by label
gh issue list --label "bug" --state closed --limit 20
# Combined search
gh issue list --search "auth login 401" --state all --json number,title,closedAt
Memory/Knowledge Graph Queries
# Search for past fixes
mcp__memory__search_nodes(query="fix authentication error")
# Search by error type
mcp__memory__search_nodes(query="TypeError resolution")
# Search by component
mcp__memory__search_nodes(query="UserService bug")
Stack Trace Similarity Matching
Match by:
- Exception type - Same error class
- File/line - Same code location
- Call stack depth - Similar execution path
- Error message pattern - Regex match on message
Similarity Assessment Criteria
| Factor | Weight | High Match |
|---|---|---|
| Same exception type | 30% | Exact match |
| Same file | 25% | Same file involved |
| Similar error message | 20% | >80% string similarity |
| Same component | 15% | Same service/module |
| Recent (< 30 days) | 10% | Recently resolved |
When to Reuse vs Investigate Fresh
Reuse Previous Solution When:
- Similarity > 80%
- Same root cause confirmed
- Fix is still applicable
- No code changes since fix
Investigate Fresh When:
- Similarity < 60%
- Context has changed significantly
- Previous fix may be incomplete
- New dependencies involved
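The weighted criteria and the two thresholds can be combined into a quick score. The sketch below is illustrative: the factor keys and function names are hypothetical, each factor is treated as a simple yes/no match, and the 60-80% band is reported as a judgment call because the text does not define it.

```python
# Weights from the Similarity Assessment Criteria table (they sum to 100).
WEIGHTS = {
    "same_exception_type": 30,
    "same_file": 25,
    "similar_error_message": 20,
    "same_component": 15,
    "recent": 10,
}


def similarity_score(matches: dict[str, bool]) -> int:
    """Sum the weights of every factor that matched."""
    return sum(w for factor, w in WEIGHTS.items() if matches.get(factor, False))


def recommendation(score: int) -> str:
    """Apply the thresholds above; the middle band is left to the investigator."""
    if score > 80:
        # Still requires the other reuse conditions (root cause confirmed, etc.).
        return "reuse candidate"
    if score < 60:
        return "investigate fresh"
    return "judgment call"
```

A score above 80 only makes a past fix a candidate for reuse; the remaining conditions in the list, such as a confirmed root cause and no code changes since the fix, still need to be checked by hand.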
Issue Classification
| Type | Action |
|---|---|
| Regression | Same issue, fix reverted or bypassed |
| Variant | Similar pattern, different trigger |
| New | No similar issues found |
Checklists (1)
Fix Complete Checklist
Verify all aspects of issue resolution before closing.
Root Cause Analysis
- Root cause identified with confidence >= 70%
- Hypotheses documented (at least 2 considered)
- Evidence for/against documented
- Similar issues checked
Fix Verification
- Regression test added
- All existing tests pass
- Fix manually verified
- Edge cases covered
Prevention
- Prevention recommendation documented
- At least one prevention measure implemented or ticketed
- Runbook entry created/updated
Knowledge Capture
- Lessons learned stored in memory
- RCA report generated (for high/critical issues)
- Related issues linked
PR/Commit
- Commit message includes issue number
- Commit message describes root cause
- PR links to issue with "Fixes #N"
Final Verification
# Quick verification commands
git log -1 --oneline # Check commit message
gh pr checks # Check CI status
gh issue view [N] # Verify issue linked