Review a Pull Request
AI-powered code review with 6 parallel specialized agents that catch security, performance, and quality issues.
The /ork:review-pr command runs a multi-agent code review against any pull request. Six specialized agents analyze the diff in parallel, each focused on a different dimension of code quality. The results are synthesized into a single, actionable PR comment. This cookbook walks through reviewing a payment processing PR.
Scenario
A teammate opens PR #42 that adds a Stripe payment processing endpoint to your FastAPI backend. The PR is 420 lines across 9 files: a new router, service layer, webhook handler, database migration, and tests. You want a thorough review before merging code that handles real money.
What You'll Use
| Component | Type | Role |
|---|---|---|
| /ork:review-pr | Command skill | Orchestrates the multi-agent review |
| code-quality-reviewer | Agent | Style, patterns, complexity |
| security-auditor | Agent | Injection, XSS, secrets, OWASP |
| test-generator | Agent | Missing test coverage |
| performance-engineer | Agent | N+1 queries, bundle size, latency |
| backend-system-architect | Agent | API design, error handling, contracts |
| accessibility-specialist | Agent | A11y issues (if frontend changes present) |
| pr-size-warning | Hook | Detects large PRs and warns about review difficulty |
| security-command-audit | Hook | Logs security findings |
| skill-suggester | Hook | Injects payment and Stripe reference skills |
Step 1: Start the Review
/ork:review-pr 42
OrchestKit fetches the PR metadata from GitHub using gh pr view 42 and asks for your review focus:
PR #42: "Add Stripe payment processing endpoint"
Author: @teammate | +312 / -108 | 9 files | base: main
Review focus:
[1] Full — All 6 agents: security, quality, tests, performance, design, a11y
[2] Security — Deep security-only scan (2 agents)
[3] Performance — Latency, queries, resource usage
[4] Quick — Quality + tests only (fast, 2 agents)
> 1
Selecting Full launches all six agents against the PR diff.
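The metadata fetch at the start of this step could look roughly like the following. This is a minimal sketch, not OrchestKit's actual implementation: it assumes the GitHub CLI is installed and authenticated, and `fetch_pr_metadata` and `summarize` are hypothetical helper names introduced for illustration.

```python
import json
import subprocess

# JSON fields requested from the GitHub CLI.
PR_FIELDS = "title,author,additions,deletions,changedFiles,baseRefName"


def fetch_pr_metadata(pr_number: int) -> dict:
    """Run `gh pr view` and return the parsed JSON (needs an authenticated gh CLI)."""
    result = subprocess.run(
        ["gh", "pr", "view", str(pr_number), "--json", PR_FIELDS],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)


def summarize(pr_number: int, meta: dict) -> str:
    """Format the metadata like the two header lines of the review prompt."""
    return (
        f'PR #{pr_number}: "{meta["title"]}"\n'
        f'Author: @{meta["author"]["login"]} | '
        f'+{meta["additions"]} / -{meta["deletions"]} | '
        f'{meta["changedFiles"]} files | base: {meta["baseRefName"]}'
    )
```

Calling `summarize(42, fetch_pr_metadata(42))` inside a checked-out repository would produce a header in the same shape as the prompt shown in this step.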
Step 2: Six Parallel Agents Analyze the Diff
/ork:review-pr 42 (Full)
|
| Fetches diff: gh pr diff 42
| Fetches files: 9 changed files
|
|---> code-quality-reviewer ----------------------------+
| Skills: code-review-playbook, clean-architecture |
| Focus: naming, complexity, DRY, patterns |
| |
|---> security-auditor ---------------------------------+|
| Skills: owasp-top-10, defense-in-depth, ||
| input-validation, security-scanning ||
| Focus: injection, auth bypass, exposed secrets ||
| ||
|---> test-generator -----------------------------------+||
| Skills: pytest-advanced, integration-testing |||
| Focus: missing coverage, edge cases, mocks |||
| |||
|---> performance-engineer ----------------------------+|||
| Skills: performance-optimization, caching-strategies ||||
| Focus: N+1 queries, connection pooling, latency ||||
| ||||
|---> backend-system-architect -----------------------+|||||
| Skills: api-design-framework, ||||||
| error-handling-rfc9457 ||||||
| Focus: API contracts, error responses ||||||
| ||||||
+---> accessibility-specialist ---------------------+|||||||
Skills: wcag-patterns, aria-guidelines ||||||||
Focus: a11y (skipped -- no frontend files) ||||||||
||||||||
<----------- Results synthesized <------------------++++++++
Each agent receives its share of the diff plus relevant skill context. The accessibility-specialist detects that all 9 changed files are backend Python files, reports "no frontend changes detected", and completes in under a second. The remaining five agents work in parallel.
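The fan-out can be pictured as a set of concurrent tasks whose results are collected in one place. The sketch below uses `asyncio.gather` to illustrate the pattern only. `run_agent` is a hypothetical stand-in for an agent's work, and the frontend file extensions are assumptions rather than OrchestKit's real detection rules.

```python
import asyncio

AGENTS = [
    "code-quality-reviewer",
    "security-auditor",
    "test-generator",
    "performance-engineer",
    "backend-system-architect",
    "accessibility-specialist",
]

# Assumed set of extensions that count as "frontend" for this sketch.
FRONTEND_SUFFIXES = (".tsx", ".jsx", ".html", ".css")


async def run_agent(name: str, files: list[str]) -> dict:
    """Hypothetical stand-in for one agent's review of the changed files."""
    if name == "accessibility-specialist" and not any(
        f.endswith(FRONTEND_SUFFIXES) for f in files
    ):
        return {"agent": name, "status": "skipped",
                "note": "no frontend changes detected"}
    await asyncio.sleep(0)  # placeholder for the real analysis
    return {"agent": name, "status": "done"}


async def review(files: list[str]) -> list[dict]:
    # gather() runs every agent concurrently and keeps results in input order.
    return await asyncio.gather(*(run_agent(a, files) for a in AGENTS))


results = asyncio.run(
    review(["app/payments/webhook.py", "app/payments/service.py"])
)
```

Because `gather` preserves order, the synthesis step can pair each result with its agent without extra bookkeeping.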
The pr-size-warning hook fires immediately when it counts 420 changed lines. It injects a note into the review context: "Large PR (420 lines). Consider whether this should have been split into smaller PRs." This note appears in the final review output.
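A size check of this kind can be expressed in a few lines. This is a sketch under stated assumptions: the threshold of 400 lines is invented for illustration (the hook's real threshold is not given here), and `pr_size_warning` is a hypothetical function name.

```python
from typing import Optional

# Assumed threshold; the hook's actual limit is not documented in this cookbook.
SIZE_THRESHOLD = 400


def pr_size_warning(additions: int, deletions: int) -> Optional[str]:
    """Return a warning note for large PRs, or None when the PR is small."""
    changed = additions + deletions
    if changed < SIZE_THRESHOLD:
        return None
    return (
        f"Large PR ({changed} lines). Consider whether this should "
        "have been split into smaller PRs."
    )
```

For PR #42, `pr_size_warning(312, 108)` counts 420 changed lines and returns the note quoted above.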
Step 3: Review Results
After all agents complete (typically 30-90 seconds for a full review), OrchestKit synthesizes their findings into a structured report:
PR #42 Review — "Add Stripe payment processing endpoint"
=====================================================
Quality: 8/10
Security: 6/10 <- 1 P0, 2 P1 findings
Tests: 7/10 3 missing test cases identified
Performance: 9/10 No N+1 queries detected
Design: 8/10 1 API contract suggestion
A11y: -- No frontend changes
------------------------------------------------------
SECURITY (security-auditor)
[P0] Webhook signature not verified
File: app/payments/webhook.py:34
The Stripe webhook handler processes events without
verifying the signature header. An attacker could forge
webhook payloads to credit accounts.
Fix: Use stripe.Webhook.construct_event() with your
webhook signing secret.
[P1] Stripe secret key in default parameter
File: app/payments/config.py:12
STRIPE_SECRET_KEY has a default value of "sk_test_..."
in the config. This will leak to version control.
Fix: Remove the default. Require it via env variable
with no fallback.
[P1] No idempotency key on charge creation
File: app/payments/service.py:67
stripe.PaymentIntent.create() is called without an
idempotency_key. Network retries could create duplicate
charges.
Fix: Accept an idempotency key from the client or
generate one from the order ID.
------------------------------------------------------
QUALITY (code-quality-reviewer)
[Suggestion] Extract Stripe client initialization
File: app/payments/service.py:15-22
Stripe is initialized inline in three methods. Extract
to a shared dependency or lifespan handler.
[Suggestion] PaymentService has 8 methods (complexity: medium)
Consider splitting webhook handling into a separate
WebhookService class.
------------------------------------------------------
TESTS (test-generator)
Missing test cases:
1. Webhook with invalid signature -> should return 400
2. Duplicate payment intent (idempotency) -> should not double-charge
3. Stripe API timeout -> should return 502 with retry-after
Coverage estimate: 74% (target: 85%)
------------------------------------------------------
DESIGN (backend-system-architect)
[Suggestion] POST /payments/charge returns 200 on success
Recommend 201 Created with a Location header pointing
to the payment resource: /payments/{payment_id}
------------------------------------------------------
PR SIZE WARNING: 420 lines across 9 files. Consider splitting
webhook handling into a separate PR for easier review.
Step 4: Post to GitHub
OrchestKit asks whether to post the review as a PR comment:
Post this review as a comment on PR #42? [Y/n]
> Y
Review posted to PR #42
https://github.com/your-org/your-repo/pull/42#issuecomment-1234567
The comment uses GitHub markdown with collapsible sections for each category, so it does not overwhelm the PR conversation. Security P0 findings are always expanded and highlighted.
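Collapsible sections in a GitHub comment are usually built with HTML `<details>` and `<summary>` elements, and the `open` attribute keeps a section expanded. The following is a minimal sketch of how such a section might be rendered. `render_section` is a hypothetical helper, and the exact markup OrchestKit emits may differ.

```python
def render_section(title: str, findings: list[str], expanded: bool = False) -> str:
    """Render one review category as a collapsible GitHub markdown section."""
    open_attr = " open" if expanded else ""
    body = "\n".join(f"- {finding}" for finding in findings)
    # Blank lines around the body let GitHub render the markdown list inside.
    return (
        f"<details{open_attr}>\n"
        f"<summary>{title}</summary>\n\n"
        f"{body}\n\n"
        "</details>"
    )


security = render_section(
    "Security", ["[P0] Webhook signature not verified"], expanded=True
)
quality = render_section("Quality", ["Extract Stripe client initialization"])
```

Here the security section is rendered expanded because it contains a P0 finding, while the quality section stays collapsed.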
P0 security findings block the review with a "Changes Requested" status. P1 findings are flagged but do not block. P2 findings are informational suggestions. The webhook signature vulnerability in this example is a P0 -- the PR should not merge until it is fixed.
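To make the P0 finding concrete, the sketch below shows the kind of check a webhook handler needs before trusting a payload. It is a standard-library illustration based on Stripe's publicly documented scheme (an HMAC-SHA256 over `<timestamp>.<payload>`, sent in a `t=...,v1=...` header). The function name, the 300-second tolerance default, and the `now` parameter are assumptions of this sketch. In a real handler, the library call recommended in the review, stripe.Webhook.construct_event(), is the safer choice.

```python
import hashlib
import hmac
import time
from typing import Optional


def verify_stripe_signature(
    payload: bytes,
    header: str,
    secret: str,
    tolerance: int = 300,
    now: Optional[float] = None,
) -> bool:
    """Check a signature header of the form 't=<unix ts>,v1=<hex digest>'."""
    parts = [item.split("=", 1) for item in header.split(",") if "=" in item]
    timestamp = next((v for k, v in parts if k == "t"), None)
    candidates = [v for k, v in parts if k == "v1"]
    if timestamp is None or not timestamp.isdigit() or not candidates:
        return False

    # Reject events outside the tolerance window to limit replay attacks.
    current = time.time() if now is None else now
    if abs(current - int(timestamp)) > tolerance:
        return False

    signed_payload = timestamp.encode() + b"." + payload
    expected = hmac.new(secret.encode(), signed_payload, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking information through comparison timing.
    return any(
        hmac.compare_digest(expected.encode(), candidate.encode())
        for candidate in candidates
    )
```

A handler would return 400 when this check fails, which is also the first missing test case the test-generator identified.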
Behind the Scenes
How the Review Diff is Distributed
OrchestKit does not send the entire diff to every agent. It routes each changed file only to the agents whose focus covers it:
| Agent | Files Received | Rationale |
|---|---|---|
| security-auditor | All 9 files | Security must see everything |
| code-quality-reviewer | All 9 files | Style applies everywhere |
| test-generator | Test files + source files they test | Needs both to assess coverage |
| performance-engineer | Service + migration files | Where query and latency issues live |
| backend-system-architect | Router + schema files | API surface area |
| accessibility-specialist | Frontend files only | Skipped when none exist |
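One simple way to express routing like this is a table of glob patterns per agent. The sketch below is illustrative only: the patterns, the `route` function, and the example file names are assumptions, and OrchestKit's real routing rules are not shown in this cookbook.

```python
from fnmatch import fnmatchcase

# Hypothetical glob patterns per agent; "*" matches every changed file.
ROUTES = {
    "security-auditor": ["*"],
    "code-quality-reviewer": ["*"],
    "test-generator": ["tests/*", "app/*"],
    "performance-engineer": ["*service*", "*migration*"],
    "backend-system-architect": ["*router*", "*schema*"],
    "accessibility-specialist": ["*.tsx", "*.jsx", "*.html", "*.css"],
}


def route(files: list[str]) -> dict[str, list[str]]:
    """Map each agent to the changed files that match any of its patterns."""
    return {
        agent: [f for f in files if any(fnmatchcase(f, p) for p in patterns)]
        for agent, patterns in ROUTES.items()
    }


# Example file names are invented for illustration.
changed = [
    "app/payments/router.py",
    "app/payments/service.py",
    "app/payments/webhook.py",
    "migrations/001_payments.py",
    "tests/test_payments.py",
]
assignments = route(changed)
```

An agent whose list comes back empty, as the accessibility-specialist's does here, can be skipped without starting any analysis.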
Hooks That Fired
| Hook | When | What It Did |
|---|---|---|
| pr-size-warning | PR fetched | Detected 420 lines, injected size warning into review |
| skill-suggester | Review started | Detected "payment" and "Stripe" keywords, injected input-validation and api-design-framework reference skills |
| security-command-audit | After security scan | Logged the P0 finding to session metrics and audit trail |
| auto-remember-continuity | Review complete | Stored "PR #42 has unverified webhook signatures" in memory for follow-up |
Security Severity Levels
Each security finding is assigned a severity that maps to review actions:
| Severity | Meaning | Review Action |
|---|---|---|
| P0 | Exploitable vulnerability | Changes Requested -- blocks merge |
| P1 | Real issue, requires specific conditions | Flagged -- fix before production |
| P2 | Hardening suggestion | Informational -- consider improving |
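The mapping in this table is small enough to write down directly. The following sketch assumes findings arrive as severity labels such as "P0". `blocks_merge` and `worst_action` are hypothetical names used only to illustrate the rule that a single P0 is enough to block.

```python
from typing import Optional

# Review actions, copied from the severity table.
ACTIONS = {
    "P0": "Changes Requested -- blocks merge",
    "P1": "Flagged -- fix before production",
    "P2": "Informational -- consider improving",
}


def blocks_merge(severities: list[str]) -> bool:
    """A review blocks the merge as soon as any finding is a P0."""
    return "P0" in severities


def worst_action(severities: list[str]) -> Optional[str]:
    """Return the action for the most severe finding, or None if there are none."""
    if not severities:
        return None
    # "P0" < "P1" < "P2" in string order, so min() picks the most severe label.
    return ACTIONS[min(severities)]
```

For the example PR, the security findings are one P0 and two P1s, so `blocks_merge(["P0", "P1", "P1"])` is true and the review is posted as Changes Requested.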
Skills Auto-Injected
Each agent received its standard skill set plus context-specific skills detected by the skill-suggester hook:
- code-review-playbook -- Structured review methodology
- owasp-top-10 -- Common web vulnerability patterns
- defense-in-depth -- Layered security approach
- api-design-framework -- REST conventions, status codes, error formats
- error-handling-rfc9457 -- Problem Details standard
- pytest-advanced -- Test patterns, fixtures, parametrize
- performance-optimization -- N+1 detection, bundle analysis
- clean-architecture -- Separation of concerns, dependency boundaries
Tips
Use "Quick" for small PRs. If the PR is under 100 lines and touches well-tested code, the Quick review (quality + tests only) runs in under 15 seconds. Save the Full review for complex or sensitive changes.
Re-review after fixes. After the author pushes fixes for the P0 finding, run /ork:review-pr 42 again. OrchestKit fetches the updated diff and checks both that the fixes resolve the earlier findings and that they did not introduce new issues.
Security findings use severity levels intentionally. P0 means "do not merge" -- the code has a vulnerability that could be exploited. P1 means "fix before production" -- the issue is real but requires specific conditions. P2 means "consider improving" -- a hardening suggestion rather than a vulnerability.
Combine with /ork:verify for pre-merge confidence. Run /ork:review-pr for the review comment, then /ork:verify locally to confirm tests pass and the security scan is clean. This catches issues that a diff-only review cannot detect, such as tests that pass individually but fail together.