Skill Evolution
Analyzes skill usage patterns and suggests improvements. Use when reviewing skill performance, applying auto-suggested changes, or rolling back versions.
Skill Evolution Manager
Enables skills to automatically improve based on usage patterns, user edits, and success rates. Provides version control with safe rollback capability.
Overview
- Reviewing how skills are performing across sessions
- Identifying patterns in user edits to skill outputs
- Applying learned improvements to skill templates
- Rolling back problematic skill changes
- Tracking skill version history and success rates
Quick Reference
| Command | Description |
|---|---|
| `/ork:skill-evolution` | Show evolution report for all skills |
| `/ork:skill-evolution analyze <skill-id>` | Analyze specific skill patterns |
| `/ork:skill-evolution evolve <skill-id>` | Review and apply suggestions |
| `/ork:skill-evolution history <skill-id>` | Show version history |
| `/ork:skill-evolution rollback <skill-id> <version>` | Restore previous version |
How It Works
The skill evolution system operates in three phases:
    COLLECT                   ANALYZE                     ACT
    ───────                   ───────                     ───
┌─────────────┐           ┌─────────────┐           ┌─────────────┐
│ PostTool    │──────────▶│ Evolution   │──────────▶│ /ork:skill- │
│ Edit        │ patterns  │ Analyzer    │ suggest   │ evolution   │
│ Tracker     │           │ Engine      │           │ command     │
└─────────────┘           └─────────────┘           └─────────────┘
       │                         │                         │
       ▼                         ▼                         ▼
┌─────────────┐           ┌─────────────┐           ┌─────────────┐
│ edit-       │           │ evolution-  │           │ versions/   │
│ patterns.   │           │ registry.   │           │ snapshots   │
│ jsonl       │           │ json        │           │             │
└─────────────┘           └─────────────┘           └─────────────┘
See Pattern Detection Heuristics for tracked edit patterns and detection regexes. See Confidence Scoring for suggestion thresholds.
Subcommands
Each subcommand is documented with implementation details, shell commands, and sample output in the Evolution Commands Reference.
Report (Default)
/ork:skill-evolution — Shows evolution report for all tracked skills with usage counts, success rates, and pending suggestions.
Analyze
/ork:skill-evolution analyze <skill-id> — Deep-dives into edit patterns for a specific skill, showing frequency, sample counts, and confidence scores.
Evolve
/ork:skill-evolution evolve <skill-id> — Interactive review of improvement suggestions. Uses AskUserQuestion for each suggestion (Apply / Skip / Reject). Creates version snapshot before applying.
History
/ork:skill-evolution history <skill-id> — Shows version history with performance metrics per version.
Rollback
/ork:skill-evolution rollback <skill-id> <version> — Restores a previous version after confirmation. Current version is backed up automatically.
Data Files
| File | Purpose | Format |
|---|---|---|
| `.claude/feedback/edit-patterns.jsonl` | Raw edit pattern events | JSONL (append-only) |
| `.claude/feedback/evolution-registry.json` | Aggregated suggestions | JSON |
| `.claude/feedback/metrics.json` | Skill usage metrics | JSON |
| `skills/<cat>/<name>/versions/` | Version snapshots | Directory |
| `skills/<cat>/<name>/versions/manifest.json` | Version metadata | JSON |
Auto-Evolution Safety
See Auto-Evolution Triggers for full safety mechanisms, health monitoring, and trigger criteria.
Key safeguards: version snapshots before changes, auto-alert on >20% success rate drop, human review required, rejected suggestions never re-suggested.
References
- Evolution Commands Reference — Subcommand implementation, shell commands, and sample output
- Evolution Analysis Methodology
- Version Management Guide
Rules
- Pattern Detection Heuristics — Edit pattern categories and regex detection
- Confidence Scoring — Suggestion thresholds and confidence criteria
- Auto-Evolution Triggers — Safety mechanisms and trigger criteria
Related Skills
- ork:configure - Configure OrchestKit settings
- ork:doctor - Diagnose OrchestKit issues
- feedback-dashboard - View comprehensive feedback metrics
Rules (3)
Auto-Evolution Triggers — HIGH
Auto-Evolution Safety & Trigger Criteria
Safety Mechanisms
- Version Snapshots: Always created before changes
- Rollback Triggers: Auto-alert if success rate drops >20%
- Human Review: High-confidence suggestions require approval
- Rejection Memory: Rejected suggestions are never re-suggested
Health Monitoring
The system monitors skill health and can trigger warnings:
WARNING: api-design-framework success rate dropped from 94% to 71%
Consider: /ork:skill-evolution rollback api-design-framework 1.1.0
When Auto-Evolution Activates
- Pattern frequency exceeds the Add Threshold (70%)
- At least Minimum Samples (5) uses recorded
- No prior rejection for the same pattern on the same skill
- Current skill version success rate is stable (no recent drops)
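The four criteria above can be sketched as a single Bash predicate. This is a minimal illustration, not the engine's actual code: the function name `should_suggest`, its argument order, and the 0/1 flags are all hypothetical, and it assumes "exceeds" is meant inclusively (70% qualifies), in line with the confidence tiers.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of evolution-engine.sh).
# Usage: should_suggest <pattern_count> <total_uses> <was_rejected 0|1> <is_stable 0|1>
# Returns 0 when every activation criterion holds, 1 otherwise.
should_suggest() {
  local pattern_count="$1" total_uses="$2" was_rejected="$3" is_stable="$4"
  local min_samples=5      # Minimum Samples
  local add_threshold=70   # Add Threshold, in percent

  # At least Minimum Samples uses recorded
  (( total_uses >= min_samples )) || return 1
  # No prior rejection for this pattern on this skill
  (( was_rejected == 0 )) || return 1
  # Success rate is currently stable
  (( is_stable == 1 )) || return 1
  # Pattern frequency reaches the threshold (integer math avoids float rounding)
  (( pattern_count * 100 >= total_uses * add_threshold )) || return 1
  return 0
}
```

For example, `should_suggest 8 10 0 1` succeeds (80% over 10 uses), while the same counts with a prior rejection (`should_suggest 8 10 1 1`) fail.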
When Rollback Is Triggered
- Success rate drops more than 20% after an evolution
- Alert is surfaced in the next `report` or `analyze` invocation
- User is prompted to roll back via AskUserQuestion
Confidence Scoring — HIGH
Confidence Scoring & Suggestion Thresholds
Thresholds
| Threshold | Default | Description |
|---|---|---|
| Minimum Samples | 5 | Uses before generating suggestions |
| Add Threshold | 70% | Frequency to suggest adding pattern |
| Auto-Apply Confidence | 85% | Confidence for auto-application |
| Rollback Trigger | -20% | Success rate drop to trigger rollback |
Confidence Calculation
Confidence is the share of recorded uses in which users applied the pattern:
confidence = pattern_frequency / total_uses
- Below 70%: Pattern tracked but no suggestion generated
- 70%-84%: Suggestion generated, requires human approval via the `evolve` subcommand
- 85%+: Auto-apply eligible (still requires human confirmation via AskUserQuestion)
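As a rough sketch, the three tiers above map onto a small Bash function. The name `confidence_tier` and the tier labels it prints are hypothetical, chosen only for illustration; the comparison uses cross-multiplied integers so that exactly 70% and exactly 85% land in the higher tier.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify a pattern by the simple ratio described above.
# Usage: confidence_tier <pattern_count> <total_uses>
confidence_tier() {
  local pattern_count="$1" total_uses="$2"
  if (( pattern_count * 100 >= total_uses * 85 )); then
    echo "auto-apply-eligible"   # 85%+ (still needs user confirmation)
  elif (( pattern_count * 100 >= total_uses * 70 )); then
    echo "suggest"               # 70%-84%: reviewed via the evolve subcommand
  else
    echo "track-only"            # below 70%: tracked, no suggestion
  fi
}
```

With these assumptions, `confidence_tier 17 20` prints `auto-apply-eligible` and `confidence_tier 13 20` prints `track-only`.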
Suggestion States
Suggestions progress through: pending → applied | rejected
- Applied: Pattern added to skill template, version bumped
- Rejected: Marked in registry, never re-suggested for this skill
Pattern Detection Heuristics — HIGH
Edit Pattern Detection Heuristics
The system tracks these common edit patterns users apply after skill output:
| Pattern | Description | Detection Regex |
|---|---|---|
| `add_pagination` | User adds pagination to API responses | `limit.*offset`, `cursor.*pagination` |
| `add_rate_limiting` | User adds rate limiting | `rate.?limit`, `throttl` |
| `add_error_handling` | User adds try/catch blocks | `try.*catch`, `except` |
| `add_types` | User adds TypeScript/Python types | `interface\s`, `Optional` |
| `add_validation` | User adds input validation | `validate`, `Pydantic`, `Zod` |
| `add_logging` | User adds logging/observability | `logger\.`, `console.log` |
| `remove_comments` | User removes generated comments | Pattern removal detection |
| `add_auth_check` | User adds authentication checks | `@auth`, `@require_auth` |
How Detection Works
The PostTool Edit Tracker hook monitors file edits after skill invocations. When a user edits skill output, the edit is classified against the patterns above using regex matching. Results are appended to .claude/feedback/edit-patterns.jsonl.
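The classification step might look roughly like the following Bash sketch. It is an illustration under stated assumptions, not the hook's real code: the function names `detect_patterns` and `log_patterns`, the three-pattern subset, and the shape of the JSONL record (`skill_id`, `pattern`) are all hypothetical. Associative arrays require Bash 4 or newer.

```shell
#!/usr/bin/env bash
# Subset of the table above, written as extended regexes for grep -E.
declare -A DETECTORS=(
  [add_pagination]='limit.*offset|cursor.*pagination'
  [add_rate_limiting]='rate.?limit|throttl'
  [add_logging]='logger\.|console.log'
)

# detect_patterns <text>: print the name of every matching pattern, sorted.
detect_patterns() {
  local text="$1" name
  for name in "${!DETECTORS[@]}"; do
    if grep -Eq -- "${DETECTORS[$name]}" <<<"$text"; then
      echo "$name"
    fi
  done | sort
}

# log_patterns <skill-id> <text> <jsonl-file>: append one JSON line per match.
# The record fields are an assumed shape, not the documented schema.
log_patterns() {
  local skill_id="$1" text="$2" out="$3" name
  while IFS= read -r name; do
    [[ -n "$name" ]] || continue
    printf '{"skill_id":"%s","pattern":"%s"}\n' "$skill_id" "$name" >>"$out"
  done < <(detect_patterns "$text")
}
```

Appending one line per event keeps the file append-only, which is what the JSONL format in the Data Files table implies.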
References (3)
Evolution Analysis
Evolution Analysis Methodology
Reference guide for understanding how the skill evolution system analyzes patterns and generates suggestions.
Pattern Detection Algorithm
1. Data Collection (PostTool Hook)
When a Write or Edit tool is used after a skill was recently loaded:
IF skill_loaded_within(5_minutes) AND tool IN (Write, Edit):
    content = get_edit_content()
    patterns = detect_patterns(content)
    IF patterns.length > 0:
        log_to_edit_patterns_jsonl(skill_id, patterns)
2. Pattern Matching
The system uses regex patterns to categorize edits:
# Associative array of pattern name -> extended regex (requires Bash 4+)
declare -A PATTERN_DETECTORS=(
    ["add_pagination"]="limit.*offset|page.*size|cursor.*pagination|Paginated"
    ["add_rate_limiting"]="rate.?limit|throttl|RateLimiter|requests.?per"
    ["add_caching"]="@cache|cache_key|TTL|redis|memcache|@cached"
    ["add_retry_logic"]="retry|backoff|max_attempts|tenacity|Retry"
    ["add_error_handling"]="try.*catch|except|raise.*Exception|throw.*Error"
    ["add_validation"]="validate|Validator|@validate|Pydantic|Zod|yup"
    ["add_logging"]="logger\.|logging\.|console\.log|winston|pino"
    ["add_types"]=": *(str|int|bool|List|Dict|Optional)|interface\s|type\s.*="
    ["add_auth_check"]="@auth|@require_auth|isAuthenticated|requiresAuth"
    ["add_test_case"]="def test_|it\(|describe\(|expect\(|@pytest"
)
3. Frequency Calculation
For each skill with sufficient usage:
frequency = pattern_count / total_skill_uses
4. Confidence Scoring
Confidence combines frequency with sample size:
confidence = frequency × min(samples / 20, 1.0)
This means:
- 100% frequency with 5 samples = 0.25 confidence (needs more data)
- 100% frequency with 20+ samples = 1.0 confidence (high certainty)
- 70% frequency with 15 samples = 0.53 confidence (moderate)
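The formula can be checked with a short Bash helper that delegates the floating-point arithmetic to `awk`. This is a sketch only: the function name `confidence` is hypothetical, and it assumes that "samples" means the skill's total recorded uses.

```shell
#!/usr/bin/env bash
# Hypothetical helper: sample-size-weighted confidence, printed to 2 decimals.
# Usage: confidence <pattern_count> <total_uses>
confidence() {
  local pattern_count="$1" total_uses="$2"
  awk -v c="$pattern_count" -v n="$total_uses" 'BEGIN {
    if (n <= 0) { print "0.00"; exit }          # no data yet
    freq = c / n                                # frequency = pattern_count / total_skill_uses
    weight = (n / 20 < 1.0) ? n / 20 : 1.0      # min(samples / 20, 1.0)
    printf "%.2f\n", freq * weight
  }'
}
```

`confidence 5 5` prints `0.25` and `confidence 20 20` prints `1.00`, matching the first two cases listed above.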
Suggestion Thresholds
| Metric | Threshold | Purpose |
|---|---|---|
| MIN_SAMPLES | 5 | Prevent premature suggestions |
| ADD_THRESHOLD | 0.70 | 70%+ users add = suggest adding |
| REMOVE_THRESHOLD | 0.70 | 70%+ users remove = suggest removing |
| AUTO_APPLY_CONFIDENCE | 0.85 | Auto-apply if very high confidence |
Suggestion Types
Add Suggestions
Generated when users frequently add similar content:
{
  "type": "add",
  "target": "template",
  "pattern": "add_pagination",
  "reason": "85% of users add pagination after using this skill"
}
Remove Suggestions
Generated when users frequently remove generated content:
{
  "type": "remove",
  "target": "template",
  "pattern": "remove_comments",
  "reason": "72% of users remove docstrings from generated code"
}
Analysis Best Practices
- Wait for sufficient data: Don't act on suggestions until MIN_SAMPLES reached
- Review high-confidence first: Focus on suggestions with confidence > 0.80
- Consider context: A pattern may be added for specific use cases only
- Monitor after changes: Track success rate changes after evolution
Interpreting Results
High-Value Improvements
- Frequency > 80%, Confidence > 0.70
- Pattern is universally applicable
- Easy to add to skill template
Conditional Improvements
- Frequency 50-80%
- May be context-dependent
- Consider adding as optional reference
Skip/Investigate
- Frequency < 50%
- Might be edge case or user preference
- Review individual edit patterns for context
Evolution Commands
Evolution Subcommand Reference
Detailed implementation and sample output for each subcommand.
Subcommand: Report (Default)
Usage: /ork:skill-evolution
Shows evolution report for all tracked skills.
Implementation
# Run the evolution engine report
"${CLAUDE_PROJECT_DIR}/.claude/scripts/evolution-engine.sh" report
Sample Output
Skill Evolution Report
══════════════════════════════════════════════════════════════
Skills Summary:
┌────────────────────────────┬─────────┬─────────┬───────────┬────────────┐
│ Skill                      │ Uses    │ Success │ Avg Edits │ Suggestions│
├────────────────────────────┼─────────┼─────────┼───────────┼────────────┤
│ api-design-framework       │ 156     │ 94%     │ 1.8       │ 2          │
│ database-schema-designer   │ 89      │ 91%     │ 2.1       │ 1          │
│ fastapi-patterns           │ 67      │ 88%     │ 2.4       │ 3          │
└────────────────────────────┴─────────┴─────────┴───────────┴────────────┘
Summary:
  Skills tracked: 3
  Total uses: 312
  Overall success rate: 91%
Top Pending Suggestions:
  1. 93% | api-design-framework | add add_pagination
  2. 88% | api-design-framework | add add_rate_limiting
  3. 85% | fastapi-patterns | add add_error_handling
Subcommand: Analyze
Usage: /ork:skill-evolution analyze <skill-id>
Analyzes edit patterns for a specific skill.
Implementation
# Run analysis for specific skill
"${CLAUDE_PROJECT_DIR}/.claude/scripts/evolution-engine.sh" analyze "$SKILL_ID"
Sample Output
Skill Analysis: api-design-framework
────────────────────────────────────
Uses: 156 | Success: 94% | Avg Edits: 1.8
Edit Patterns Detected:
┌──────────────────────────┬─────────┬──────────┬────────────┐
│ Pattern                  │ Freq    │ Samples  │ Confidence │
├──────────────────────────┼─────────┼──────────┼────────────┤
│ add_pagination           │ 85%     │ 132/156  │ 0.93       │
│ add_rate_limiting        │ 72%     │ 112/156  │ 0.88       │
│ add_error_handling       │ 45%     │ 70/156   │ 0.56       │
└──────────────────────────┴─────────┴──────────┴────────────┘
Pending Suggestions:
  1. 93% conf: ADD add_pagination to template
  2. 88% conf: ADD add_rate_limiting to template
Run `/ork:skill-evolution evolve api-design-framework` to review
Subcommand: Evolve
Usage: /ork:skill-evolution evolve <skill-id>
Interactive review and application of improvement suggestions.
Implementation
- Get Suggestions:
  SUGGESTIONS=$("${CLAUDE_PROJECT_DIR}/.claude/scripts/evolution-engine.sh" suggest "$SKILL_ID")
- For Each Suggestion, Present Interactive Options:
  Use AskUserQuestion to let the user decide on each suggestion:
{
  "questions": [{
    "question": "Apply suggestion: ADD add_pagination to template? (93% confidence, 132/156 users add this)",
    "header": "Evolution",
    "options": [
      {"label": "Apply", "description": "Add this pattern to the skill template"},
      {"label": "Skip", "description": "Skip for now, ask again later"},
      {"label": "Reject", "description": "Never suggest this again"}
    ],
    "multiSelect": false
  }]
}
- On Apply:
  - Create version snapshot first
  - Apply the suggestion to skill files
  - Update evolution registry
- On Reject:
  - Mark suggestion as rejected in registry
  - Will not be suggested again
Applying Suggestions
When a user accepts a suggestion, the implementation depends on the suggestion type:
For add suggestions to templates:
- Add the pattern to the skill's template files
- Update SKILL.md with new guidance
For add suggestions to references:
- Create new reference file in the `references/` directory
For remove suggestions:
- Remove the identified content
- Archive in version snapshot first
Subcommand: History
Usage: /ork:skill-evolution history <skill-id>
Shows version history with performance metrics.
Implementation
# Run version manager list
"${CLAUDE_PROJECT_DIR}/.claude/scripts/version-manager.sh" list "$SKILL_ID"
Sample Output
Version History: api-design-framework
══════════════════════════════════════════════════════════════
Current Version: 1.2.0
┌─────────┬────────────┬─────────┬───────┬───────────┬────────────────────────────┐
│ Version │ Date       │ Success │ Uses  │ Avg Edits │ Changelog                  │
├─────────┼────────────┼─────────┼───────┼───────────┼────────────────────────────┤
│ 1.2.0   │ 2026-01-14 │ 94%     │ 156   │ 1.8       │ Added pagination pattern   │
│ 1.1.0   │ 2026-01-05 │ 89%     │ 80    │ 2.3       │ Added error handling ref   │
│ 1.0.0   │ 2025-11-01 │ 78%     │ 45    │ 3.2       │ Initial release            │
└─────────┴────────────┴─────────┴───────┴───────────┴────────────────────────────┘
Subcommand: Rollback
Usage: /ork:skill-evolution rollback <skill-id> <version>
Restores a skill to a previous version.
Implementation
- Confirm with User:
  Use AskUserQuestion for confirmation:
{
  "questions": [{
    "question": "Rollback api-design-framework from 1.2.0 to 1.0.0? Current version will be backed up.",
    "header": "Rollback",
    "options": [
      {"label": "Confirm Rollback", "description": "Restore version 1.0.0"},
      {"label": "Cancel", "description": "Keep current version"}
    ],
    "multiSelect": false
  }]
}
- On Confirm:
  "${CLAUDE_PROJECT_DIR}/.claude/scripts/version-manager.sh" restore "$SKILL_ID" "$VERSION"
- Report Result:
  Restored api-design-framework to version 1.0.0
  Previous version backed up to: versions/.backup-1.2.0-1736867234
Version Management
Version Management Guide
Reference guide for managing skill versions with safe rollback capability.
Version Structure
Each skill can have versioned snapshots stored in:
skills/<category>/<skill-name>/
├── SKILL.md              # Current version
├── SKILL.md              # Current metadata
├── references/           # Current references
├── scripts/              # Current templates
└── versions/
    ├── manifest.json     # Version history metadata
    ├── 1.0.0/
    │   ├── SKILL.md
    │   ├── SKILL.md
    │   ├── references/
    │   └── CHANGELOG.md
    └── 1.1.0/
        ├── SKILL.md
        ├── SKILL.md
        ├── references/
        └── CHANGELOG.md
Manifest Schema
The manifest.json tracks version history:
{
  "$schema": "../../../../../../.claude/schemas/skill-evolution.schema.json",
  "skillId": "api-design-framework",
  "currentVersion": "1.2.0",
  "versions": [
    {
      "version": "1.0.0",
      "date": "2025-11-01",
      "successRate": 0.78,
      "uses": 45,
      "avgEdits": 3.2,
      "changelog": "Initial release"
    },
    {
      "version": "1.1.0",
      "date": "2026-01-05",
      "successRate": 0.89,
      "uses": 80,
      "avgEdits": 2.3,
      "changelog": "Added error handling ref"
    },
    {
      "version": "1.2.0",
      "date": "2026-01-14",
      "successRate": 0.94,
      "uses": 156,
      "avgEdits": 1.8,
      "changelog": "Added pagination pattern (85% users added manually)"
    }
  ],
  "suggestions": [],
  "editPatterns": {},
  "lastAnalyzed": "2026-01-14T10:30:00Z"
}
Versioning Workflow
Creating a Version
- Before making changes, create a version snapshot:
  version-manager.sh create <skill-id> "Description of changes"
- The system:
  - Bumps version number (patch by default)
  - Copies current files to `versions/<new-version>/`
  - Records current metrics in manifest
  - Creates CHANGELOG.md
Comparing Versions
Compare two versions to see what changed:
version-manager.sh diff <skill-id> 1.0.0 1.1.0
Shows:
- File differences (unified diff)
- Metrics comparison (success rate, uses, avg edits)
Restoring a Version
If a change causes problems, roll back:
version-manager.sh restore <skill-id> <version>
The system:
- Backs up current version to `.backup-<version>-<timestamp>`
- Copies snapshot files to skill root
- Updates manifest with rollback entry
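The first two restore steps can be sketched in Bash as follows. This is a simplified illustration rather than the contents of `version-manager.sh`: the function name `restore_version` and its arguments are hypothetical, only `SKILL.md` and `references/` are handled, and the manifest update is left out because it would need a JSON tool.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of backup-then-restore.
# Usage: restore_version <skill-dir> <current-version> <target-version>
# Prints the backup directory path on success.
restore_version() {
  local skill_dir="$1" current="$2" target="$3"
  local snapshot="$skill_dir/versions/$target"
  local backup
  backup="$skill_dir/versions/.backup-$current-$(date +%s)"

  if [[ ! -d "$snapshot" ]]; then
    echo "No snapshot found for version $target" >&2
    return 1
  fi

  # 1. Back up the current files before overwriting anything
  mkdir -p "$backup"
  cp "$skill_dir/SKILL.md" "$backup/"
  if [[ -d "$skill_dir/references" ]]; then
    cp -R "$skill_dir/references" "$backup/"
  fi

  # 2. Copy the snapshot files back to the skill root
  cp "$snapshot/SKILL.md" "$skill_dir/SKILL.md"
  if [[ -d "$snapshot/references" ]]; then
    rm -rf "$skill_dir/references"
    cp -R "$snapshot/references" "$skill_dir/references"
  fi

  echo "$backup"
}
```

Taking the backup before any file is overwritten means a failed copy in step 2 still leaves a recoverable copy of the current version.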
Automatic Safety Checks
Rollback Triggers
The system monitors for:
| Trigger | Threshold | Action |
|---|---|---|
| Success rate drop | -20% | Warning + rollback suggestion |
| Avg edits increase | +50% | Warning (users fighting skill) |
| Consecutive failures | 5+ | Alert to review |
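A minimal sketch of how the three triggers in this table could be evaluated together. The function name `check_triggers`, its arguments, and the labels it prints are hypothetical; the sketch assumes the success-rate drop is measured in absolute points (0.20) and the average-edits increase is relative to the baseline.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print one label per trigger that fires.
# Usage: check_triggers <baseline_rate> <current_rate> <baseline_edits> <current_edits> <consecutive_failures>
check_triggers() {
  awk -v br="$1" -v cr="$2" -v be="$3" -v ce="$4" -v fails="$5" 'BEGIN {
    # Success rate dropped by more than 20 points
    if (br - cr > 0.20) print "success_rate_drop"
    # Average edits rose by more than 50% over the baseline
    if (be > 0 && (ce - be) / be > 0.50) print "avg_edits_increase"
    # Five or more failures in a row
    if (fails >= 5) print "consecutive_failures"
  }'
}
```

With the numbers from the health warning shown earlier, `check_triggers 0.94 0.71 1.8 1.9 0` prints `success_rate_drop`.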
Health Check Integration
The posttool hooks monitor skill health:
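check_skill_health() {
    local skill_id="$1"
    local current_rate baseline_rate
    # Declare first, then assign, so a failing helper is not masked by `local`
    current_rate=$(get_recent_success_rate "$skill_id" 10)
    baseline_rate=$(get_version_baseline "$skill_id")
    # Warn when the recent rate is more than 0.20 below the version baseline
    if (( $(echo "$baseline_rate - $current_rate > 0.20" | bc -l) )); then
        echo "WARNING: $skill_id dropped from ${baseline_rate} to ${current_rate}"
    fi
}
Best Practices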
When to Create Versions
- Before applying evolution suggestions
- Before major skill modifications
- After validating improvements work well
- At regular intervals (weekly/monthly) for active skills
Version Naming
Use semantic versioning:
- Major (2.0.0): Breaking changes to skill behavior
- Minor (1.1.0): New features/patterns added
- Patch (1.0.1): Bug fixes, minor improvements
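The bump rules above can be expressed as a short Bash function. The name `bump_version` is hypothetical; defaulting to a patch bump mirrors the "patch by default" behaviour described under Creating a Version.

```shell
#!/usr/bin/env bash
# Hypothetical helper: bump one component of a MAJOR.MINOR.PATCH version.
# Usage: bump_version <version> [major|minor|patch]   (default: patch)
bump_version() {
  local version="$1" part="${2:-patch}"
  local major minor patch
  IFS=. read -r major minor patch <<<"$version"
  case "$part" in
    major) major=$((major + 1)); minor=0; patch=0 ;;   # breaking change
    minor) minor=$((minor + 1)); patch=0 ;;            # new feature or pattern
    patch) patch=$((patch + 1)) ;;                     # fix or minor improvement
    *) echo "unknown part: $part" >&2; return 1 ;;
  esac
  echo "$major.$minor.$patch"
}
```

For example, `bump_version 1.0.1 minor` prints `1.1.0`; lower components reset to zero whenever a higher one is bumped.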
Cleanup Policy
- Keep last 5 versions minimum
- Archive versions older than 90 days
- Never delete versions with good metrics (baseline references)
Metrics Interpretation
Success Rate Trends
| Pattern | Interpretation |
|---|---|
| Increasing | Evolution working well |
| Stable | Skill mature and effective |
| Decreasing | Investigate recent changes |
Average Edits Trends
| Pattern | Interpretation |
|---|---|
| Decreasing | Skill producing better output |
| Stable | Consistent quality |
| Increasing | Users modifying more (skill may need updates) |
Recovery Scenarios
Accidental Breaking Change
# 1. Check history
version-manager.sh list <skill-id>
# 2. Find last good version
version-manager.sh metrics <skill-id>
# 3. Restore
version-manager.sh restore <skill-id> 1.1.0
Gradual Degradation
# 1. Compare versions
version-manager.sh diff <skill-id> 1.0.0 1.2.0
# 2. Identify problematic changes
# 3. Create new version fixing issues