Skip to main content
OrchestKit v7.43.0 — 104 skills, 36 agents, 173 hooks · Claude Code 2.1.105+
OrchestKit
Skills

Testing Patterns

Redirect — testing-patterns was split into 5 focused sub-skills. Use when looking for testing-patterns, writing tests, or test automation. Redirects to testing-unit, testing-e2e, testing-integration, testing-llm, or testing-perf.

Reference low

Auto-activated — this skill loads automatically when Claude detects matching context.

Testing Patterns (Redirect)

This skill was split into 5 focused sub-skills in v7.2.0. Use the appropriate sub-skill below.

Sub-Skills

Sub-SkillFocusWhen to Use
ork:testing-unitUnit tests, AAA pattern, fixtures, mocking, factoriesIsolated business logic tests
ork:testing-e2ePlaywright, page objects, visual regression, a11yBrowser-based end-to-end tests
ork:testing-integrationAPI endpoints, database, contract testingCross-boundary integration tests
ork:testing-llmLLM mocking, DeepEval/RAGAS, structured outputAI/ML evaluation and testing
ork:testing-perfk6, Locust, pytest-xdist, benchmarkPerformance and load testing

Quick Reference

/ork:testing-unit          # Unit testing patterns
/ork:testing-e2e           # End-to-end with Playwright
/ork:testing-integration   # API and database integration
/ork:testing-llm           # LLM evaluation patterns
/ork:testing-perf          # Performance and load testing
  • ork:testing-unit — Unit testing: AAA pattern, fixtures, mocking, factories
  • ork:testing-e2e — E2E testing: Playwright, page objects, visual regression
  • ork:testing-integration — Integration testing: API endpoints, database, contracts
  • ork:testing-llm — LLM testing: mock responses, DeepEval/RAGAS evaluation
  • ork:testing-perf — Performance testing: k6, Locust, pytest-xdist

Rules (2)

Assertions must test observable behavior, not implementation details — MEDIUM

Assert Behavior, Not Implementation

Why

Tests that assert implementation details (internal method calls, private state, specific SQL queries) break on every refactor even when the behavior is unchanged. This creates a maintenance burden where developers spend more time fixing tests than writing features — and the tests never catch real bugs.

Rule

Assertions must verify:

  1. Return values and output (what the function produces)
  2. Side effects on public interfaces (what changed externally)
  3. Error behavior (what happens when things go wrong)

Assertions must NOT verify:

  1. Internal method call counts or order
  2. Private state or internal data structures
  3. Specific implementation mechanisms (SQL text, HTTP internals)

Incorrect — testing implementation details

test("creates user", async () => {
  const spy = jest.spyOn(db, "query");

  await userService.create({ name: "Alice", email: "alice@test.com" });

  // Asserts exact SQL — breaks if column order changes or ORM upgrades
  expect(spy).toHaveBeenCalledWith(
    "INSERT INTO users (name, email) VALUES ($1, $2)",
    ["Alice", "alice@test.com"]
  );
  // Asserts call count — breaks if implementation adds a cache write
  expect(spy).toHaveBeenCalledTimes(1);
});
def test_process_order(mocker):
    mock_validate = mocker.patch.object(order_service, "_validate_items")
    mock_calc = mocker.patch.object(order_service, "_calculate_total")

    order_service.process(order)

    # Asserts internal method calls — breaks on any refactor
    mock_validate.assert_called_once_with(order.items)
    mock_calc.assert_called_once_with(order.items, order.discount)

Correct — testing observable behavior

test("creates user and returns it with generated id", async () => {
  const result = await userService.create({
    name: "Alice",
    email: "alice@test.com"
  });

  // Assert return value (behavior)
  expect(result.id).toBeDefined();
  expect(result.name).toBe("Alice");
  expect(result.email).toBe("alice@test.com");

  // Assert side effect via public interface (behavior)
  const fetched = await userService.findById(result.id);
  expect(fetched?.name).toBe("Alice");
});
def test_process_order_calculates_total():
    order = Order(
        items=[Item(price=10, qty=2), Item(price=5, qty=1)],
        discount=0.1
    )

    result = order_service.process(order)

    # Assert output (behavior)
    assert result.total == 22.50  # (10*2 + 5*1) * 0.9
    assert result.status == "confirmed"

Assertion Quality Checklist

// BAD: Asserts HOW (implementation)
expect(mockDb.query).toHaveBeenCalledWith(/* specific SQL */);
expect(internalCache.size).toBe(1);
expect(privatMethod).toHaveBeenCalledTimes(3);

// GOOD: Asserts WHAT (behavior)
expect(result.id).toBeDefined();              // Output value
expect(await service.findById(id)).toBeTruthy(); // Side effect
expect(() => service.create(invalid)).toThrow(); // Error behavior

When Mocks Are Acceptable

ScenarioMock OKAssert On
External API callsYes — mock the HTTP clientResponse handling behavior
Database in unit testsYes — mock the repositoryReturn values and errors
Internal private methodsNo — do not mockTest via public interface
Time-dependent logicYes — mock Date.nowCorrect output for given time
File systemYes — mock fs callsCorrect file content written

Tests must not depend on execution order or shared mutable state between test cases — HIGH

Test Isolation — No Shared State or Order Dependency

Why

Tests that share mutable state or depend on execution order are the #1 source of flaky tests. They pass when run sequentially in a specific order but fail when parallelized, shuffled, or run individually. Debugging flaky tests wastes more time than writing isolated tests.

Rule

Every test must:

  1. Set up its own state in beforeEach / setUp / arrange phase
  2. Clean up after itself (or use fresh fixtures)
  3. Pass when run in isolation (test.only / -k test_name)
  4. Pass when run in any order (shuffle mode)

Incorrect — tests share mutable state

// Shared state between tests — order dependent
describe("UserService", () => {
  const users: User[] = [];  // Shared mutable array

  test("creates a user", () => {
    users.push({ id: 1, name: "Alice" });
    expect(users).toHaveLength(1);
  });

  test("finds the user", () => {
    // DEPENDS on "creates a user" running first
    const found = users.find(u => u.name === "Alice");
    expect(found).toBeDefined();
  });

  test("deletes the user", () => {
    // DEPENDS on both previous tests
    users.splice(0, 1);
    expect(users).toHaveLength(0);
  });
});
# Module-level state — tests pollute each other
_cache = {}

def test_add_to_cache():
    _cache["key"] = "value"
    assert _cache["key"] == "value"

def test_cache_is_empty():
    # FAILS if test_add_to_cache runs first
    assert len(_cache) == 0

Correct — each test owns its state

describe("UserService", () => {
  let service: UserService;
  let db: TestDatabase;

  beforeEach(() => {
    db = new TestDatabase();        // Fresh DB per test
    service = new UserService(db);  // Fresh service per test
  });

  afterEach(() => {
    db.cleanup();
  });

  test("creates a user", async () => {
    const user = await service.create({ name: "Alice" });
    expect(user.id).toBeDefined();
  });

  test("finds a user by name", async () => {
    // Arrange: create its own user, does not rely on other tests
    await service.create({ name: "Bob" });

    // Act
    const found = await service.findByName("Bob");

    // Assert
    expect(found?.name).toBe("Bob");
  });
});
import pytest

@pytest.fixture
def cache():
    """Fresh cache for each test."""
    return {}

def test_add_to_cache(cache):
    cache["key"] = "value"
    assert cache["key"] == "value"

def test_cache_starts_empty(cache):
    # Always passes — gets its own empty cache
    assert len(cache) == 0

Verification

# Run tests in random order to detect order dependency
# Jest
jest --randomize

# pytest
pytest -p randomly

# Run single test in isolation
jest --testNamePattern="finds a user"
pytest -k "test_cache_starts_empty"

Common Isolation Violations

ViolationFix
Shared array/object between testsMove to beforeEach
Database not reset between testsUse transactions + rollback
Global singleton with stateReset in beforeEach or use DI
File system side effectsUse temp directories per test
Environment variable mutationsSave/restore in setup/teardown
Edit on GitHub

Last updated on