Merge pull request #2 from Hestia-Homes/tdd-3a

tdd skill
2026-06-08 11:37:26 +00:00 · 2026-05-11 12:47:32 +01:00 · 47f7b1b828
commit 47f7b1b828
parent 3e04fe14ae 0d67119bdb
2 changed files with 160 additions and 8 deletions
--- a/skills-lock.json
+++ b/skills-lock.json
@@ -33,14 +33,6 @@
      "skillFolderHash": "a500c5b9a481e6fac8c9bb87cd4c4e16c3f46e1a",
      "pluginName": "mattpocock-skills"
    },
-    "tdd": {
-      "source": "mattpocock/skills",
-      "sourceType": "github",
-      "sourceUrl": "https://github.com/mattpocock/skills.git",
-      "skillPath": "skills/engineering/tdd/SKILL.md",
-      "skillFolderHash": "75beb3030b4c979205dd771ff85ac600baeb68f4",
-      "pluginName": "mattpocock-skills"
-    },
    "to-issues": {
      "source": "mattpocock/skills",
      "sourceType": "github",
--- a/skills/engineering/tdd/SKILL.md
+++ b/skills/engineering/tdd/SKILL.md
@@ -0,0 +1,160 @@
+# Test-Driven Development (3A)
+
+## Philosophy
+
+**Core principle**: Tests should verify behavior through public interfaces, not implementation details. Code can change entirely; tests shouldn't.
+
+**Good tests** are integration-style: they exercise real code paths through public APIs. They describe _what_ the system does, not _how_ it does it. A good test reads like a specification - "user can checkout with valid cart" tells you exactly what capability exists. These tests survive refactors because they don't care about internal structure.
+
+**Bad tests** are coupled to implementation. They mock internal collaborators, test private methods, or verify through external means (like querying a database directly instead of using the interface). The warning sign: your test breaks when you refactor, but behavior hasn't changed. If you rename an internal function and tests fail, those tests were testing implementation, not behavior.
+
+Tests exercise real code paths through public APIs; mocks are a last resort for I/O boundaries only.
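Illustration (not part of the diff): a minimal sketch of a behavior-level test, assuming a hypothetical `Cart` class whose only public methods are `add` and `checkout`. The test never touches `_items`, so the internal storage could be replaced without breaking it.

```python
class Cart:
    """Hypothetical cart; add() and checkout() are the public interface."""

    def __init__(self):
        # Internal detail: maps sku -> (price_cents, quantity). Free to change.
        self._items = {}

    def add(self, sku, price_cents, qty=1):
        _, count = self._items.get(sku, (price_cents, 0))
        self._items[sku] = (price_cents, count + qty)

    def checkout(self):
        if not self._items:
            raise ValueError("cart is empty")
        return sum(price * count for price, count in self._items.values())


def test_user_can_checkout_with_valid_cart():
    # Reads like a specification: only public methods are used.
    cart = Cart()
    cart.add("book", 1200, qty=2)
    cart.add("pen", 150)

    total = cart.checkout()

    assert total == 2550
```

A test that asserted on `cart._items` instead would fail as soon as the storage changed shape, even though checkout behavior stayed the same.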
+
+## Anti-Pattern: Horizontal Slices
+
+**DO NOT write all tests first, then all implementation.** This is "horizontal slicing" - treating RED as "write all tests" and GREEN as "write all code."
+
+This produces **crap tests**:
+
+- Tests written in bulk test _imagined_ behavior, not _actual_ behavior
+- You end up testing the _shape_ of things (data structures, function signatures) rather than user-facing behavior
+- Tests become insensitive to real changes - they pass when behavior breaks, fail when behavior is fine
+- You outrun your headlights, committing to test structure before understanding the implementation
+
+**Correct approach**: Vertical slices via tracer bullets. One test → one implementation → repeat. Each test responds to what you learned from the previous cycle. Because you just wrote the code, you know exactly what behavior matters and how to verify it.
+
+```
+WRONG (horizontal):
+  RED:   test1, test2, test3, test4, test5
+  GREEN: impl1, impl2, impl3, impl4, impl5
+
+RIGHT (vertical):
+  RED→GREEN: test1→impl1
+  RED→GREEN: test2→impl2
+  RED→GREEN: test3→impl3
+  ...
+```
+
+## Workflow
+
+### 1. Planning
+
+When exploring the codebase, use the project's domain glossary so that test names and interface vocabulary match the project's language, and respect ADRs in the area you're touching.
+
+Before writing any code:
+
+- [ ] Confirm with user what interface changes are needed
+- [ ] Confirm with user which behaviors to test (prioritize)
+- [ ] Identify opportunities for deep modules (small interface, deep implementation)
+- [ ] Design interfaces for testability
+- [ ] List the behaviors to test (not implementation steps)
+- [ ] Get user approval on the plan
+
+Ask: "What should the public interface look like? Which behaviors are most important to test?"
+
+**You can't test everything.** Confirm with the user exactly which behaviors matter most. Focus testing effort on critical paths and complex logic, not every possible edge case.
+
+### 2. Tracer Bullet
+
+Write ONE test that confirms ONE thing about the system:
+
+```
+RED:   Write test for first behavior → test fails
+GREEN: Write minimal code to pass → test passes
+```
+
+This is your tracer bullet: it proves the path works end-to-end.
+
+### 3. Incremental Loop
+
+For each remaining behavior:
+
+```
+RED:   Write next test → fails
+GREEN: Minimal code to pass → passes
+```
+
+Rules:
+
+- One test at a time
+- Only enough code to pass current test
+- Don't anticipate future tests
+- Keep tests focused on observable behavior
+
+### 4. Refactor
+
+After all tests pass, look for refactor candidates:
+
+- [ ] Extract duplication
+- [ ] Deepen modules (move complexity behind simple interfaces)
+- [ ] Apply SOLID principles where natural
+- [ ] Consider what new code reveals about existing code
+- [ ] Run tests after each refactor step
+
+**Never refactor while RED.** Get to GREEN first.
+
+## Checklist Per Cycle
+
+```
+[ ] Test describes behavior, not implementation
+[ ] Test uses public interface only
+[ ] Test would survive internal refactor
+[ ] Code is minimal for this test
+[ ] No speculative features added
+```
+
+---
+
+## Test Structure: Arrange-Act-Assert
+
+Every test must be structured in three clearly separated phases:
+
+- **Arrange** — set up inputs, state, and dependencies
+- **Act** — invoke the unit under test (one call per test)
+- **Assert** — verify the single observable outcome
+
+Keep each phase visually distinct: put a blank line between phases and, where the framework doesn't make the separation obvious, add inline `// Arrange`, `// Act`, and `// Assert` comments.
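Illustration (not part of the diff): a minimal sketch of the three phases in Python, where the phase comments become `# Arrange`, `# Act`, and `# Assert`. The `User` type, the `map_user` function, and the payload fields are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    """Hypothetical domain object."""
    id: int
    full_name: str


def map_user(payload: dict) -> User:
    """Hypothetical mapper from an API payload to the domain object."""
    full_name = f'{payload["first_name"]} {payload["last_name"]}'
    return User(id=int(payload["id"]), full_name=full_name)


def test_maps_api_response_to_domain_user():
    # Arrange
    payload = {"id": "7", "first_name": "Ada", "last_name": "Lovelace"}

    # Act
    user = map_user(payload)

    # Assert
    assert user == User(id=7, full_name="Ada Lovelace")
```

The Act phase is a single call, and the Assert phase compares one observable result, which keeps a failure easy to attribute.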
+
+---
+
+## Stub-First Reds
+
+Before writing a failing test, define stub implementations for all functions, methods, and classes the test will import. Stubs must:
+
+- Exist at the correct import path
+- Have the correct signature
+- Raise / throw a "not implemented" error (e.g. `raise NotImplementedError`, `throw new Error('not implemented')`)
+
+This ensures RED tests fail because the **behavior is missing**, not because of import or attribute errors. A test that errors on import is not a RED test — it is a broken test.
+
+---
+
+## Red-Green-Refactor: Stage Gates
+
+After completing each stage, **stop and wait** before proceeding to the next. At each gate:
+
+1. **After RED** — present the suggested commit message, run tests to confirm they fail for the right reason (not import/attribute errors), then wait for the user to commit manually.
+2. **After GREEN** — present the suggested commit message, confirm all tests pass, then wait for the user to commit manually.
+3. **After REFACTOR** — present the suggested commit message, confirm tests still pass and no behavior changed, then wait for the user to commit manually.
+
+Do not continue to the next stage until the user has committed and explicitly given the go-ahead.
+
+### Commit Message Format
+
+One sentence describing the **behavior being delivered**, followed by a stage emoji. The sentence must describe what the system now does, not the TDD action taken.
+
+| Stage    | Emoji |
+|----------|-------|
+| Red      | 🟥    |
+| Green    | 🟩    |
+| Refactor | 🟪    |
+
+**Examples:**
+
+```
+Map API response to domain object 🟥
+Map API response to domain object 🟩
+Map API response to domain object 🟪
+```
+
+Not: "wrote failing mapping tests" or "made tests pass" — those describe the process, not the behavior.
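Illustration (not part of the diff): a minimal sketch of how the emoji suffix could be checked mechanically. The `stage_of` helper is hypothetical, and it checks only the format. Whether the sentence describes behavior rather than process still needs human judgment.

```python
# Stage emojis taken from the table in the skill file.
STAGE_EMOJI = {"red": "🟥", "green": "🟩", "refactor": "🟪"}


def stage_of(message: str):
    """Return the stage name if the message is '<sentence> <emoji>', else None."""
    text = message.rstrip()
    for stage, emoji in STAGE_EMOJI.items():
        sentence = text[: -len(emoji)]
        # Require a space before the emoji and a non-empty sentence.
        if text.endswith(" " + emoji) and sentence.strip():
            return stage
    return None
```

For example, `stage_of("Map API response to domain object 🟩")` yields `"green"`, while a message with no stage emoji yields `None`.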