mirror of
https://github.com/Hestia-Homes/agentic-toolkit.git
synced 2026-06-08 11:37:26 +00:00
commit
47f7b1b828
2 changed files with 160 additions and 8 deletions
|
|
@@ -33,14 +33,6 @@
       "skillFolderHash": "a500c5b9a481e6fac8c9bb87cd4c4e16c3f46e1a",
       "pluginName": "mattpocock-skills"
     },
-    "tdd": {
-      "source": "mattpocock/skills",
-      "sourceType": "github",
-      "sourceUrl": "https://github.com/mattpocock/skills.git",
-      "skillPath": "skills/engineering/tdd/SKILL.md",
-      "skillFolderHash": "75beb3030b4c979205dd771ff85ac600baeb68f4",
-      "pluginName": "mattpocock-skills"
-    },
     "to-issues": {
       "source": "mattpocock/skills",
       "sourceType": "github",
|
|
|
|||
160
skills/engineering/tdd/SKILL.md
Normal file
@@ -0,0 +1,160 @@
# Test-Driven Development (3A)
|
||||
|
||||
## Philosophy
|
||||
|
||||
**Core principle**: Tests should verify behavior through public interfaces, not implementation details. Code can change entirely; tests shouldn't.
|
||||
|
||||
**Good tests** are integration-style: they exercise real code paths through public APIs. They describe _what_ the system does, not _how_ it does it. A good test reads like a specification - "user can checkout with valid cart" tells you exactly what capability exists. These tests survive refactors because they don't care about internal structure.
|
||||
|
||||
**Bad tests** are coupled to implementation. They mock internal collaborators, test private methods, or verify through external means (like querying a database directly instead of using the interface). The warning sign: your test breaks when you refactor, but behavior hasn't changed. If you rename an internal function and tests fail, those tests were testing implementation, not behavior.
|
||||
|
||||
Tests exercise real code paths through public APIs; mocks are a last resort for I/O boundaries only.
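
As a minimal sketch of the difference, assuming a hypothetical `Cart` class (not part of this skill), the first test below verifies behavior through the public interface, while the second reaches into internal state and would break on a harmless refactor:

```python
class Cart:
    """Hypothetical cart: add() and checkout() are the public interface."""

    def __init__(self) -> None:
        self._items = []  # internal detail, free to change shape later

    def add(self, name: str, price_cents: int) -> None:
        self._items.append((name, price_cents))

    def checkout(self) -> int:
        """Return the total charged, in cents."""
        return sum(price for _, price in self._items)


def test_user_can_checkout_with_valid_cart() -> None:
    # Good: reads like a specification and only touches the public API.
    cart = Cart()
    cart.add("book", 1200)
    cart.add("pen", 300)

    total = cart.checkout()

    assert total == 1500


def test_items_list_has_two_tuples() -> None:
    # Bad: coupled to implementation. Switching _items to a dict would
    # fail this test even though checkout() still behaves the same.
    cart = Cart()
    cart.add("book", 1200)
    cart.add("pen", 300)

    assert cart._items == [("book", 1200), ("pen", 300)]
```

Both tests pass today; only the first would keep passing after the internals are restructured.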
|
||||
|
||||
## Anti-Pattern: Horizontal Slices
|
||||
|
||||
**DO NOT write all tests first, then all implementation.** This is "horizontal slicing" - treating RED as "write all tests" and GREEN as "write all code."
|
||||
|
||||
This produces **crap tests**:
|
||||
|
||||
- Tests written in bulk test _imagined_ behavior, not _actual_ behavior
- You end up testing the _shape_ of things (data structures, function signatures) rather than user-facing behavior
- Tests become insensitive to real changes - they pass when behavior breaks, fail when behavior is fine
- You outrun your headlights, committing to test structure before understanding the implementation
|
||||
|
||||
**Correct approach**: Vertical slices via tracer bullets. One test → one implementation → repeat. Each test responds to what you learned from the previous cycle. Because you just wrote the code, you know exactly what behavior matters and how to verify it.
|
||||
|
||||
```
WRONG (horizontal):
  RED:   test1, test2, test3, test4, test5
  GREEN: impl1, impl2, impl3, impl4, impl5

RIGHT (vertical):
  RED→GREEN: test1→impl1
  RED→GREEN: test2→impl2
  RED→GREEN: test3→impl3
  ...
```
|
||||
|
||||
## Workflow
|
||||
|
||||
### 1. Planning
|
||||
|
||||
When exploring the codebase, use the project's domain glossary so that test names and interface vocabulary match the project's language, and respect ADRs in the area you're touching.
|
||||
|
||||
Before writing any code:
|
||||
|
||||
- [ ] Confirm with user what interface changes are needed
- [ ] Confirm with user which behaviors to test (prioritize)
- [ ] Identify opportunities for deep modules (small interface, deep implementation)
- [ ] Design interfaces for testability
- [ ] List the behaviors to test (not implementation steps)
- [ ] Get user approval on the plan
|
||||
|
||||
Ask: "What should the public interface look like? Which behaviors are most important to test?"
|
||||
|
||||
**You can't test everything.** Confirm with the user exactly which behaviors matter most. Focus testing effort on critical paths and complex logic, not every possible edge case.
|
||||
|
||||
### 2. Tracer Bullet
|
||||
|
||||
Write ONE test that confirms ONE thing about the system:
|
||||
|
||||
```
RED:   Write test for first behavior → test fails
GREEN: Write minimal code to pass → test passes
```
|
||||
|
||||
This is your tracer bullet: it proves the path works end-to-end.
|
||||
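
A tracer bullet might look like the sketch below. The `User` type and `map_user` function are hypothetical names chosen for illustration; the point is that one test pins down one behavior and the implementation contains nothing the test does not demand:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    """Hypothetical domain object."""
    id: int
    name: str


def map_user(response: dict) -> User:
    # GREEN: only what the first test needs, nothing speculative.
    return User(id=response["id"], name=response["name"])


def test_maps_id_and_name_from_api_response() -> None:
    # RED came first: this test existed and failed before map_user had a body.
    response = {"id": 7, "name": "Ada"}

    user = map_user(response)

    assert user == User(id=7, name="Ada")
```

Validation, optional fields, and error handling are deliberately absent; they arrive only when a later test asks for them.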
|
||||
### 3. Incremental Loop
|
||||
|
||||
For each remaining behavior:
|
||||
|
||||
```
RED:   Write next test → fails
GREEN: Minimal code to pass → passes
```
|
||||
|
||||
Rules:
|
||||
|
||||
- One test at a time
- Only enough code to pass current test
- Don't anticipate future tests
- Keep tests focused on observable behavior
|
||||
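
The loop can be sketched with a hypothetical `slugify` function after two cycles. The comments mark which cycle introduced each line; this is an illustration under assumed requirements, not a complete slug implementation:

```python
def slugify(title: str) -> str:
    slug = title.lower()           # cycle 1: made test_lowercases_title pass
    slug = slug.replace(" ", "-")  # cycle 2: added only when the next test demanded it
    return slug


def test_lowercases_title() -> None:
    # Cycle 1: first observable behavior.
    assert slugify("Hello") == "hello"


def test_replaces_spaces_with_hyphens() -> None:
    # Cycle 2: written after cycle 1 was green.
    assert slugify("Hello World") == "hello-world"
```

Punctuation stripping or length limits are not handled because no test has asked for them yet.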
|
||||
### 4. Refactor
|
||||
|
||||
After all tests pass, look for refactor candidates:
|
||||
|
||||
- [ ] Extract duplication
- [ ] Deepen modules (move complexity behind simple interfaces)
- [ ] Apply SOLID principles where natural
- [ ] Consider what new code reveals about existing code
- [ ] Run tests after each refactor step
|
||||
|
||||
**Never refactor while RED.** Get to GREEN first.
|
||||
|
||||
## Checklist Per Cycle
|
||||
|
||||
```
[ ] Test describes behavior, not implementation
[ ] Test uses public interface only
[ ] Test would survive internal refactor
[ ] Code is minimal for this test
[ ] No speculative features added
```
|
||||
|
||||
---
|
||||
|
||||
## Test Structure: Arrange-Act-Assert
|
||||
|
||||
Every test must be structured in three clearly separated phases:
|
||||
|
||||
- **Arrange** — set up inputs, state, and dependencies
- **Act** — invoke the unit under test (one call per test)
- **Assert** — verify the single observable outcome
|
||||
|
||||
Keep each phase visually distinct: put a blank line between phases and, where the framework doesn't make the separation obvious, add inline `// Arrange`, `// Act`, and `// Assert` comments.
|
||||
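
A minimal sketch of the three phases, assuming a hypothetical `Account` class and using Python's `#` comment syntax in place of `//`:

```python
class Account:
    """Hypothetical account used only to illustrate the test layout."""

    def __init__(self, balance: int) -> None:
        self._balance = balance

    def withdraw(self, amount: int) -> None:
        self._balance -= amount

    @property
    def balance(self) -> int:
        return self._balance


def test_withdraw_reduces_balance() -> None:
    # Arrange
    account = Account(balance=100)

    # Act
    account.withdraw(30)

    # Assert
    assert account.balance == 70
```

The Act phase is a single call, and the Assert phase checks one observable outcome through the public `balance` property.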
|
||||
---
|
||||
|
||||
## Stub-First Reds
|
||||
|
||||
Before writing a failing test, define stub implementations for all functions, methods, and classes the test will import. Stubs must:
|
||||
|
||||
- Exist at the correct import path
- Have the correct signature
- Raise / throw a "not implemented" error (e.g. `raise NotImplementedError`, `throw new Error('not implemented')`)
|
||||
|
||||
This ensures RED tests fail because the **behavior is missing**, not because of import or attribute errors. A test that errors on import is not a RED test — it is a broken test.
|
||||
|
||||
---
|
||||
|
||||
## Red-Green-Refactor: Stage Gates
|
||||
|
||||
After completing each stage, **stop and wait** before proceeding to the next. At each gate:
|
||||
|
||||
1. **After RED** — present the suggested commit message, run tests to confirm they fail for the right reason (not import/attribute errors), then wait for the user to commit manually.
2. **After GREEN** — present the suggested commit message, confirm all tests pass, then wait for the user to commit manually.
3. **After REFACTOR** — present the suggested commit message, confirm tests still pass and no behavior changed, then wait for the user to commit manually.
|
||||
|
||||
Do not continue to the next stage until the user has committed and explicitly given the go-ahead.
|
||||
|
||||
### Commit Message Format
|
||||
|
||||
One sentence describing the **behavior being delivered**, followed by a stage emoji. The sentence must describe what the system now does, not the TDD action taken.
|
||||
|
||||
| Stage    | Emoji |
|----------|-------|
| Red      | 🟥    |
| Green    | 🟩    |
| Refactor | 🟪    |
|
||||
|
||||
**Examples:**
|
||||
|
||||
```
Map API response to domain object 🟥
Map API response to domain object 🟩
Map API response to domain object 🟪
```
|
||||
|
||||
Not: "wrote failing mapping tests" or "made tests pass" — those describe the process, not the behavior.
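
If you want to generate the suggested message programmatically, a small helper along these lines would do. `format_commit_message` is a hypothetical name, not something this skill provides:

```python
# Stage-to-emoji mapping taken from the table above.
STAGE_EMOJI = {"red": "🟥", "green": "🟩", "refactor": "🟪"}


def format_commit_message(behavior: str, stage: str) -> str:
    """Return '<behavior> <emoji>'; raises KeyError for an unknown stage."""
    return f"{behavior.strip()} {STAGE_EMOJI[stage.lower()]}"
```

The same behavior sentence is reused across all three stages, so only the emoji changes between commits.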
|
||||