Model/domain/sap10_calculator/README.md
Khalim Conn-Kowlessar a7b08a4e8f refactor: move docs/sap-spec/ contents into domain/sap10_calculator/
Locality of reference — SAP-specific docs, specs, and runtime data
now live alongside the calculator that consumes them, mirroring the
prior packages→domain layout moves.

Move targets:

- Narrative MDs → domain/sap10_calculator/docs/
    NEXT_AGENT_PROMPT.md, HANDOVER_NEXT.md, SAP_CALCULATOR.md
- Spec PDFs → domain/sap10_calculator/docs/specs/
    RdSAP 10 Specification 10-06-2025.pdf
    PCDF_Spec_Rev-06b_12_May_2021.pdf
    sap-10-2-full-specification-2025-03-14.pdf
    sap-10-3-full-specification-2026-01-13.pdf
- PCDB runtime data → domain/sap10_calculator/tables/pcdb/data/
    pcdb10.dat (8.3MB) + 7× pcdb_table_*.jsonl (18MB total)

Path code rewrites (load-bearing):

- tables/pcdb/__init__.py: replaced parents[4]/'docs'/'sap-spec' with
  Path(__file__).resolve().parent/'data' for Table 105 JSONL loading.
- tables/pcdb/postcode_weather.py: same rebase for the pcdb10.dat path
  read by _postcode_climate_table().
- tables/pcdb/etl.py __main__: same rebase for the manual ETL invocation
  (source + output_dir both now point inside the package).
- tests/test_pcdb_etl.py: _PCDB_DAT_PATH now derives from
  parents[1]/'tables'/'pcdb'/'data' (was parents[3]/'docs'/'sap-spec').

Citation rewrites:

- 12 .py docstrings and 4 .md docs (ADRs + READMEs + narrative docs)
  had `docs/sap-spec/<file>` strings rewritten to their new locations.
- Two cases where the catch-all sed misfired (an ADR-0009 line about a
  PCDB extract; the pcdb __init__.py docstring about ETL output) were
  hand-corrected to point at tables/pcdb/data/ rather than docs/specs/.

docs/sap-spec/ is now empty (will be removed in a follow-up sweep or
left as a vestigial empty dir for future repurposing). ADRs 0009 and
0010 remain at docs/adr/ — they're part of the chronological
cross-cutting decision log, not calculator-specific narrative.

Verified:

- Calculator's 1e-4 production gate
  (test_api_001479_full_chain_sap_matches_worksheet_pdf_exactly) GREEN.
- Wider sweep (domain/sap10_calculator/ + domain/sap10_ml/): 1654
  passed / 20 failed — exact pre-move baseline. All 20 failures
  pre-existing (10 hand-built skeleton + 4 cohort chain + 6 cohort
  diff).
- Pyright net-zero on the 4 touched runtime/test files (0 errors)
  and unchanged on heat_transmission.py (13) / cert_to_inputs.py (35) /
  mapper.py (33).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-26 13:17:18 +00:00

# SAP calculation domain
Per-section worksheet calculators for SAP 10.2 / RdSAP 10. Each file mirrors a numbered section of the spec; tests live alongside under `worksheet/tests/` and `tests/`.
```
sap/
├── calculator.py                          # top-level orchestrator → SapResult
├── worksheet/
│   ├── dimensions.py                      # §1 Overall dwelling dimensions
│   ├── ventilation.py                     # §2 Ventilation rate (+ RdSAP10 §4.1)
│   ├── heat_transmission.py               # §3 Heat losses & HLP
│   ├── ...                                # §4 onward
│   └── tests/
│       ├── _xlsx_loader.py
│       ├── _elmhurst_fixtures.py          # registry of Elmhurst conformance fixtures
│       ├── _elmhurst_worksheet_NNNNNN.py  # one per worksheet pair
│       └── test_*.py
├── rdsap/                                 # cert → SapInputs cascade (RdSAP10 §5)
└── tables/                                # Table U2 wind, Table 6 walls, Table 21 bridging, …
```
Spec references: `domain/sap10_calculator/docs/specs/sap-10-2-full-specification-2025-03-14.pdf` (SAP 10.2, the active target per ADR-0010), `domain/sap10_calculator/docs/specs/RdSAP 10 Specification 10-06-2025.pdf` (RdSAP cascade). Canonical worked example: `2026-05-19-17-18 RdSap10Worksheet.xlsx` at repo root — loaded by `_xlsx_loader.py`.
**Validation contract.** Per `[[feedback-zero-error-strict]]` the 6 Elmhurst U985 fixtures are deterministic test vectors: every line ref of every output must pin against the U985 PDF at `abs=1e-4`. See `worksheet/tests/test_section_cascade_pins.py` (per-section line refs, 768 rating + 90 demand pins) and `test_e2e_elmhurst_sap_score.py::test_sap_result_pin` (top-level SapResult fields). Tolerances are never widened. **Current state: 930/930 pins green.** The public API + architecture overview lives in `domain/sap10_calculator/docs/SAP_CALCULATOR.md`.
## Adding a new Elmhurst conformance fixture
Each Elmhurst fixture is a real-cert ground-truth: we encode the cert as `EpcPropertyData`, then assert our §1/§2/§3 output matches the lodged worksheet line-by-line. The fixtures act as a regression net for every cert-shape variation (RR, extension, party-wall code, sheltered sides, …) we've seen in the wild.
### Input: one PDF pair per cert
The assessor exports two PDFs from Elmhurst's RdSAP tool:
1. **`Summary_NNNNNN.pdf`** — the assessor's `RdSAP Inputs` form: property type, age band, dimensions, walls, roof, floors, windows, heating, ventilation. This is what we encode as `EpcPropertyData`.
2. **`UXXX-XXXX-NNNNNN.pdf`** — the calculator's full worksheet output: every populated line ref `(1a)..(486)` for the Energy Rating, EPC Costs, and Improved Dwelling variants. The Energy Rating variant (the first section) is canonical for line-ref tests.
`NNNNNN` is the cert's `Full RefNo` — both PDFs must match. Always capture from the **Energy Rating** section, not EPC Costs (the latter uses slightly different wind speeds for the BEDF fuel-price calc).
### Steps
1. **Drop a new fixture module** at `worksheet/tests/_elmhurst_worksheet_NNNNNN.py`. Copy the closest existing fixture as a starting template:
   - 3-storey with room-in-roof → start from `_elmhurst_worksheet_000487.py` (RR + extension + alt wall) or `_elmhurst_worksheet_000477.py` (RR main-only)
   - 2-storey with extension(s) → `_elmhurst_worksheet_000474.py` (Main + 2 ext, no RR) or `_elmhurst_worksheet_000480.py` (Main + 1 ext, with RR)
2. **Mirror the Summary PDF into `build_epc()`** — one `SapBuildingPart` per Main/Extension. Field-by-field correspondence; the docstring at the top of the fixture should call out the source PDF date and the cert's distinguishing features.
3. **Capture every populated worksheet line** as `LINE_NN_*` module-level constants. The cascade pin test (`test_section_cascade_pins.py`) parametrizes over `ALL_FIXTURES` and asserts each line individually at `abs=1e-4` against the actual `<section>_from_cert(epc)` output. Capture every line, scalar and monthly, all the way through §12 — the strict-pin sweep is the work in progress.
4. **Register the fixture** in `_elmhurst_fixtures.py`: add the import and append the module to `ALL_FIXTURES`.
5. **Run the conformance tests**:
```
python -m pytest domain/sap10_calculator/worksheet/tests/ \
-k elmhurst --no-cov -v
```
Each fixture appears 3× (one parametrize per section), pytest id = the cert ref number.
### Mapping the Summary PDF to `EpcPropertyData`
| Summary field | `EpcPropertyData` location | Notes |
|---|---|---|
| `Property type` | `epc.property_type` via `make_minimal_sap10_epc(...)` | drives mid/end/detached defaults |
| `Date Built` (per part) | `SapBuildingPart.construction_age_band` | one-letter A..M |
| `Storeys` | NOT a stored field — sum across `sap_floor_dimensions` + 1 if RR | §2 (9) uses dwelling *height*, not Σ across parts (LINE_9_STOREYS captures this) |
| `Floor Area` / `Room Height` / `Heat Loss Wall Perimeter` / `Party Wall Length` | one `SapFloorDimension` per storey of the part | see *Storey height convention* below |
| `Walls.Type` | `wall_construction` | 3=solid brick, 4=cavity, 5=timber frame, 6=system built |
| `Walls.Insulation` | `wall_insulation_type` | 4=as-built; 2=filled cavity |
| `Party Wall Type` | `party_wall_construction` | see *Party wall U mapping* below |
| `Roof.Type/Insulation/Thickness` | top-level `epc.roofs[0]` `EnergyElement` | RdSAP cascade reads description string |
| `Floors.Type/Insulation` | top-level `epc.floors[0]` | similar pattern |
| `Rooms in Roof` block | `SapBuildingPart.sap_room_in_roof = SapRoomInRoof(floor_area=...)` | see *Room-in-roof handling* |
| `Total Number of Doors` | `door_count=` on `make_minimal_sap10_epc` | |
| `Windows` table (each W×H + area) | one `SapWindow` per row in `epc.sap_windows` | lodge a per-window `u_value` when the cert names a U-value (mixed-glazing fixtures need this for the per-window curtain-resistance transform, slice 22); `make_window(..., u_value=...)` is the canonical helper |
| `Intermittent fans` | fixture constant `INTERMITTENT_FANS` (consumed by §2 test) | |
| `Draught Lobby` / `Draught Proofing %` | fixture constants `HAS_DRAUGHT_LOBBY`, `WINDOW_PCT_DRAUGHT_PROOFED` | |
| `Sheltered Sides` | fixture constant `LINE_19_SHELTERED_SIDES` (also asserted) | |
| `Mechanical Ventilation` | fixture constant `MV_KIND` | default `MechanicalVentilationKind.NATURAL` |
### Worksheet lines to capture
From the Energy Rating section's `1. Overall dwelling characteristics`:
- `LINE_4_TFA_M2` ← line `(4)` Total floor area
- `LINE_5_VOLUME_M3` ← line `(5)` Dwelling volume
From `2. Ventilation rate`:
- Scalars: `LINE_8` through `LINE_21` — every `(N)` line, including the pressure-test override `(18)` and shelter `(19)/(20)/(21)`
- Monthly tuples: `LINE_22_WIND_SPEED_M_S`, `LINE_22A_WIND_FACTOR`, `LINE_22B_WIND_ADJUSTED_ACH`, `LINE_25_EFFECTIVE_ACH` — twelve floats Jan..Dec
From `3. Heat losses and heat loss parameter`:
- `LINE_31_TOTAL_EXTERNAL_AREA_M2` ← `(31)` Σ A external elements (excludes party wall)
- `LINE_33_FABRIC_HEAT_LOSS_W_PER_K` ← `(33)` Σ (A × U) without bridging
- `LINE_36_THERMAL_BRIDGING_W_PER_K` ← `(36)` = y × (31)
- `LINE_37_TOTAL_FABRIC_HEAT_LOSS_W_PER_K` ← `(37)` = (33) + (36)
All four §3 aggregates are now pinned by `test_section_cascade_pins.py::test_section_3_line_refs_match_pdf` at `abs=1e-4`. RR detailed surfaces lodged via `SapRoomInRoof.detailed_surfaces` (slices 13-23) close the room-in-roof breakdown end-to-end for every fixture with detailed §3.10 lodgement (000477, 000480, 000516; 000487 still has the U=0.86 external-gable variant pending spec input).
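
The relationships between the four §3 aggregates listed above can be written out as a small sketch. The function name and the sample numbers are hypothetical; only the identities `(36) = y × (31)` and `(37) = (33) + (36)` come from this README.

```python
def fabric_aggregates(
    total_external_area_m2: float,    # (31)
    fabric_heat_loss_w_per_k: float,  # (33)
    y: float,                         # thermal bridging factor
) -> tuple[float, float]:
    """Return ((36), (37)) from (31), (33) and the bridging factor y."""
    thermal_bridging = y * total_external_area_m2             # (36) = y × (31)
    total_fabric = fabric_heat_loss_w_per_k + thermal_bridging  # (37) = (33) + (36)
    return thermal_bridging, total_fabric


# Illustrative inputs, not taken from any worksheet.
line_36, line_37 = fabric_aggregates(200.0, 150.0, 0.25)
```

When a fixture's `(37)` does not equal `(33) + (36)` as read off the PDF, the likelier cause is a transcription slip in one of the constants than a calculator bug.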
## Gotchas
### Storey height convention (`SapFloorDimension.room_height_m`)
The worksheet's `(2x)` height column includes a +0.25 m floor-structure allowance on every storey **above the lowest**:
- floor=0 (lowest): internal room height as measured
- floor=1 / floor=2 / …: internal room height + 0.25
So a 2.91 m upper-storey internal height appears on the worksheet as 3.16 m. Mirror the worksheet number into the fixture, not the surveyor's tape measurement.
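
A minimal sketch of the convention, useful for sanity-checking a fixture against the surveyor's figures. The helper name is hypothetical and is not part of `dimensions.py`.

```python
FLOOR_STRUCTURE_ALLOWANCE_M = 0.25  # added on every storey above the lowest


def worksheet_storey_height(internal_height_m: float, floor: int) -> float:
    """Height as it appears in the worksheet's (2x) column."""
    if floor == 0:
        return internal_height_m  # lowest storey: as measured
    return internal_height_m + FLOOR_STRUCTURE_ALLOWANCE_M


# The example from the text: a 2.91 m upper storey shows as about 3.16 m.
upper = worksheet_storey_height(2.91, floor=1)
```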
### Room-in-roof
- §1 RdSAP `2.45 m` storey-height convention is hardcoded in `dimensions.py` regardless of any height the RR cert input claims. The worksheet line `(2d)` for an RR storey shows 2.45.
- We encode it as `SapBuildingPart.sap_room_in_roof = SapRoomInRoof(floor_area=..., detailed_surfaces=[...])`, NOT as a third `SapFloorDimension`. The dimensions calculator treats the RR as +1 storey, +floor_area to TFA, +floor_area × 2.45 to volume.
- §3.10 Detailed RR is implemented (slices 13, 16, 23). `SapRoomInRoofSurface` carries `kind` ∈ {`slope`, `flat_ceiling`, `stud_wall`, `gable_wall`}, `area_m2`, optional `insulation_thickness_mm` + `insulation_type`. Slope/flat_ceiling/stud_wall route to roof per Table 17; gable_wall routes to party at U=0.25 per Table 4 "as common wall". The U=0.86 "external gable" variant (000487) is NOT yet implemented — open ticket.
- Simplified Type 1 (RR lodged with only `floor_area`) still works via the spec's `A_RR = 12.5 × √(A_RR_floor/1.5)` formula at `u_rr_default_all_elements` (Table 18 col 4). Detailed lodgement supersedes when present.
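
The arithmetic in the bullets above can be sketched as follows. Function names are hypothetical and this is not the code in `dimensions.py`; the 2.45 m height and the `12.5 × √(A_RR_floor/1.5)` formula are taken from the text as stated.

```python
import math

RR_STOREY_HEIGHT_M = 2.45  # RdSAP convention, regardless of the lodged height


def rr_dimension_contribution(floor_area_m2: float) -> tuple[float, float]:
    """Return (extra TFA in m², extra volume in m³) contributed by the RR."""
    return floor_area_m2, floor_area_m2 * RR_STOREY_HEIGHT_M


def simplified_rr_area(floor_area_m2: float) -> float:
    """Simplified Type 1 area: A_RR = 12.5 × sqrt(A_RR_floor / 1.5)."""
    return 12.5 * math.sqrt(floor_area_m2 / 1.5)


# Illustrative floor area, chosen so the square root is exact.
extra_tfa, extra_volume = rr_dimension_contribution(24.0)
a_rr = simplified_rr_area(24.0)  # sqrt(16) = 4, so 12.5 × 4 = 50.0
```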
### Party wall U mapping
`party_wall_construction` integer codes resolve via `domain.sap10_ml.rdsap_uvalues.u_party_wall`:
- `0` (Unknown / "Unable to determine") → 0.25 W/m²K
- `1` (Stone granite) / `3` (Solid brick) / `5` (Timber frame) / `6` (System built) → 0.0
- `4` (Cavity, unfilled) → 0.5
Cross-check against the worksheet's `Party walls Main` row in §3 — that's the authoritative U for the cert.
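
For quick reference while encoding a fixture, the codes listed above can be held in a lookup table. This is a sketch mirroring the list, not the implementation of `u_party_wall`; codes the list does not mention raise rather than guess.

```python
# U-values in W/m²K, keyed by party_wall_construction code (from the list above).
PARTY_WALL_U = {
    0: 0.25,  # Unknown / "Unable to determine"
    1: 0.0,   # Stone granite
    3: 0.0,   # Solid brick
    4: 0.5,   # Cavity, unfilled
    5: 0.0,   # Timber frame
    6: 0.0,   # System built
}


def party_wall_u(code: int) -> float:
    """Look up the party-wall U-value; unlisted codes are an error."""
    try:
        return PARTY_WALL_U[code]
    except KeyError:
        raise ValueError(f"party_wall_construction code {code} is not listed") from None
```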
### Sheltered sides drives shelter factor
`(19)` varies per cert and the chain `(20) = 1 - 0.075 × (19)`, `(21) = (18) × (20)` propagates through every monthly `(22b)/(25)`. Read straight from the cert's `Sheltered Sides` field; not derivable from property type alone.
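
The chain is short enough to write out. The function names below are hypothetical; the two formulas are the ones quoted above.

```python
def shelter_factor(sheltered_sides: int) -> float:
    """(20) = 1 - 0.075 × (19)."""
    return 1.0 - 0.075 * sheltered_sides


def shelter_adjusted_rate(line_18: float, sheltered_sides: int) -> float:
    """(21) = (18) × (20)."""
    return line_18 * shelter_factor(sheltered_sides)


# Illustrative: two sheltered sides give a factor of about 0.85.
factor = shelter_factor(2)
rate = shelter_adjusted_rate(1.0, 2)
```

Because `(21)` feeds every monthly `(22b)`, a wrong `Sheltered Sides` value shows up as twelve failing monthly pins rather than one.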
### `(12)` suspended-timber-floor quirk
Some Elmhurst certs list a suspended timber floor on the inputs but lodge `(12) = 0.0` in the worksheet. Mirror the worksheet, not the cert input: set `HAS_SUSPENDED_TIMBER_FLOOR=False` to get `(12)=0`. The `SUSPENDED_TIMBER_FLOOR_SEALED` flag only switches between `0.2` (unsealed) and `0.1` (sealed); it does not zero out the contribution. The flag-to-value mapping implemented at `ventilation.py:185`:
| `has_suspended_timber_floor` | `..._sealed` | resulting `(12)` |
|---|---|---|
| `False` | (any) | `0.0` |
| `True` | `False` | `0.2` |
| `True` | `True` | `0.1` |
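
The same table as a function, as a sketch only: the name `line_12` is hypothetical and the real logic lives in `ventilation.py`.

```python
def line_12(has_suspended_timber_floor: bool, sealed: bool) -> float:
    """Floor infiltration contribution (12) for the two fixture flags."""
    if not has_suspended_timber_floor:
        return 0.0  # the sealed flag is ignored entirely in this case
    return 0.1 if sealed else 0.2
```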
### Effective monthly ACH `(25)` formula
`(25)` differs from `(22b)` whenever `(22b) < 1.0`:
```
(25) = (22b) if (22b) ≥ 1.0
(25) = 0.5 + (22b)² × 0.5 otherwise
```
Don't try to compute it — read both `(22b)` and `(25)` straight off the worksheet and assert on both. The formula's here just so you recognise why they differ on tightly-sealed homes.
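
For recognising the pattern on a worksheet, the piecewise rule above translates directly. The function name is hypothetical, and the fixtures should still pin both lines from the PDF rather than call anything like this.

```python
def effective_ach(line_22b: float) -> float:
    """(25) from (22b), using the piecewise rule quoted above."""
    if line_22b >= 1.0:
        return line_22b
    return 0.5 + line_22b**2 * 0.5


# Illustrative: a tight home at (22b) = 0.6 gets 0.5 + 0.36 × 0.5, about 0.68.
tight = effective_ach(0.6)
leaky = effective_ach(1.2)
```

The two branches meet at `(22b) = 1.0`, where both give 1.0, so the curve has no jump.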
### Wind speeds: Energy Rating vs EPC Costs
The same cert prints two `Wind speed (22)` tables — one in `CALCULATION OF ENERGY RATING`, one in `CALCULATION OF EPC COSTS, EMISSIONS AND PRIMARY ENERGY`. They differ (the latter is the BEDF-prices variant). Always capture from the Energy Rating section; that's what `ventilation_from_inputs(...)` calibrates against. The non-regional Table U2 default values are `5.1, 5.0, 4.9, 4.4, 4.3, 3.8, 3.8, 3.7, 4.0, 4.3, 4.5, 4.7`.
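
As a cross-check on captured monthly tuples, the wind lines can be derived from the defaults above. The relations `(22a) = (22) ÷ 4` and `(22b) = (21) × (22a)` are taken from the SAP worksheet rather than from this README, and the function names are hypothetical, not the API of `ventilation.py`.

```python
# Non-regional Table U2 defaults quoted above, Jan..Dec, in m/s.
TABLE_U2_DEFAULT_WIND_M_S = (
    5.1, 5.0, 4.9, 4.4, 4.3, 3.8, 3.8, 3.7, 4.0, 4.3, 4.5, 4.7,
)


def wind_factors(wind_speeds_m_s: tuple[float, ...]) -> tuple[float, ...]:
    """(22a): monthly wind speed divided by 4."""
    return tuple(v / 4.0 for v in wind_speeds_m_s)


def wind_adjusted_ach(line_21: float, wind_speeds_m_s: tuple[float, ...]) -> tuple[float, ...]:
    """(22b): shelter-adjusted rate (21) scaled by each month's wind factor."""
    return tuple(line_21 * f for f in wind_factors(wind_speeds_m_s))


factors = wind_factors(TABLE_U2_DEFAULT_WIND_M_S)
monthly_ach = wind_adjusted_ach(0.8, TABLE_U2_DEFAULT_WIND_M_S)  # 0.8 is illustrative
```

If the `(22a)` values derived this way do not match the captured `LINE_22A_WIND_FACTOR`, the wind speeds were probably copied from the EPC Costs table instead of the Energy Rating one.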