Model/domain/sap10_calculator/README.md
Khalim Conn-Kowlessar a7b08a4e8f refactor: move docs/sap-spec/ contents into domain/sap10_calculator/
Locality of reference — SAP-specific docs, specs, and runtime data
now live alongside the calculator that consumes them, mirroring the
prior packages→domain layout moves.

Move targets:

- Narrative MDs → domain/sap10_calculator/docs/
    NEXT_AGENT_PROMPT.md, HANDOVER_NEXT.md, SAP_CALCULATOR.md
- Spec PDFs → domain/sap10_calculator/docs/specs/
    RdSAP 10 Specification 10-06-2025.pdf
    PCDF_Spec_Rev-06b_12_May_2021.pdf
    sap-10-2-full-specification-2025-03-14.pdf
    sap-10-3-full-specification-2026-01-13.pdf
- PCDB runtime data → domain/sap10_calculator/tables/pcdb/data/
    pcdb10.dat (8.3MB) + 7× pcdb_table_*.jsonl (18MB total)

Path code rewrites (load-bearing):

- tables/pcdb/__init__.py: replaced parents[4]/'docs'/'sap-spec' with
  Path(__file__).resolve().parent/'data' for Table 105 JSONL loading.
- tables/pcdb/postcode_weather.py: same rebase for the pcdb10.dat path
  read by _postcode_climate_table().
- tables/pcdb/etl.py __main__: same rebase for the manual ETL invocation
  (source + output_dir both now point inside the package).
- tests/test_pcdb_etl.py: _PCDB_DAT_PATH now derives from
  parents[1]/'tables'/'pcdb'/'data' (was parents[3]/'docs'/'sap-spec').

Citation rewrites:

- 12 .py docstrings and 4 .md docs (ADRs + READMEs + narrative docs)
  had `docs/sap-spec/<file>` strings rewritten to their new locations.
- Two cases where the catch-all sed misfired (an ADR-0009 line about a
  PCDB extract; the pcdb __init__.py docstring about ETL output) were
  hand-corrected to point at tables/pcdb/data/ rather than docs/specs/.

docs/sap-spec/ is now empty (will be removed in a follow-up sweep or
left as a vestigial empty dir for future repurposing). ADRs 0009 and
0010 remain at docs/adr/ — they're part of the chronological
cross-cutting decision log, not calculator-specific narrative.

Verified:

- Calculator's 1e-4 production gate
  (test_api_001479_full_chain_sap_matches_worksheet_pdf_exactly) GREEN.
- Wider sweep (domain/sap10_calculator/ + domain/sap10_ml/): 1654
  passed / 20 failed — exact pre-move baseline. All 20 failures
  pre-existing (10 hand-built skeleton + 4 cohort chain + 6 cohort
  diff).
- Pyright net-zero on the 4 touched runtime/test files (0 errors)
  and unchanged on heat_transmission.py (13) / cert_to_inputs.py (35) /
  mapper.py (33).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-26 13:17:18 +00:00


SAP calculation domain

Per-section worksheet calculators for SAP 10.2 / RdSAP 10. Each file mirrors a numbered section of the spec; tests live alongside under worksheet/tests/ and tests/.

sap/
├── calculator.py            # top-level orchestrator → SapResult
├── worksheet/
│   ├── dimensions.py         # §1 Overall dwelling dimensions
│   ├── ventilation.py        # §2 Ventilation rate (+ RdSAP10 §4.1)
│   ├── heat_transmission.py  # §3 Heat losses & HLP
│   ├── ...                   # §4 onward
│   └── tests/
│       ├── _xlsx_loader.py
│       ├── _elmhurst_fixtures.py        # registry of Elmhurst conformance fixtures
│       ├── _elmhurst_worksheet_NNNNNN.py # one per worksheet pair
│       └── test_*.py
├── rdsap/                   # cert → SapInputs cascade (RdSAP10 §5)
└── tables/                  # Table U2 wind, Table 6 walls, Table 21 bridging, …

Spec references: domain/sap10_calculator/docs/specs/sap-10-2-full-specification-2025-03-14.pdf (SAP 10.2, the active target per ADR-0010), domain/sap10_calculator/docs/specs/RdSAP 10 Specification 10-06-2025.pdf (RdSAP cascade). Canonical worked example: 2026-05-19-17-18 RdSap10Worksheet.xlsx at repo root — loaded by _xlsx_loader.py.

Validation contract. Per [[feedback-zero-error-strict]] the 6 Elmhurst U985 fixtures are deterministic test vectors: every line ref of every output must pin against the U985 PDF at abs=1e-4. See worksheet/tests/test_section_cascade_pins.py (per-section line refs, 768 rating + 90 demand pins) and test_e2e_elmhurst_sap_score.py::test_sap_result_pin (top-level SapResult fields). Tolerances are never widened. Current state: 930/930 pins green. The public API + architecture overview lives in domain/sap10_calculator/docs/SAP_CALCULATOR.md.

Adding a new Elmhurst conformance fixture

Each Elmhurst fixture is a real-cert ground-truth: we encode the cert as EpcPropertyData, then assert our §1/§2/§3 output matches the lodged worksheet line-by-line. The fixtures act as a regression net for every cert-shape variation (RR, extension, party-wall code, sheltered sides, …) we've seen in the wild.

Input: one PDF pair per cert

The assessor exports two PDFs from Elmhurst's RdSAP tool:

  1. Summary_NNNNNN.pdf — the assessor's RdSAP Inputs form: property type, age band, dimensions, walls, roof, floors, windows, heating, ventilation. This is what we encode as EpcPropertyData.
  2. UXXX-XXXX-NNNNNN.pdf — the calculator's full worksheet output: every populated line ref (1a)..(486) for the Energy Rating, EPC Costs, and Improved Dwelling variants. The Energy Rating variant (the first section) is canonical for line-ref tests.

NNNNNN is the cert's Full RefNo — both PDFs must match. Always capture from the Energy Rating section, not EPC Costs (the latter uses slightly different wind speeds for the BEDF fuel-price calc).

Steps

  1. Drop a new fixture module at worksheet/tests/_elmhurst_worksheet_NNNNNN.py. Copy the closest existing fixture as a starting template:

    • 3-storey with room-in-roof → start from _elmhurst_worksheet_000487.py (RR + extension + alt wall) or _elmhurst_worksheet_000477.py (RR main-only)
    • 2-storey with extension(s) → _elmhurst_worksheet_000474.py (Main + 2 ext, no RR) or _elmhurst_worksheet_000480.py (Main + 1 ext, with RR)
  2. Mirror the Summary PDF into build_epc() — one SapBuildingPart per Main/Extension. Field-by-field correspondence; the docstring at the top of the fixture should call out the source PDF date and the cert's distinguishing features.

  3. Capture every populated worksheet line as LINE_NN_* module-level constants. The cascade pin test (test_section_cascade_pins.py) parametrizes over ALL_FIXTURES and asserts each line individually at abs=1e-4 against the actual <section>_from_cert(epc) output. Capture every line, scalar and monthly, all the way through §12 — the strict-pin sweep is the work in progress.

  4. Register the fixture in _elmhurst_fixtures.py: add the import and append the module to ALL_FIXTURES.

  5. Run the conformance tests:

    python -m pytest domain/sap10_calculator/worksheet/tests/ \
        -k elmhurst --no-cov -v
    

    Each fixture appears 3× in the output (one parametrized case per section); the pytest id is the cert ref number.
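The pin check itself can be pictured with a minimal, self-contained sketch. This is not the real `test_section_cascade_pins.py`: `pin_mismatches`, the `SimpleNamespace` stand-in for a fixture module, and the numbers are all hypothetical and chosen only to illustrate the abs=1e-4 comparison over scalar and monthly `LINE_*` constants.

```python
import math
from types import SimpleNamespace

ABS_TOL = 1e-4  # the pin tolerance quoted in this README


def pin_mismatches(fixture, computed):
    """Return (name, expected, actual) for every LINE_* constant that misses its pin.

    `fixture` is any object exposing LINE_* attributes (hypothetical stand-in
    for a fixture module); `computed` maps the same names to calculator output.
    """
    misses = []
    for name in sorted(vars(fixture)):
        if not name.startswith("LINE_"):
            continue
        expected = getattr(fixture, name)
        actual = computed[name]
        # Wrap scalars in 1-tuples so monthly tuples take the same path.
        exp_seq = expected if isinstance(expected, tuple) else (expected,)
        act_seq = actual if isinstance(actual, tuple) else (actual,)
        ok = len(exp_seq) == len(act_seq) and all(
            math.isclose(e, a, rel_tol=0.0, abs_tol=ABS_TOL)
            for e, a in zip(exp_seq, act_seq)
        )
        if not ok:
            misses.append((name, expected, actual))
    return misses


# Illustrative values only; they are not taken from any real fixture.
fixture = SimpleNamespace(LINE_4_TFA_M2=84.5, LINE_22_WIND_SPEED_M_S=(5.1, 5.0))
computed = {"LINE_4_TFA_M2": 84.50004, "LINE_22_WIND_SPEED_M_S": (5.1, 5.2)}

for name, expected, actual in pin_mismatches(fixture, computed):
    print(f"{name}: expected {expected}, got {actual}")
```

Here the scalar passes (it is within 1e-4) and the monthly tuple fails on its second entry, which is the behaviour a strict per-line pin is meant to surface.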

Mapping the Summary PDF to EpcPropertyData

| Summary field | EpcPropertyData location | Notes |
|---|---|---|
| Property type | epc.property_type via make_minimal_sap10_epc(...) | drives mid/end/detached defaults |
| Date Built (per part) | SapBuildingPart.construction_age_band | one-letter A..M |
| Storeys | NOT a stored field; sum across sap_floor_dimensions, + 1 if RR | §2 (9) uses dwelling height, not Σ across parts (LINE_9_STOREYS captures this) |
| Floor Area / Room Height / Heat Loss Wall Perimeter / Party Wall Length | one SapFloorDimension per storey of the part | see Storey height convention below |
| Walls.Type | wall_construction | 3=solid brick, 4=cavity, 5=timber frame, 6=system built |
| Walls.Insulation | wall_insulation_type | 4=as-built; 2=filled cavity |
| Party Wall Type | party_wall_construction | see Party wall U mapping below |
| Roof.Type/Insulation/Thickness | top-level epc.roofs[0] EnergyElement | RdSAP cascade reads description string |
| Floors.Type/Insulation | top-level epc.floors[0] | similar pattern |
| Rooms in Roof block | SapBuildingPart.sap_room_in_roof = SapRoomInRoof(floor_area=...) | see Room-in-roof handling |
| Total Number of Doors | door_count= on make_minimal_sap10_epc | |
| Windows table (each W×H + area) | one SapWindow per row in epc.sap_windows, with per-window u_value lodged when the cert names a U-value (mixed-glazing fixtures need this for the per-window curtain-resistance transform, slice 22) | make_window(..., u_value=...) is the canonical helper |
| Intermittent fans | fixture constant INTERMITTENT_FANS | consumed by §2 test |
| Draught Lobby / Draught Proofing % | fixture constants HAS_DRAUGHT_LOBBY, WINDOW_PCT_DRAUGHT_PROOFED | |
| Sheltered Sides | fixture constant LINE_19_SHELTERED_SIDES | also asserted |
| Mechanical Ventilation | fixture constant MV_KIND | default MechanicalVentilationKind.NATURAL |

Worksheet lines to capture

From the Energy Rating section's 1. Overall dwelling characteristics:

  • LINE_4_TFA_M2 ← line (4) Total floor area
  • LINE_5_VOLUME_M3 ← line (5) Dwelling volume

From 2. Ventilation rate:

  • Scalars: LINE_8 through LINE_21 — every (N) line, including the pressure-test override (18) and shelter (19)/(20)/(21)
  • Monthly tuples: LINE_22_WIND_SPEED_M_S, LINE_22A_WIND_FACTOR, LINE_22B_WIND_ADJUSTED_ACH, LINE_25_EFFECTIVE_ACH — twelve floats Jan..Dec

From 3. Heat losses and heat loss parameter:

  • LINE_31_TOTAL_EXTERNAL_AREA_M2 ← line (31) Σ A external elements (excludes party wall)
  • LINE_33_FABRIC_HEAT_LOSS_W_PER_K ← line (33) Σ (A × U) without bridging
  • LINE_36_THERMAL_BRIDGING_W_PER_K ← line (36) = y × (31)
  • LINE_37_TOTAL_FABRIC_HEAT_LOSS_W_PER_K ← line (37) = (33) + (36)

All four §3 aggregates are now pinned by test_section_cascade_pins.py::test_section_3_line_refs_match_pdf at abs=1e-4. RR detailed surfaces lodged via SapRoomInRoof.detailed_surfaces (slices 13–23) close the room-in-roof breakdown end-to-end for every fixture with detailed §3.10 lodgement (000477, 000480, 000516; 000487 still has the U=0.86 external-gable variant pending spec input).
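The relationships between the four §3 aggregates can be written down directly. The sketch below only restates the line-ref arithmetic listed above; the function names are hypothetical (they are not the API of heat_transmission.py) and the inputs are made-up illustrative numbers.

```python
def thermal_bridging_w_per_k(y, total_external_area_m2):
    """Line (36) = y × (31)."""
    return y * total_external_area_m2


def total_fabric_heat_loss_w_per_k(fabric_heat_loss_w_per_k, thermal_bridging):
    """Line (37) = (33) + (36)."""
    return fabric_heat_loss_w_per_k + thermal_bridging


# Illustrative inputs only, not taken from a real worksheet.
line_31 = 200.0   # Σ A of external elements, m²
line_33 = 150.0   # Σ (A × U) without bridging, W/K
y = 0.15          # thermal bridging factor, W/m²K

line_36 = thermal_bridging_w_per_k(y, line_31)
line_37 = total_fabric_heat_loss_w_per_k(line_33, line_36)
print(round(line_36, 4), round(line_37, 4))
```

When a fixture's (37) fails its pin, checking (31), (33), and (36) in that order usually shows which input drifted.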

Gotchas

Storey height convention (SapFloorDimension.room_height_m)

The worksheet's (2x) height column includes a +0.25 m floor-structure allowance on every storey above the lowest:

  • floor=0 (lowest): internal room height as measured
  • floor=1 / floor=2 / …: internal room height + 0.25

So a 2.91 m upper-storey internal height appears on the worksheet as 3.16 m. Mirror the worksheet number into the fixture, not the surveyor's tape measurement.
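As a quick sanity check when transcribing heights, the convention can be sketched as a helper. `worksheet_room_height_m` is a hypothetical name used only for illustration; the fixtures themselves store the worksheet number directly.

```python
FLOOR_STRUCTURE_ALLOWANCE_M = 0.25  # added on every storey above the lowest


def worksheet_room_height_m(internal_height_m, floor):
    """Height as printed in the worksheet's (2x) column for a given storey index."""
    if floor < 0:
        raise ValueError("floor index must be 0 (lowest) or greater")
    if floor == 0:
        return internal_height_m
    return internal_height_m + FLOOR_STRUCTURE_ALLOWANCE_M


# The README's own example: 2.91 m measured upstairs appears as 3.16 m.
print(round(worksheet_room_height_m(2.91, floor=1), 2))
```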

Room-in-roof

  • §1 RdSAP 2.45 m storey-height convention is hardcoded in dimensions.py regardless of any height the RR cert input claims. The worksheet line (2d) for an RR storey shows 2.45.
  • We encode it as SapBuildingPart.sap_room_in_roof = SapRoomInRoof(floor_area=..., detailed_surfaces=[...]), NOT as a third SapFloorDimension. The dimensions calculator treats the RR as +1 storey, +floor_area to TFA, +floor_area × 2.45 to volume.
  • §3.10 Detailed RR is implemented (slices 13, 16, 23). SapRoomInRoofSurface carries kind ∈ {slope, flat_ceiling, stud_wall, gable_wall}, area_m2, optional insulation_thickness_mm + insulation_type. Slope/flat_ceiling/stud_wall route to roof per Table 17; gable_wall routes to party at U=0.25 per Table 4 "as common wall". The U=0.86 "external gable" variant (000487) is NOT yet implemented — open ticket.
  • Simplified Type 1 (RR lodged with only floor_area) still works via the spec's A_RR = 12.5 × √(A_RR_floor/1.5) formula at u_rr_default_all_elements (Table 18 col 4). Detailed lodgement supersedes when present.
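The two pieces of room-in-roof arithmetic described above can be sketched as follows. Both functions are hypothetical illustrations rather than the code in dimensions.py, and the simplified-area formula is transcribed from this README as written, not checked against the spec here.

```python
import math

RR_STOREY_HEIGHT_M = 2.45  # RdSAP convention used for the RR storey


def add_room_in_roof(storeys, tfa_m2, volume_m3, rr_floor_area_m2):
    """Apply the RR contribution: +1 storey, +area to TFA, +area × 2.45 to volume."""
    return (
        storeys + 1,
        tfa_m2 + rr_floor_area_m2,
        volume_m3 + rr_floor_area_m2 * RR_STOREY_HEIGHT_M,
    )


def simplified_rr_area_m2(rr_floor_area_m2):
    """Simplified Type 1 area as quoted above: 12.5 × √(A_RR_floor / 1.5)."""
    return 12.5 * math.sqrt(rr_floor_area_m2 / 1.5)


# Illustrative dwelling: 2 storeys, 80 m² TFA, 200 m³, with a 20 m² room-in-roof.
storeys, tfa, volume = add_room_in_roof(2, 80.0, 200.0, 20.0)
print(storeys, round(tfa, 2), round(volume, 2))
```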

Party wall U mapping

party_wall_construction integer codes resolve via domain.sap10_ml.rdsap_uvalues.u_party_wall:

  • 0 (Unknown / "Unable to determine") → 0.25 W/m²K
  • 1 (Stone granite) / 3 (Solid brick) / 5 (Timber frame) / 6 (System built) → 0.0
  • 4 (Cavity, unfilled) → 0.5

Cross-check against the worksheet's Party walls Main row in §3 — that's the authoritative U for the cert.
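For reference while encoding a fixture, the codes listed above can be held in a small lookup. This is a sketch of the mapping as documented here, not the body of `u_party_wall`; `party_wall_u` is a hypothetical name, and codes this README does not list are rejected rather than guessed.

```python
# U-values in W/m²K, keyed by party_wall_construction code (only codes listed above).
PARTY_WALL_U = {
    0: 0.25,  # Unknown / "Unable to determine"
    1: 0.0,   # Stone granite
    3: 0.0,   # Solid brick
    4: 0.5,   # Cavity, unfilled
    5: 0.0,   # Timber frame
    6: 0.0,   # System built
}


def party_wall_u(code):
    """Look up the documented party-wall U-value, failing loudly on unlisted codes."""
    try:
        return PARTY_WALL_U[code]
    except KeyError:
        raise ValueError(f"party wall code {code!r} is not covered by this table") from None


print(party_wall_u(4))
```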

Sheltered sides drives shelter factor

(19) varies per cert and the chain (20) = 1 - 0.075 × (19), (21) = (18) × (20) propagates through every monthly (22b)/(25). Read straight from the cert's Sheltered Sides field; not derivable from property type alone.
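The chain from (19) to (21) is short enough to sketch in full. The function names are hypothetical and the inputs illustrative; the real values should still be read off the worksheet.

```python
def shelter_factor(sheltered_sides):
    """Line (20) = 1 - 0.075 × (19)."""
    return 1.0 - 0.075 * sheltered_sides


def shelter_adjusted_ach(line_18, sheltered_sides):
    """Line (21) = (18) × (20)."""
    return line_18 * shelter_factor(sheltered_sides)


# Illustrative: 2 sheltered sides and an infiltration rate (18) of 1.0 ach.
print(round(shelter_factor(2), 4), round(shelter_adjusted_ach(1.0, 2), 4))
```

Because (21) scales every monthly (22b), a wrong Sheltered Sides count shifts all twelve months by the same ratio, which is a useful signature when debugging a failing fixture.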

(12) suspended-timber-floor quirk

Some Elmhurst certs list a suspended timber floor on the inputs but lodge (12) = 0.0 in the worksheet. Mirror the worksheet, not the cert input: set HAS_SUSPENDED_TIMBER_FLOOR=False to get (12)=0. The SUSPENDED_TIMBER_FLOOR_SEALED flag only switches between 0.2 (unsealed) and 0.1 (sealed); it does not zero out the contribution. The =True/=False mapping in ventilation.py:185:

| has_suspended_timber_floor | ..._sealed | resulting (12) |
|---|---|---|
| False | (any) | 0.0 |
| True | False | 0.2 |
| True | True | 0.1 |
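The same mapping, expressed as a function for illustration. `line_12_floor_infiltration` is a hypothetical name and this is not the code at ventilation.py:185; it only restates the three documented outcomes.

```python
def line_12_floor_infiltration(has_suspended_timber_floor, sealed):
    """Return the (12) contribution for the documented flag combinations."""
    if not has_suspended_timber_floor:
        return 0.0  # the sealed flag is ignored when there is no suspended timber floor
    return 0.1 if sealed else 0.2


for has_floor, sealed in [(False, False), (False, True), (True, False), (True, True)]:
    print(has_floor, sealed, line_12_floor_infiltration(has_floor, sealed))
```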

Effective monthly ACH (25) formula

Not equal to (22b) when (22b) < 1.0:

(25) = (22b)                              if (22b) ≥ 1.0
(25) = 0.5 + (22b)² × 0.5                  otherwise

Don't try to compute it — read both (22b) and (25) straight off the worksheet and assert on both. The formula's here just so you recognise why they differ on tightly-sealed homes.
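For recognising the pattern on a worksheet, the piecewise rule can be sketched as below. `effective_ach` is a hypothetical helper for eyeballing values, not a replacement for pinning both lines from the PDF.

```python
def effective_ach(line_22b):
    """Line (25) from (22b) using the piecewise rule quoted above."""
    if line_22b >= 1.0:
        return line_22b
    return 0.5 + line_22b ** 2 * 0.5


# Illustrative monthly values: a leaky month and a tightly sealed month.
for ach in (1.2, 0.6):
    print(ach, round(effective_ach(ach), 4))
```

For 0.6 ach the rule gives 0.5 + 0.36 × 0.5 = 0.68, which is why (25) sits above (22b) on tightly sealed homes.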

Wind speeds: Energy Rating vs EPC Costs

The same cert prints two Wind speed (22) tables — one in CALCULATION OF ENERGY RATING, one in CALCULATION OF EPC COSTS, EMISSIONS AND PRIMARY ENERGY. They differ (the latter is the BEDF-prices variant). Always capture from the Energy Rating section; that's what ventilation_from_inputs(...) calibrates against. The non-regional Table U2 default values are 5.1, 5.0, 4.9, 4.4, 4.3, 3.8, 3.8, 3.7, 4.0, 4.3, 4.5, 4.7.
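To see how the monthly wind speeds feed the lines captured earlier, here is a small sketch using the default values quoted above. The relations (22a) = (22) ÷ 4 and (22b) = (21) × (22a) are taken from the SAP worksheet rather than from this README, and the function names are hypothetical, not the signature of `ventilation_from_inputs(...)`.

```python
# Non-regional Table U2 defaults quoted above, Jan..Dec, in m/s.
TABLE_U2_DEFAULT_WIND_M_S = (5.1, 5.0, 4.9, 4.4, 4.3, 3.8, 3.8, 3.7, 4.0, 4.3, 4.5, 4.7)


def wind_factors(wind_speeds_m_s):
    """Line (22a): monthly wind speed divided by 4."""
    return tuple(v / 4.0 for v in wind_speeds_m_s)


def wind_adjusted_ach(line_21, wind_speeds_m_s):
    """Line (22b): shelter-adjusted rate (21) scaled by each month's wind factor."""
    return tuple(line_21 * f for f in wind_factors(wind_speeds_m_s))


# Illustrative (21) of 0.85 ach with the default wind table.
monthly = wind_adjusted_ach(0.85, TABLE_U2_DEFAULT_WIND_M_S)
print([round(m, 4) for m in monthly])
```

If a fixture's monthly (22a) values do not match these defaults divided by 4, the (22) table was probably copied from the EPC Costs section instead of the Energy Rating section.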