Model/domain/sap10_calculator
Khalim Conn-Kowlessar 69668ec634 Slice S0380.16: add 'Normal' → cylinder_size=2 (110 L) for cohort 2
Unblocks two 38-cert-cohort certs that previously raised
`UnmappedElmhurstLabel("cylinder_size", 'Normal')` at extraction:
  cert 2536-2525-0600-0788-2292  ws SAP=79.7264
  cert 9421-3045-3205-1646-6200  ws SAP=87.4495

Both Summary §15.1 lodgements read "Cylinder Size: Normal"; both dr87
worksheets lodge line ref (47) "Store volume = 110.0000" L (extracted
from `Hot Water Cylinder → Cylinder Volume 110.00`). RdSAP 10 §10.5
Table 28 documents the "Normal (90-130 litres)" descriptor whose
midpoint is 110 L — the canonical Elmhurst label string in
`datatypes/epc/surveys/elmhurst_site_notes.py` is "Normal (90-130
litres)", and the worksheet's exact 110 L matches the midpoint.

Two-line fix:
  +    "Normal": 2,           in `_ELMHURST_CYLINDER_SIZE_LABEL_TO_SAP10`
  +    2: 110.0,              in `_CYLINDER_SIZE_CODE_TO_LITRES`

The cascade enum 2 is consistent with the existing
`cert_to_inputs.py` docstring's documented (but not-yet-observed)
code 2 → Normal slot, alongside code 3 (Medium / 160 L) and code 4
(Large / 210 L) added in earlier slices.

Slice keeps tight: two mapping unit tests pinning `cylinder_size == 2`
for both certs at extraction. Post-fix the first-attempt cascade
deltas vs worksheet are:
  cert 2536  Δ +0.0244   (was: RAISES)
  cert 9421  Δ +0.0296   (was: RAISES)

Both deltas now sit in the same systematic +0.02..+0.07 small-gap
band as ~12 other first-attempt certs in cohort 2 — chain test +
±0.07 pin would just paper over a known systematic residual that the
user has explicitly asked to drive towards 1e-4, not toward ±0.07.
Following slice will investigate the shared systematic offset and
close cert 2536 / 9421 along with the rest of the +0.04 band on
the chain.

Pyright net-zero per file:
  - datatypes/epc/domain/mapper.py: 32 (baseline 32)
  - domain/sap10_calculator/rdsap/cert_to_inputs.py: 35 (baseline 35)
  - backend/documents_parser/tests/test_summary_pdf_mapper_chain.py: 0

Regression baseline: 691 pass + 10 fail (= prior 689 + 10 + 2 new GREEN).

Spec refs:
- RdSAP 10 §10.5 Table 28 — "Cylinder Volume" Normal band 90-130 L,
  midpoint 110 L (also the canonical Elmhurst label suffix).
- Cert 2536 worksheet `dr87-0001-000889.pdf` line ref (47) = 110.0000.
- Cert 9421 worksheet `dr87-0001-000884.pdf` line ref (47) = 110.0000.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-01 16:28:46 +00:00
climate refactor: lift-and-shift packages/domain/src/domain/sap → domain/sap10_calculator 2026-05-26 12:22:37 +00:00
docs docs: handover — Summary + API cohort expansion to 38 additional certs 2026-06-01 16:28:46 +00:00
rdsap Slice S0380.16: add 'Normal' → cylinder_size=2 (110 L) for cohort 2 2026-06-01 16:28:46 +00:00
tables Slice 102f-prep.1: PCDB Table 362 heating_duration_code field 2026-06-01 16:28:46 +00:00
tests Slice 102f-prep.1: PCDB Table 362 heating_duration_code field 2026-06-01 16:28:46 +00:00
validation refactor: lift-and-shift packages/domain/src/domain/sap → domain/sap10_calculator 2026-05-26 12:22:37 +00:00
worksheet Slice S0380.13: widen cantilever gate to accept "House" descriptive form 2026-06-01 16:28:46 +00:00
__init__.py refactor: lift-and-shift packages/domain/src/domain/sap → domain/sap10_calculator 2026-05-26 12:22:37 +00:00
calculator.py refactor: lift-and-shift packages/domain/src/domain/sap → domain/sap10_calculator 2026-05-26 12:22:37 +00:00
README.md refactor: move docs/sap-spec/ contents into domain/sap10_calculator/ 2026-05-26 13:17:18 +00:00

SAP calculation domain

Per-section worksheet calculators for SAP 10.2 / RdSAP 10. Each file mirrors a numbered section of the spec; tests live alongside under worksheet/tests/ and tests/.

sap10_calculator/
├── calculator.py            # top-level orchestrator → SapResult
├── worksheet/
│   ├── dimensions.py         # §1 Overall dwelling dimensions
│   ├── ventilation.py        # §2 Ventilation rate (+ RdSAP10 §4.1)
│   ├── heat_transmission.py  # §3 Heat losses & HLP
│   ├── ...                   # §4 onward
│   └── tests/
│       ├── _xlsx_loader.py
│       ├── _elmhurst_fixtures.py        # registry of Elmhurst conformance fixtures
│       ├── _elmhurst_worksheet_NNNNNN.py # one per worksheet pair
│       └── test_*.py
├── rdsap/                   # cert → SapInputs cascade (RdSAP10 §5)
└── tables/                  # Table U2 wind, Table 6 walls, Table 21 bridging, …

Spec references: domain/sap10_calculator/docs/specs/sap-10-2-full-specification-2025-03-14.pdf (SAP 10.2, the active target per ADR-0010), domain/sap10_calculator/docs/specs/RdSAP 10 Specification 10-06-2025.pdf (RdSAP cascade). Canonical worked example: 2026-05-19-17-18 RdSap10Worksheet.xlsx at repo root — loaded by _xlsx_loader.py.

Validation contract. Per [[feedback-zero-error-strict]] the 6 Elmhurst U985 fixtures are deterministic test vectors: every line ref of every output must pin against the U985 PDF at abs=1e-4. See worksheet/tests/test_section_cascade_pins.py (per-section line refs, 768 rating + 90 demand pins) and test_e2e_elmhurst_sap_score.py::test_sap_result_pin (top-level SapResult fields). Tolerances are never widened. Current state: 930/930 pins green. The public API + architecture overview lives in domain/sap10_calculator/docs/SAP_CALCULATOR.md.

Adding a new Elmhurst conformance fixture

Each Elmhurst fixture is a real-cert ground-truth: we encode the cert as EpcPropertyData, then assert our §1/§2/§3 output matches the lodged worksheet line-by-line. The fixtures act as a regression net for every cert-shape variation (RR, extension, party-wall code, sheltered sides, …) we've seen in the wild.

Input: one PDF pair per cert

The assessor exports two PDFs from Elmhurst's RdSAP tool:

  1. Summary_NNNNNN.pdf — the assessor's RdSAP Inputs form: property type, age band, dimensions, walls, roof, floors, windows, heating, ventilation. This is what we encode as EpcPropertyData.
  2. UXXX-XXXX-NNNNNN.pdf — the calculator's full worksheet output: every populated line ref (1a)..(486) for the Energy Rating, EPC Costs, and Improved Dwelling variants. The Energy Rating variant (the first section) is canonical for line-ref tests.

NNNNNN is the cert's Full RefNo — both PDFs must match. Always capture from the Energy Rating section, not EPC Costs (the latter uses slightly different wind speeds for the BEDF fuel-price calc).
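
A pre-flight check that both PDFs belong to the same cert could look like the sketch below. The helper names `full_ref_no` and `same_cert` are hypothetical, and the sketch assumes the Full RefNo is always the last six digits before the `.pdf` extension.

```python
import re
from pathlib import Path

# Assumption: the Full RefNo is the final run of six digits before ".pdf".
_REF_RE = re.compile(r"(\d{6})\.pdf$", re.IGNORECASE)


def full_ref_no(pdf_name: str) -> str:
    """Return the six-digit Full RefNo embedded in a PDF file name."""
    match = _REF_RE.search(Path(pdf_name).name)
    if match is None:
        raise ValueError(f"no six-digit RefNo in {pdf_name!r}")
    return match.group(1)


def same_cert(summary_pdf: str, worksheet_pdf: str) -> bool:
    """True when the Summary and worksheet PDFs carry the same Full RefNo."""
    return full_ref_no(summary_pdf) == full_ref_no(worksheet_pdf)
```

Running this before encoding a fixture catches a mismatched pair early, before any line refs are transcribed.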

Steps

  1. Drop a new fixture module at worksheet/tests/_elmhurst_worksheet_NNNNNN.py. Copy the closest existing fixture as a starting template:

    • 3-storey with room-in-roof → start from _elmhurst_worksheet_000487.py (RR + extension + alt wall) or _elmhurst_worksheet_000477.py (RR main-only)
    • 2-storey with extension(s) → _elmhurst_worksheet_000474.py (Main + 2 ext, no RR) or _elmhurst_worksheet_000480.py (Main + 1 ext, with RR)
  2. Mirror the Summary PDF into build_epc() — one SapBuildingPart per Main/Extension. Field-by-field correspondence; the docstring at the top of the fixture should call out the source PDF date and the cert's distinguishing features.

  3. Capture every populated worksheet line as LINE_NN_* module-level constants. The cascade pin test (test_section_cascade_pins.py) parametrizes over ALL_FIXTURES and asserts each line individually at abs=1e-4 against the actual <section>_from_cert(epc) output. Capture every line, scalar and monthly, all the way through §12 — the strict-pin sweep is the work in progress.

  4. Register the fixture in _elmhurst_fixtures.py: add the import and append the module to ALL_FIXTURES.

  5. Run the conformance tests:

    python -m pytest domain/sap10_calculator/worksheet/tests/ \
        -k elmhurst --no-cov -v
    

    Each fixture appears 3× (one parametrization per section); the pytest id is the cert ref number.
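
To make the strict-pin idea concrete, here is a minimal sketch of comparing a fixture's `LINE_*` constants against computed values at abs=1e-4. It is not the code in test_section_cascade_pins.py: `mismatched_lines`, the `computed` dict, and the stand-in fixture values are all hypothetical.

```python
import math
from types import SimpleNamespace

ABS_TOL = 1e-4  # the strict tolerance quoted in this README


def mismatched_lines(fixture, computed: dict) -> list[str]:
    """Return the names of LINE_* constants that differ from computed values."""
    bad = []
    for name, expected in vars(fixture).items():
        if not name.startswith("LINE_"):
            continue
        actual = computed[name]
        # Monthly lines are tuples; wrap scalars so both share one code path.
        exp_seq = expected if isinstance(expected, tuple) else (expected,)
        act_seq = actual if isinstance(actual, tuple) else (actual,)
        if len(exp_seq) != len(act_seq) or not all(
            math.isclose(e, a, rel_tol=0.0, abs_tol=ABS_TOL)
            for e, a in zip(exp_seq, act_seq)
        ):
            bad.append(name)
    return bad


# Hypothetical stand-in for a fixture module; the values are made up.
fixture = SimpleNamespace(LINE_4_TFA_M2=100.0, LINE_5_VOLUME_M3=250.0)
computed = {"LINE_4_TFA_M2": 100.00005, "LINE_5_VOLUME_M3": 250.1}
print(mismatched_lines(fixture, computed))  # -> ['LINE_5_VOLUME_M3']
```

Reporting each failing line by name mirrors the per-line assertions described in step 3, where a single drifting line ref should be visible on its own.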

Mapping the Summary PDF to EpcPropertyData

| Summary field | EpcPropertyData location | Notes |
|---|---|---|
| Property type | epc.property_type via make_minimal_sap10_epc(...) | drives mid/end/detached defaults |
| Date Built (per part) | SapBuildingPart.construction_age_band | one-letter A..M |
| Storeys | NOT a stored field — sum across sap_floor_dimensions + 1 if RR | §2 (9) uses dwelling height, not Σ across parts (LINE_9_STOREYS captures this) |
| Floor Area / Room Height / Heat Loss Wall Perimeter / Party Wall Length | one SapFloorDimension per storey of the part | see Storey height convention below |
| Walls.Type | wall_construction | 3=solid brick, 4=cavity, 5=timber frame, 6=system built |
| Walls.Insulation | wall_insulation_type | 4=as-built; 2=filled cavity |
| Party Wall Type | party_wall_construction | see Party wall U mapping below |
| Roof.Type/Insulation/Thickness | top-level epc.roofs[0] EnergyElement | RdSAP cascade reads description string |
| Floors.Type/Insulation | top-level epc.floors[0] | similar pattern |
| Rooms in Roof block | SapBuildingPart.sap_room_in_roof = SapRoomInRoof(floor_area=...) | see Room-in-roof handling |
| Total Number of Doors | door_count= on make_minimal_sap10_epc | |
| Windows table (each W×H + area) | one SapWindow per row in epc.sap_windows, with per-window u_value lodged when the cert names a U-value (mixed-glazing fixtures need this for the per-window curtain-resistance transform — slice 22) | make_window(..., u_value=...) is the canonical helper |
| Intermittent fans | fixture constant INTERMITTENT_FANS | consumed by §2 test |
| Draught Lobby / Draught Proofing % | fixture constants HAS_DRAUGHT_LOBBY, WINDOW_PCT_DRAUGHT_PROOFED | |
| Sheltered Sides | fixture constant LINE_19_SHELTERED_SIDES | also asserted |
| Mechanical Ventilation | fixture constant MV_KIND | default MechanicalVentilationKind.NATURAL |

Worksheet lines to capture

From the Energy Rating section's 1. Overall dwelling characteristics:

  • LINE_4_TFA_M2 ← line (4) Total floor area
  • LINE_5_VOLUME_M3 ← line (5) Dwelling volume

From 2. Ventilation rate:

  • Scalars: LINE_8 through LINE_21 — every (N) line, including the pressure-test override (18) and shelter (19)/(20)/(21)
  • Monthly tuples: LINE_22_WIND_SPEED_M_S, LINE_22A_WIND_FACTOR, LINE_22B_WIND_ADJUSTED_ACH, LINE_25_EFFECTIVE_ACH — twelve floats Jan..Dec

From 3. Heat losses and heat loss parameter:

  • LINE_31_TOTAL_EXTERNAL_AREA_M2 ← line (31) Σ A external elements (excludes party wall)
  • LINE_33_FABRIC_HEAT_LOSS_W_PER_K ← line (33) Σ (A × U) without bridging
  • LINE_36_THERMAL_BRIDGING_W_PER_K ← line (36) = y × (31)
  • LINE_37_TOTAL_FABRIC_HEAT_LOSS_W_PER_K ← line (37) = (33) + (36)
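
The arithmetic linking these four lines can be sketched as follows. The function name `section_3_aggregates` is hypothetical and the inputs in the usage lines are illustrative, not taken from any fixture.

```python
def section_3_aggregates(
    total_external_area_m2: float,    # line (31)
    fabric_heat_loss_w_per_k: float,  # line (33)
    y: float,                         # thermal bridging factor, W/m²K
) -> tuple[float, float]:
    """Return (line 36, line 37) from lines (31), (33) and the y factor."""
    thermal_bridging = y * total_external_area_m2          # (36) = y × (31)
    total = fabric_heat_loss_w_per_k + thermal_bridging    # (37) = (33) + (36)
    return thermal_bridging, total


# Illustrative inputs only.
line_36, line_37 = section_3_aggregates(200.0, 150.0, 0.25)
```

This is handy as a sanity check on a transcribed worksheet: if (37) − (33) does not equal y × (31), one of the captured constants is probably mistyped.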

All four §3 aggregates are now pinned by test_section_cascade_pins.py::test_section_3_line_refs_match_pdf at abs=1e-4. RR detailed surfaces lodged via SapRoomInRoof.detailed_surfaces (slices 13–23) close the room-in-roof breakdown end-to-end for every fixture with detailed §3.10 lodgement (000477, 000480, 000516; 000487 still has the U=0.86 external-gable variant pending spec input).

Gotchas

Storey height convention (SapFloorDimension.room_height_m)

The worksheet's (2x) height column includes a +0.25 m floor-structure allowance on every storey above the lowest:

  • floor=0 (lowest): internal room height as measured
  • floor=1 / floor=2 / …: internal room height + 0.25

So a 2.91 m upper-storey internal height appears on the worksheet as 3.16 m. Mirror the worksheet number into the fixture, not the surveyor's tape measurement.
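
A small helper makes the convention explicit when converting a surveyor's measurement into the value to store. `worksheet_storey_height` is a hypothetical name, not a function in the repo.

```python
FLOOR_STRUCTURE_ALLOWANCE_M = 0.25  # added to every storey above the lowest


def worksheet_storey_height(floor: int, internal_height_m: float) -> float:
    """Return the height as it appears in the worksheet's (2x) column."""
    if floor < 0:
        raise ValueError("floor index must be 0 (lowest) or higher")
    if floor == 0:
        return internal_height_m  # lowest storey: as measured
    return internal_height_m + FLOOR_STRUCTURE_ALLOWANCE_M


# The example from the text: a 2.91 m upper storey shows as 3.16 m.
upper = worksheet_storey_height(1, 2.91)
```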

Room-in-roof

  • §1 RdSAP 2.45 m storey-height convention is hardcoded in dimensions.py regardless of any height the RR cert input claims. The worksheet line (2d) for an RR storey shows 2.45.
  • We encode it as SapBuildingPart.sap_room_in_roof = SapRoomInRoof(floor_area=..., detailed_surfaces=[...]), NOT as a third SapFloorDimension. The dimensions calculator treats the RR as +1 storey, +floor_area to TFA, +floor_area × 2.45 to volume.
  • §3.10 Detailed RR is implemented (slices 13, 16, 23). SapRoomInRoofSurface carries kind ∈ {slope, flat_ceiling, stud_wall, gable_wall}, area_m2, optional insulation_thickness_mm + insulation_type. Slope/flat_ceiling/stud_wall route to roof per Table 17; gable_wall routes to party at U=0.25 per Table 4 "as common wall". The U=0.86 "external gable" variant (000487) is NOT yet implemented — open ticket.
  • Simplified Type 1 (RR lodged with only floor_area) still works via the spec's A_RR = 12.5 × √(A_RR_floor/1.5) formula at u_rr_default_all_elements (Table 18 col 4). Detailed lodgement supersedes when present.
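
The two pieces of room-in-roof arithmetic above can be sketched as below. Both function names are hypothetical, and the code only restates the formulas quoted in the bullets rather than reproducing dimensions.py.

```python
import math

RR_STOREY_HEIGHT_M = 2.45  # RdSAP convention quoted above


def add_room_in_roof(
    storeys: int, tfa_m2: float, volume_m3: float, rr_floor_area_m2: float
) -> tuple[int, float, float]:
    """Apply the RR contribution: +1 storey, +area to TFA, +area × 2.45 to volume."""
    return (
        storeys + 1,
        tfa_m2 + rr_floor_area_m2,
        volume_m3 + rr_floor_area_m2 * RR_STOREY_HEIGHT_M,
    )


def simplified_rr_area_m2(rr_floor_area_m2: float) -> float:
    """Simplified Type 1 envelope area: A_RR = 12.5 × √(A_RR_floor / 1.5)."""
    return 12.5 * math.sqrt(rr_floor_area_m2 / 1.5)
```

For example, a 6 m² RR floor gives 12.5 × √4 = 25 m² of simplified envelope area.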

Party wall U mapping

party_wall_construction integer codes resolve via domain.sap10_ml.rdsap_uvalues.u_party_wall:

  • 0 (Unknown / "Unable to determine") → 0.25 W/m²K
  • 1 (Stone granite) / 3 (Solid brick) / 5 (Timber frame) / 6 (System built) → 0.0
  • 4 (Cavity, unfilled) → 0.5

Cross-check against the worksheet's Party walls Main row in §3 — that's the authoritative U for the cert.
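
As a quick reference, the mapping listed above can be restated as a lookup table. This is a sketch of the documented values only, not the body of `u_party_wall`; `PARTY_WALL_U` and `party_wall_u` are hypothetical names, and codes not listed above are rejected rather than guessed.

```python
# U-values in W/m²K, keyed by party_wall_construction code (as listed above).
PARTY_WALL_U = {
    0: 0.25,  # Unknown / "Unable to determine"
    1: 0.0,   # Stone granite
    3: 0.0,   # Solid brick
    4: 0.5,   # Cavity, unfilled
    5: 0.0,   # Timber frame
    6: 0.0,   # System built
}


def party_wall_u(code: int) -> float:
    """Look up the party-wall U-value; unknown codes raise instead of defaulting."""
    try:
        return PARTY_WALL_U[code]
    except KeyError:
        raise ValueError(f"undocumented party_wall_construction code: {code}") from None
```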

Sheltered sides drives shelter factor

(19) varies per cert and the chain (20) = 1 - 0.075 × (19), (21) = (18) × (20) propagates through every monthly (22b)/(25). Read straight from the cert's Sheltered Sides field; not derivable from property type alone.
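
The chain from (19) to (21) is short enough to write out. The function names are hypothetical; the formulas are the ones quoted above.

```python
def shelter_factor(sheltered_sides: int) -> float:
    """Line (20) = 1 - 0.075 × (19)."""
    return 1.0 - 0.075 * sheltered_sides


def shelter_adjusted_infiltration(line_18: float, sheltered_sides: int) -> float:
    """Line (21) = (18) × (20)."""
    return line_18 * shelter_factor(sheltered_sides)
```

With two sheltered sides the factor is 0.85, so every monthly (22b) and (25) downstream scales with that 15% reduction in (21).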

(12) suspended-timber-floor quirk

Some Elmhurst certs list a suspended timber floor on the inputs but lodge (12) = 0.0 in the worksheet. Mirror the worksheet, not the cert input: set HAS_SUSPENDED_TIMBER_FLOOR=False to get (12)=0. The SUSPENDED_TIMBER_FLOOR_SEALED flag only switches between 0.2 (unsealed) and 0.1 (sealed); it does not zero out the contribution. The =True/=False mapping in ventilation.py:185:

| has_suspended_timber_floor | ..._sealed | resulting (12) |
|---|---|---|
| False | (any) | 0.0 |
| True | False | 0.2 |
| True | True | 0.1 |
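
The same mapping as a function, for readers who prefer code to a truth table. `line_12` is a hypothetical name and this is a restatement of the table, not a copy of ventilation.py.

```python
def line_12(has_suspended_timber_floor: bool, sealed: bool) -> float:
    """Return the (12) infiltration contribution for a suspended timber floor."""
    if not has_suspended_timber_floor:
        return 0.0  # the sealed flag is irrelevant without the floor flag
    return 0.1 if sealed else 0.2
```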

Effective monthly ACH (25) formula

(25) differs from (22b) whenever (22b) < 1.0:

(25) = (22b)                              if (22b) ≥ 1.0
(25) = 0.5 + (22b)² × 0.5                  otherwise

Don't try to compute it — read both (22b) and (25) straight off the worksheet and assert on both. The formula's here just so you recognise why they differ on tightly-sealed homes.
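
For recognition purposes only, the piecewise formula reads as follows in code. `effective_ach` is a hypothetical helper; fixtures should still carry the worksheet's own (22b) and (25) values.

```python
def effective_ach(line_22b: float) -> float:
    """Line (25) for one month, from the wind-adjusted rate (22b)."""
    if line_22b >= 1.0:
        return line_22b
    return 0.5 + line_22b**2 * 0.5


# A tightly sealed home: (22b) = 0.6 gives (25) = 0.5 + 0.36 × 0.5 = 0.68.
tight = effective_ach(0.6)
```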

Wind speeds: Energy Rating vs EPC Costs

The same cert prints two Wind speed (22) tables — one in CALCULATION OF ENERGY RATING, one in CALCULATION OF EPC COSTS, EMISSIONS AND PRIMARY ENERGY. They differ (the latter is the BEDF-prices variant). Always capture from the Energy Rating section; that's what ventilation_from_inputs(...) calibrates against. The non-regional Table U2 default values are 5.1, 5.0, 4.9, 4.4, 4.3, 3.8, 3.8, 3.7, 4.0, 4.3, 4.5, 4.7.
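
A sketch of how the monthly wind lines relate, using the default Table U2 values quoted above. The ÷ 4 step for (22a) and the (21) × (22a) step for (22b) follow the SAP worksheet as we understand it; the function names are hypothetical and the result is no substitute for the values printed in the Energy Rating section.

```python
# Non-regional Table U2 defaults, Jan..Dec, in m/s (from the text above).
TABLE_U2_DEFAULT_WIND_M_S = (
    5.1, 5.0, 4.9, 4.4, 4.3, 3.8, 3.8, 3.7, 4.0, 4.3, 4.5, 4.7,
)


def wind_factors(wind_m_s: tuple[float, ...]) -> tuple[float, ...]:
    """Line (22a): monthly wind factor = (22) / 4."""
    return tuple(v / 4.0 for v in wind_m_s)


def wind_adjusted_ach(line_21: float, wind_m_s: tuple[float, ...]) -> tuple[float, ...]:
    """Line (22b): monthly rate = (21) × (22a)."""
    return tuple(line_21 * f for f in wind_factors(wind_m_s))
```

If a captured LINE_22A_WIND_FACTOR tuple is not LINE_22_WIND_SPEED_M_S divided by 4 month for month, the wind table was probably copied from the EPC Costs section by mistake.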