Commit graph

2435 commits

Author SHA1 Message Date
KhalimCK
f7f74ea72b
Merge pull request #1288 from Hestia-Homes/feature/per-cert-mapper-validation
Feature/per cert mapper validation
2026-06-24 12:04:00 +01:00
Khalim Conn-Kowlessar
4e2f2bdcc7 test(worksheet): pin simulated case 50 — MVHR + dual-immersion all-electric
Adds the mapper-driven e2e cascade pin for "simulated case 50" (000565 semi,
electric storage main SAP 402 + portable electric secondary + MVHR + whc-903
DUAL electric immersion + 160 L cylinder, Economy-7). Routes the Summary PDF
through extractor + mapper + calculator like the other 000565 fixtures.

Locks in two off-peak fixes this case ground-truthed:
- the Table 13 HW high/low split applied to CO2/PE (commit 39ae2cf0), and
- the Table 12a Grid 2 MVHR fan fraction 0.71/0.58 (commit cd5113ab).

All 11 SAP-result fields reconcile to the U985 worksheet EXACTLY, including
the (272) rating CO2 2397.1237 — SAP 38.8426 (=39), cost £1317.0116, water
1668.0788 kWh, fans 315.6384 kWh.

Summary mirrored to the tracked fixtures dir so the test doesn't depend on
the unstaged `sap worksheets/` workspace.

pyright strict gate not run locally (pyright not installed in this container).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 09:07:16 +00:00
Khalim Conn-Kowlessar
f3e3494bf7 test(worksheet): pin simulated case 52 — regular gas boiler + cylinder
Adds the mapper-driven e2e cascade pin for "simulated case 52" (000565
semi + regular non-combi mains-gas boiler SAP 102 + 160 L foam cylinder
heated from the main, no cylinder stat, uninsulated primary pipework,
standard tariff). Routes the Summary PDF through extractor + mapper +
calculator like the other 000565 / 001431_case* fixtures.

This closes the last untested branch of the cylinder/water chain: the
SAP 10.2 §4 cylinder storage loss (Table 2/2a/2b lines 51-55) + the
Table 3 PRIMARY circuit loss (59, uninsulated pipework + no stat) that
combi/immersion fixtures don't reach. All 11 SAP-result fields reconcile
to the U985 worksheet EXACTLY with no calculator change — SAP 57.2904
(=57), cost £911.1973, water 3929.7635 kWh — confirming the cylinder-loss
derivation is correct.

Summary mirrored to the tracked fixtures dir so the test doesn't depend
on the unstaged `sap worksheets/` workspace.

pyright strict gate not run locally (pyright not installed in this container).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-24 08:50:19 +00:00
Jun-te Kim
8f0432721c test(accuracy): pin RdSAP-20.0.0 PV semi uprn_22086693 (PV-list fix corpus)
Corpus validation of the modelling_e2e photovoltaic_supply-as-list fix. Cert
6102-6227-8000-0083-2292 (RdSAP-20.0.0 semi, gas combi + 2× 1.14 kW PV arrays)
crashed from_rdsap_schema_20_0_0 on the measured-array list; the fix routes it
through the dict-tolerant _map_schema_21_pv. PV correctly credited: engine 61
(no PV) → 66 (+5). Built in Elmhurst (evidence: epc.json + summary + worksheet,
fabric+heating; the PV "New Technologies" Panel-details grid deferred): worksheet
55 = engine-on-Elmhurst-inputs 55 exactly → calculator faithful. The +6 engine-vs-
Elmhurst base-dwelling residual is the documented RdSAP-default gap (band-C cavity-
uninsulated suspended-floor semi). Pinned engine 66.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 16:13:34 +00:00
Jun-te Kim
75341da42d 32 2026-06-23 15:38:38 +00:00
Jun-te Kim
17b1d63f0e test(accuracy): pin SAP-16.0 storage-flat uprn_10070004512 (built_form fix corpus)
Corpus validation of the modelling_e2e built_form fix. Cert 8742-6624-9300-2780-4926
(SAP-Schema-16.0, ground-floor electric-storage-heater flat) omits built_form; the
mapper now derives it from dwelling_type. built_form is ML-only so the fix is
SAP-neutral: engine 66 = lodged 66 exactly. Built in Elmhurst (evidence: epc.json +
summary + worksheet): worksheet 54, engine-on-Elmhurst-inputs 53 ≈ 54 → calculator
faithful. The +12 engine-vs-Elmhurst is a build/input gap (cert size-1 small cylinder
unrepresentable in Elmhurst's Normal/110L-minimum entry → higher HW + reduced-field
16.0 defaults). Pinned engine 66.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-23 15:35:06 +00:00
Jun-te Kim
00af7b5a54 data types 2026-06-23 12:42:53 +00:00
Jun-te Kim
210ca6397f updated sap schema to take in inputs 2026-06-23 10:22:40 +00:00
Jun-te Kim
9c89a0e680 neighbouring properties added 2026-06-22 14:38:00 +00:00
Jun-te Kim
3044c70202 sap score and elmhurst mapper optimisation 2026-06-20 07:25:42 +00:00
Jun-te Kim
37b0a38425 add more test cases 2026-06-19 10:59:51 +00:00
Jun-te Kim
e6a829aaea more examples 2026-06-19 09:51:49 +00:00
Jun-te Kim
269a7fdaa7 Merge branch 'main' of https://github.com/Hestia-Homes/Model into feature/hyde_make_it_more_accurate_with_tests 2026-06-18 16:24:26 +00:00
Khalim Conn-Kowlessar
a7de8c5c35 Merge remote-tracking branch 'origin/main' into feature/per-cert-mapper-validation 2026-06-18 15:00:32 +00:00
Khalim Conn-Kowlessar
fc7c4d2d3b fix(climate): compute EPC CO2/PE on the postcode demand cascade (SAP 10.2 Appendix U p.124)
The SAP/EI rating is computed on UK-average weather (Appendix U Tables
U1-U3 region 0) so ratings are nationally comparable, but Appendix U
paragraph 1 (PDF p.124) requires that "other calculations (such as for
energy use and costs on EPCs) are done using local weather. Weather data
for each postcode district are taken from the PCDB". `Sap10Calculator.
calculate` ran ONE cascade (UK-average) and fed it to SAP, CO2 AND primary
energy, so every cert's EPC-displayed CO2/PE were computed on the wrong
climate. Because most of England is warmer than the UK-average, this
systematically OVER-counted heating demand on the emissions/PE outputs.

The two cascades (`cert_to_inputs` rating, `cert_to_demand_inputs`
postcode) already existed; this wires the demand cascade into the
production entry point and grafts its CO2/PE onto the rating result (SAP
unchanged). The corpus gauge's longstanding +5% CO2/PE over-estimate was
mostly this climate bug, NOT (as previously diagnosed) per-cert mapper
fidelity:
  CO2 MAE 0.26 -> 0.12 t/yr  (bias +0.18 -> +0.04)
  PE  MAE 13.6 -> 3.8 kWh/m2 (bias +9.0  -> +0.24)
  SAP within-0.5 = 69.7% (rating cascade, unchanged)

Worksheet-validated to 1e-4 on simulated case 45 (heat-pump ground-floor
flat, postcode W6): the P960 prints the current dwelling twice — Block 1
on UK-average weather (SAP 60.5318, CO2 692.13) and Block 2 on postcode
weather (CO2 626.78, PE 6581.59). Both reproduce exactly. Added a tracked
case-45 Summary fixture + two-cascade cascade pin as a permanent guard,
and ratcheted the corpus CO2/PE ceilings to 0.13 / 4.2. The e2e Elmhurst
suite (Block-1 line refs) now pins the rating cascade directly; the two
Vaillant overlay snapshots refreshed to demand-cascade CO2/PE.

pyright not installed in this codespace (strict gate not run locally);
change is type-trivial (dataclasses.replace over SapResult).
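
The graft described above can be sketched as follows. This is a minimal illustration, not the repository's code: the three-field `SapResult`, its field names, and the helper `graft_demand_outputs` are hypothetical. Only the use of `dataclasses.replace` over a `SapResult` is taken from the note above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SapResult:
    # Hypothetical minimal shape; the real SapResult is assumed to carry more fields.
    sap: float
    co2_kg: float
    primary_energy_kwh: float


def graft_demand_outputs(rating: SapResult, demand: SapResult) -> SapResult:
    """Keep the rating cascade's SAP; take CO2 and PE from the demand cascade."""
    return replace(
        rating,
        co2_kg=demand.co2_kg,
        primary_energy_kwh=demand.primary_energy_kwh,
    )


# Case-45 figures quoted in this message. The two 0.0 values are placeholders,
# because the message does not quote the rating-cascade PE or the demand-cascade SAP.
rating = SapResult(sap=60.5318, co2_kg=692.13, primary_energy_kwh=0.0)
demand = SapResult(sap=0.0, co2_kg=626.78, primary_energy_kwh=6581.59)
result = graft_demand_outputs(rating, demand)
print(result)
```

Because the rating result is copied rather than mutated, the SAP value cannot drift when the demand cascade changes.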

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-18 14:15:34 +00:00
Jun-te Kim
f0411b2cf1 Merge https://github.com/Hestia-Homes/Model into feature/hyde_make_it_more_accurate_with_tests 2026-06-18 10:33:27 +00:00
Jun-te Kim
a955a09e9c Pin uprn_10093116330 (full-SAP gas-combi 2-storey semi): engine 82 vs Elmhurst 78
7th sibling full-SAP cert; documented full-SAP→RdSAP +4 residual. Build clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 10:26:11 +00:00
Jun-te Kim
d08c35ee03 Merge branch 'feature/hyde_make_it_more_accurate_with_tests' into feature/landlord-overrides 2026-06-17 09:43:28 +00:00
Jun-te Kim
e519de26a4 Pin uprn_10093116336 (full-SAP gas-combi 2-storey semi): engine 83 vs Elmhurst 79
6th sibling full-SAP cert; same documented full-SAP→RdSAP +4 residual. Build clean.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 09:41:29 +00:00
Jun-te Kim
317220beba Pin uprn_10093116334 (full-SAP gas-combi bungalow): engine 81 vs Elmhurst 77
5th sibling full-SAP cert validated against Elmhurst (semi-detached bungalow,
same street/boiler PCDB 17505 as 10093116324). Engine 81 (lodged 82); Elmhurst
worksheet 77. The +4 is the documented full-SAP→RdSAP residual. Build verified
clean (storeys=1, no phantom conservatory). Worklist strategy note added.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 09:32:57 +00:00
Jun-te Kim
f226570b0f upgraded python version 2026-06-17 09:28:24 +00:00
Jun-te Kim
3eb0022034 Pin uprn_10093116324 (full-SAP gas-combi bungalow): engine 79 vs Elmhurst 74
4th sibling full-SAP cert validated against Elmhurst. Engine 79 (lodged 80);
Elmhurst worksheet 74. The +5 is the documented full-SAP→RdSAP residual — engine
uses the cert's measured U-values (wall 0.19/floor 0.12/roof 0.12) + PCDB combi
17505 (88.5%); Elmhurst uses RdSAP band-L defaults + generic 84% BGW combi. Build
verified clean (single-storey bungalow, no phantom conservatory, TFA 52/51.9).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-17 09:10:36 +00:00
Jun-te Kim
d87718f316 Merge remote-tracking branch 'origin/main' into feature/hyde_make_it_more_accurate_with_tests
# Conflicts:
#	datatypes/epc/domain/mapper.py
2026-06-17 09:05:37 +00:00
Khalim Conn-Kowlessar
fe3bf4eaed fix(ventilation): read Blower Door AP50 pressure test (Summary)
SAP 10.2 §2 (17)-(18): a measured/design air permeability at 50 Pa from a
Blower Door test routes infiltration via `(18) = AP50/20 + (8)`, in
preference to the components-based (16) estimate. The Elmhurst extractor
read only the AP4 ("Pulse") column of §12.2, so a Blower Door result
(§12.2 "Pressure Test Result (AP50)") fell through to the structural-
infiltration default — over-counting ventilation heat loss.

Surfaced by simulated case 44 (AP50 4.50): effective air change rate was
0.81 vs the worksheet's 0.58 (+38% ventilation loss). The cascade already
supports `air_permeability_ap50` (preferred over AP4); this wires the read
end to end (extractor → ElmhurstSiteNotes → SapVentilation → cert_to_inputs).

Pinned against the case-44 P960 §2 at abs=1e-4: (18) infiltration 0.3417
(= 4.5/20 + 0.1167) and (25) Jan effective ach 0.5812. Worksheet harness
stays 47/47 0-raised.
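
As a worked check of the `(18) = AP50/20 + (8)` route, here is a minimal sketch. The function and parameter names are hypothetical; the inputs are the case-44 figures quoted above.

```python
def infiltration_from_ap50(ap50: float, line_8: float) -> float:
    """Worksheet line (18) for a Blower Door result: AP50 / 20 + line (8)."""
    return ap50 / 20.0 + line_8


# Case 44: AP50 = 4.50 and line (8) = 0.1167
line_18 = infiltration_from_ap50(4.5, 0.1167)
print(round(line_18, 4))  # 0.3417
```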

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 23:18:17 +00:00
Jun-te Kim
d41798f1c1 Add SAP-18.0.0 + SAP-16.0 schema coverage; autonomous-run triage findings
Schema coverage (datatypes/epc/domain/mapper.py):
- SAP-Schema-18.0.0: full-SAP shape ≡ 17.1 → from_sap_schema_17_1, no normalisation.
- SAP-Schema-16.0: same reduced-field 16.x path; default the omitted `tenure`
  field in _normalize_sap_schema_16_x (metadata; SAP cascade never reads it).
  Genuinely sparse 16.x certs (missing core fabric fields) still fail loud.
- Regression tests + sap_18_0_0.json / sap_16_0.json fixtures; 0 new pyright errors.

Autonomous triage of the worklist (scripts/hyde/autonomous_run_findings.md):
- Found + diagnosed 2 bugs (flagged, NOT fixed): (1) MAPPER — full-SAP openings
  lodged in mm read as m → multi-million-m2 windows → SAP clamps to 1 (uprn_
  10093117227 / 10090317693 / 10091636031); (2) CALCULATOR — database heat-pump
  fuel code 39 mis-priced as gas, over-rates ~14 (uprn_10093114053).
- Most certs map within +/-4 of lodged.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 20:04:43 +00:00
Jun-te Kim
0b32d9fcee Add SAP-16.3 + SAP-17.0 schema coverage (completes the e2e UPRN set)
- SAP-Schema-16.3: same reduced-field RdSAP shape as 16.2 — generalise the
  normaliser to _normalize_sap_schema_16_x and route both 16.2/16.3 through it.
  uprn_44012843 maps → SAP 79 (lodged 81).
- SAP-Schema-17.0: structurally identical to the full-SAP 17.1 schema (measured
  sap_opening_types), so it parses with the 17.1 dataclass and reuses
  from_sap_schema_17_1 with no normalisation. uprn_10023444324 → 80, uprn_
  10023444320 → 81.
- Regression tests (16.3 dispatch, 17.0 dispatch) + sap_16_3.json / sap_17_0.json
  fixtures; 0 new pyright errors. All 7 e2e UPRNs now map.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 19:24:50 +00:00
Jun-te Kim
36dc55b499 epc json 2026-06-16 18:54:07 +00:00
Jun-te Kim
4d14607e7e Add SAP-16.2 schema coverage + single-glazing fix; flat party-wall fix; pin 2 certs
SAP-Schema-16.2 (datatypes/epc/domain/mapper.py):
- 16.2 is structurally an RdSAP-17.1 cert under a different name; add
  _normalize_sap_schema_16_2 (field renames + defaults) and dispatch to the
  tested from_rdsap_schema_17_1 mapper. uprn_100020933699 maps → SAP 71.
- Honour a "Single glazed" windows description when multiple_glazing_type="ND"
  (was defaulting to double) → RdSAP-21 code 5; eng 72→71 (lodged 70).
- 4 regression tests + sap_16_2.json fixture; 0 new pyright errors.

Flat party-wall fix (domain/sap10_calculator/worksheet/heat_transmission.py):
- Full-SAP flats carry flatness in dwelling_type, not property_type, so the
  party-wall default fell through to the 0.25 house value instead of the RdSAP
  Table-15 flat 0.0. Add _is_flat_or_maisonette_dwelling fallback + regression
  test. uprn_10093116529 80→81 (matches the cert's lodged party u_value 0).
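
A rough sketch of the fallback logic, under stated assumptions: the helper names, the substring match, and the example label strings are all hypothetical, and the real `_is_flat_or_maisonette_dwelling` may test codes rather than text. Only the two default U-values (0.25 house, 0.0 flat) come from the description above.

```python
from typing import Optional

HOUSE_PARTY_WALL_U = 0.25  # house default quoted above
FLAT_PARTY_WALL_U = 0.0    # RdSAP Table-15 flat default quoted above


def is_flat_or_maisonette(label: Optional[str]) -> bool:
    """Assumed text check; the label format is not given in the source."""
    if not label:
        return False
    text = label.lower()
    return "flat" in text or "maisonette" in text


def party_wall_default_u(property_type: Optional[str], dwelling_type: Optional[str]) -> float:
    # Check property_type first, then fall back to dwelling_type.
    if is_flat_or_maisonette(property_type) or is_flat_or_maisonette(dwelling_type):
        return FLAT_PARTY_WALL_U
    return HOUSE_PARTY_WALL_U


print(party_wall_default_u(None, "Ground-floor flat"))       # 0.0
print(party_wall_default_u("House", "Semi-detached house"))  # 0.25
```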

Accuracy corpus pins (tests/domain/sap10_calculator/test_real_cert_sap_accuracy.py):
- uprn_10093116543 (SAP-17.1 gas-combi semi): engine 81 (Elmhurst 77; documented
  full-SAP→RdSAP residual — measured wall/floor U + PCDB boiler vs RdSAP defaults).
- uprn_10093116529 (SAP-17.1 g/f flat): engine 81 (Elmhurst 78).

devcontainer: add poppler-utils (pdfinfo) for the documents-parser PDF fixtures.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 18:53:00 +00:00
Khalim Conn-Kowlessar
d4d2b222fc feat(conservatory): §6.1 fabric cascade (27/27a/28a + TFA/volume)
Wire the non-separated conservatory into the §3 heat-transmission +
§1 dimensions cascade per RdSAP 10 §6.1 (PDF p.49) + Table 25 (p.51):

  "The floor area and volume of a non-separated conservatory are added to
   the total floor area and volume of the dwelling. Its roof area is taken
   as its floor area divided by cos(20°), and wall area is taken as the
   product of its exposed perimeter and its height. ... The conservatory
   walls and roof are taken as fully glazed ... Glazed walls are taken as
   windows, glazed roof as rooflight."

New `worksheet/conservatory.py` derives the geometry:
  - height from the equivalent storey count (§6.1: 1 storey → ground-floor
    room height; 1½ → ground + 0.25 + 0.5×first; etc.);
  - glazed WALL → window (27) at Table 25 U (double 3.1 / single 4.8) with
    the §3.2 curtain resistance (R=0.04) → U_eff 2.758;
  - glazed ROOF → rooflight (27a) at Table 25 roof U (double 3.4 / single
    5.3) + curtain → U_eff 2.993;
  - FLOOR → (28a) via BS EN ISO 13370 as an uninsulated SOLID ground floor
    with 300 mm walls (§5.12, spec p.43), exposed perimeter = glazed
    perimeter → U 0.89;
  - glazed wall + roof + floor areas join (31)/(36); the fully-glazed
    structure walls/roof add nothing (the glazing IS the window/rooflight).
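
The geometry and curtain adjustment listed above can be checked with a short sketch. The function names are hypothetical and the 2.4 m height is an assumed value for illustration; the Table 25 U-values, R=0.04, and the cos(20°) rule are the ones quoted in this message.

```python
import math

CURTAIN_RESISTANCE = 0.04  # m2K/W, the curtain resistance quoted above


def u_with_curtains(u_raw: float) -> float:
    """Effective U after adding the curtain resistance in series."""
    return 1.0 / (1.0 / u_raw + CURTAIN_RESISTANCE)


def conservatory_areas(floor_area: float, glazed_perimeter: float, height: float):
    """Roof area = floor area / cos(20 deg); wall area = perimeter x height."""
    roof_area = floor_area / math.cos(math.radians(20.0))
    wall_area = glazed_perimeter * height
    return roof_area, wall_area


print(round(u_with_curtains(3.1), 3))  # 2.758 (double-glazed wall)
print(round(u_with_curtains(3.4), 3))  # 2.993 (double-glazed roof)

# 12.0 m2 floor and 9.0 m perimeter; the 2.4 m height is assumed, not from the source.
roof_area, wall_area = conservatory_areas(12.0, 9.0, 2.4)
```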

`dimensions_from_cert` adds the conservatory floor area to TFA (4) and
floor area × height to volume (5) (feeds ventilation (8)), without making
it a storey (avg storey height for §2 infiltration is unchanged).

Pinned against the simulated case-44 P960 §3 at abs=1e-4 — every line ref
EXACT: (4) 95.3800, (5) 257.1630, (27) 96.1169, (27a) 38.2201, (28a)
21.4164, (29a) 35.5852, (30) 7.4688, (31) 294.2900, (33) 207.3274,
(36) 23.5432. The remaining whole-dwelling SAP/CO2 gap is the §6 solar
gains, closed in the next slice. Worksheet harness stays 47/47 0-raised.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 15:59:26 +00:00
Khalim Conn-Kowlessar
fa131cca0b feat(conservatory): read §6.1 geometry through extractor + mapper
RdSAP 10 §6.1 (PDF p.49) models a non-separated (heated) conservatory as
part of the dwelling. Until now the Summary §5 block was reduced to an
inert `has_conservatory` bool and the geometry (floor area, glazed
perimeter, glazing, storey height) was dropped on both paths.

Plumbing only — no cascade consumer yet (Slices B/C/D wire §3/§6):
- ElmhurstSiteNotesExtractor reads the §5 Conservatory block into a new
  `Conservatory` site-notes record (scoped to §5 so the generic
  "Floor Area"/"Room Height" labels can't collide with §4 dimensions);
- domain gains a frozen `SapConservatory` (floor area, glazed perimeter,
  double/single glazing, thermally-separated guard, equivalent storey
  count) on `EpcPropertyData.sap_conservatory`;
- the Elmhurst mapper threads it through, dropping SEPARATED
  conservatories per §6.2 ("A separated conservatory ... is disregarded").

Verified against the simulated case-44 Summary (RefNo 001431): extracts
floor_area=12.0, glazed_perimeter=9.0, double_glazed=True, 1 storey.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 15:37:05 +00:00
Jun-te Kim
928fbbc33a Merge remote-tracking branch 'origin/main' into feature/hyde_make_it_more_accurate_with_tests
# Conflicts:
#	applications/sharepoint_renamer/handler.py
#	domain/sap10_calculator/worksheet/heat_transmission.py
2026-06-16 15:23:52 +00:00
Jun-te Kim
2f0eb49eee Checkpoint: UPRN 10093116543 Elmhurst build + devcontainer VNC/Playwright + perms
- Add SAP-accuracy sample for uprn_10093116543 (epc.json, elmhurst_inputs.md,
  summary/worksheet PDFs)
- Persist hyde viewer stack (xvfb/fluxbox/x11vnc/novnc/websockify) and Playwright
  chromium in the backend devcontainer; forward noVNC 6080
- Broaden .claude/settings.local.json allowlist (display/python/grep/tail)
- In-progress campaign mapper/cert_to_inputs work carried from prior cert

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-16 15:21:56 +00:00
Jun-te Kim
dad3044740 save skills and automation progress 2026-06-16 12:17:43 +00:00
Khalim Conn-Kowlessar
419e340477 test(worksheet): pin simulated case 43 at 1e-4 (RR + dry-line + mixed roof)
Golden regression fixture for the multi-feature dwelling that surfaced the
two Elmhurst-extractor bugs in a33707f8. case 43 is a 2-BP mid-terrace with
a DETAILED room-in-roof (two slopes, two flat ceilings, party + exposed
gables, two common walls), a MIXED-insulation multi-section roof (Main
insulated + Extension uninsulated 2.30), a DRY-LINED extension solid wall,
a mains-gas boiler (102 / control 2106) and a House-coal solid-fuel
secondary (633).

Routes the Summary PDF through the WHOLE extractor + mapper + calculator
pipeline (no hand-built EpcPropertyData) and pins the §3 fabric + SAP-rating
block at abs=1e-4: (29a) walls 74.5800, (30) roof 38.5008, (33) fabric
172.7844, continuous SAP 73.2332 = (258), CO2 3518.3037 = (272). Guards the
detailed-RR slope/common_wall surfaces, the dry-lining R=0.17 adjustment,
and the per-part mixed-roof billing together. Summary mirrored to
backend/documents_parser/tests/fixtures/Summary_001431_case43.pdf; provider
module mirrors the _case6/_case21 pattern, assertion in
test_section_cascade_pins. Harness 47/47; regression = the 3 pre-existing
fails; pyright net-zero.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 08:26:05 +00:00
Khalim Conn-Kowlessar
a33707f851 fix(elmhurst): read main-wall dry-lining + fix last-RR-row U over-read
Two compensating Summary-extractor bugs surfaced by simulated case 43 (a
2-BP mid-terrace with a detailed room-in-roof + a dry-lined extension wall).
Their fabric errors nearly cancelled (walls net −0.76 W/K), hiding both
behind a deceptively small +0.05 SAP delta.

Bug 1 — main/extension wall dry-lining never read. The §7 "Dry-lining:
Yes/No" line was parsed only for ALTERNATIVE walls; the main/extension
WallDetails dropped it, so a dry-lined solid wall was billed at its
un-adjusted base U. RdSAP 10 §5.8 + Table 14: a dry-lined uninsulated wall
adds R=0.17 → U = 1/(1/U_base + 0.17). Case 43 Ext1: solid brick 1.70 →
1.32. Added `WallDetails.dry_lined`, read it in the extractor (both the
main-wall builder and the As-Main copy), threaded it to the domain
`wall_dry_lined` (emit None when not dry-lined, which is cascade-equivalent
to False and keeps the field absent for the non-dry-lined majority).
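
The dry-lining adjustment can be verified with a minimal sketch. The function name is hypothetical; the formula and the Ext1 figures are the ones quoted above.

```python
DRY_LINING_RESISTANCE = 0.17  # m2K/W, the added resistance quoted above


def dry_lined_u(u_base: float) -> float:
    """U = 1 / (1/U_base + 0.17) for a dry-lined uninsulated wall."""
    return 1.0 / (1.0 / u_base + DRY_LINING_RESISTANCE)


# Case 43 Ext1: solid brick at base U 1.70
ext1_u = dry_lined_u(1.70)
print(round(ext1_u, 2))  # 1.32
```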

Bug 2 — the LAST room-in-roof surface row's U over-read. The per-row token
scan stops at the next RIR-row name; the final surface (no successor) over-
read into the following section, shifting the trailing-token slotting and
silently zeroing its `default_u` (case 43 Common Wall 2: 1.90 → 0.00 → the
2.4 m² common wall billed at U=0 instead of the main-wall 1.90). Stop the
scan at the row's natural end — the "Yes"/"No" u_value_known flag plus the
trailing u_value numeric.

Case 43 now reproduces the P960 EXACTLY: (29a) walls 74.5800, (33) fabric
172.7844, continuous SAP 73.2332 = (258), CO2 3518.30 = (272), all <1e-4
(was SAP +0.0455 / CO2 −8.04). Harness 47/47 0 raised; regression = the 3
pre-existing fails; pyright net-zero (51=51).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 07:51:33 +00:00
Khalim Conn-Kowlessar
5d556faf86 fix(roof): bill at-rafters insulation on RdSAP 10 Table 16/18 column (2)
`u_roof` only implemented the joist column, so roofs lodged insulated at
rafters (`roof_insulation_location == 1`) were mis-billed at the joist U
on both the API and Summary paths — under-stating loss, over-rating SAP.

RdSAP 10 §5.11.2 Table 16 (spec p.42-43) gives a distinct "insulation at
rafters" column (2): the rafter cavity is shallower than a loft void, so
the same depth yields a higher U (200 mm: rafters 0.29 vs joists 0.21).
§5.11 Table 18 (p.45) likewise carries a rafters column (2) for unknown /
as-built thickness (footnote (1): "The value from the table applies for
unknown and as built") — band A-D = 2.30, E = 1.50, F = 0.68, diverging
from the joist column's 100 mm-equivalent 0.40 default (footnote (4)).

- add `_ROOF_RAFTERS_BY_THICKNESS` (Table 16 col 2) + `_ROOF_RAFTERS_BY_AGE`
  (Table 18 col 2) to rdsap_uvalues; `u_roof` selects them via a new
  `insulation_at_rafters` flag (ignored for flat / sloping-ceiling roofs).
- `heat_transmission` derives the flag PER BUILDING PART from
  `roof_insulation_location` (gov-API int 1 / Summary "R Rafters"), which
  also fixes the multi-part dedup-roof-join problem: each part's own
  location now drives its U, replacing the unattributable joined
  `epc.roofs[]` description.
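
A sketch of the column selection, assuming a simple dictionary lookup. The table names and `u_roof_sketch` are hypothetical, and the tables hold only the values quoted in this message, not the full Table 16/18 contents.

```python
from typing import Optional

# Partial tables: only the values quoted in this commit message.
RAFTERS_BY_THICKNESS_MM = {50: 0.88, 200: 0.29}
JOISTS_BY_THICKNESS_MM = {200: 0.21}
RAFTERS_BY_AGE_BAND = {"A": 2.30, "B": 2.30, "C": 2.30, "D": 2.30, "E": 1.50, "F": 0.68}


def u_roof_sketch(thickness_mm: Optional[int], age_band: str, insulation_at_rafters: bool) -> float:
    """Pick the rafters or joists column; unknown thickness uses the age-band default."""
    if insulation_at_rafters:
        if thickness_mm is None:
            return RAFTERS_BY_AGE_BAND[age_band]
        return RAFTERS_BY_THICKNESS_MM[thickness_mm]
    if thickness_mm is None:
        raise LookupError("joist-column age defaults are not reproduced in this sketch")
    return JOISTS_BY_THICKNESS_MM[thickness_mm]


print(u_roof_sketch(200, "F", True))    # 0.29
print(u_roof_sketch(200, "F", False))   # 0.21
print(u_roof_sketch(None, "F", True))   # 0.68
```

Deriving the flag per building part means each part calls the lookup with its own location, which is the behaviour the second bullet describes.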

Worksheet-validated to 1e-4: simulated case 41 (4-bp — Ext1 rafters 200mm
→ 0.29, Ext3 rafters As-Built band F → 0.68; roof total 24.8350) and case
42 (6 variants — rafters 50mm → 0.88, rafters unknown band C → 2.30,
joists/none unchanged). Case 40 stays exact (roof 35.340, total 441.1606);
worksheet harness 47/47.

Corpus within-0.5 66.9% → 66.5% (gates 0.65/1.08 hold) — a spec-correct
shift, NOT a regression: all 15 corpus rafter certs carry redacted (None)
thickness yet lodge roof EER 2-4 (insulated), so the open API blanked a
specified thickness and the spec's unknown-rafter 2.30 default correctly
over-states them. Recovery needs a roof-EER→thickness inference on the
API path (follow-up), not a change to the U-table.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-16 04:42:44 +00:00
Khalim Conn-Kowlessar
b2b6f8e954 fix(mapper): map Elmhurst "Value known" cylinder to measured volume (code 6)
The Elmhurst Summary §15.1 lodges "Cylinder Size: Value known" with the
measured volume in the "Cylinder Volume (l)" line — the Summary-path
equivalent of the gov-API "Exact" descriptor. The mapper had no entry for
"Value known" so `_elmhurst_cylinder_size_code` raised UnmappedElmhurstLabel,
and even once mapped the measured volume was never threaded through, so the
cascade dropped the cylinder storage loss (~468 kWh/yr) from (219) water
heating on every measured-volume-cylinder Summary.

Per RdSAP 10 §10.5 Table 28 (p.55) a measured cylinder volume is used
directly. Map "Value known" → cascade code 6 (Exact) and thread the §15.1
"Cylinder Volume (l)" value into SapHeating.cylinder_volume_measured_l, which
`_cylinder_volume_l_from_code` (cert_to_inputs.py:5281) already reads for
code 6 — mirroring the gov-API path (mapper.py:1575/1885).

Pins simulated case 39 (P960-0001-001431): an age-A mid-terrace on direct-
acting electric room heaters (SAP code 691, cat 10, control 2602) with
electric-immersion DHW off a 117 L "Value known" cylinder. The full
extractor→mapper→calculator cascade now reproduces the worksheet's SAP-rating
block EXACTLY — SAP value 36.6365 (band F) and (272) CO2 2056.0731 kg/yr,
with (219) water heating 2637.5049 and (255) total energy cost 1802.0039.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 23:57:25 +00:00
Jun-te Kim
5c11fd35c8 Validate SAP calculator vs Elmhurst; fix reduced-field window U; add accuracy harness
Reduced-field window U: heat_transmission derived the synthesised-window
raw U from u_window(all None) -> the 2.5 placeholder regardless of glazing.
Now routes the (uniform) glazing_type code through u_window (RdSAP Table 24)
so e.g. double pre-2002 reads 2.8, not 2.5. Only the pre-SAP10 reduced-field
path is affected (21.0.1 certs carry per-window U upstream) — the RdSAP-21.0.1
corpus gauge is unchanged at 66.9% within-0.5.

test_real_cert_sap_accuracy: pin uprn_10002468137 (RdSAP-17.1, all-electric
storage heaters) at SAP 61, validated against Elmhurst on identical inputs
(dual off-peak immersion, 110 L cylinder, 2 baths). Our engine reproduces
Elmhurst's fuel cost to the penny; lodged 55 is the old SAP-2012 schema.

Tooling to grow the accuracy corpus:
- scripts/fetch_real_life_epc_sample.py — capture a cert by UPRN into the corpus.
- scripts/compare_epc_paths.py — diff gov-API vs Elmhurst-summary EpcPropertyData
  and run both through the engine, localising mapper vs calculator differences.
- skill validate-cert-sap-accuracy — the end-to-end loop (capture -> Elmhurst
  inputs -> human builds -> compare -> reconcile -> pin in the test).
- skill epc-to-elmhurst-rdsap-inputs reference: corrected immersion (code 1=dual),
  cylinder size (code 2 = Normal/110 L), and bath-count (WWHRS sub-tab) mappings.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 15:26:11 +00:00
Jun-te Kim
125ff6f4dd Merge remote-tracking branch 'origin/main' into feature/hyde_make_it_more_accurate_with_tests
# Conflicts:
#	datatypes/epc/domain/mapper.py
2026-06-15 14:12:38 +00:00
Khalim Conn-Kowlessar
4fdc23f83d test(worksheet): pin simulated case 38 — mains-gas secondary reproduces worksheet exactly
The realistic re-generation of case 37 (code-117 gas boiler, control 2102,
+ a MAINS-GAS condensing gas-fire secondary code 611, vs case 37's biogas
605). The full extractor -> mapper -> calculator pipeline reproduces the
worksheet's SAP-rating block EXACTLY: continuous SAP 60.9152 (Δ 2e-5) and
(272) CO2 5801.0770 (Δ ~0). This confirms the boiler-efficiency /
control-2102 −5pp interlock / secondary-fuel handling are all correct, and
that case 37's +7 gap was purely the biogas sub-fuel the Summary export
cannot carry.

Summary mirrored into backend/documents_parser/tests/fixtures so the pin
runs without the unstaged workspace. PE not pinned — it is a separate
DPER block (different scope) already guarded by the corpus PE gauge.
Worksheet harness 47/47 unchanged; pyright net-zero.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-15 13:31:36 +00:00
Jun-te Kim
0079752eab investigation with hyde values 2026-06-15 12:13:11 +00:00
Khalim Conn-Kowlessar
5b2cf5edc7 Merge remote-tracking branch 'origin/main' into feature/per-cert-mapper-validation
# Conflicts:
#	datatypes/epc/domain/epc_property_data.py
#	datatypes/epc/domain/mapper.py
#	datatypes/epc/domain/tests/test_from_rdsap_schema.py
2026-06-13 22:20:15 +00:00
Jun-te Kim
1f40c3aeef fix engine dockerfile 2026-06-12 16:07:39 +00:00
Jun-te Kim
0159176772 python upgraded due to enum 2026-06-12 15:47:28 +00:00
Jun-te Kim
80ccec9b68 added floats helper 2026-06-12 14:28:41 +00:00
Jun-te Kim
a6123d762c Merge branch 'main' of https://github.com/Hestia-Homes/Model into feature/junte+khalim 2026-06-12 13:45:30 +00:00
Jun-te Kim
ff4a2e4242
Merge pull request #1198 from Hestia-Homes/feature/bill-derivation
Feature/bill derivation
2026-06-12 14:44:30 +01:00
Jun-te Kim
3995433816 Map RdSAP-Schema-17.0 certs to EpcPropertyData 🟥
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 12:40:04 +00:00
Jun-te Kim
5178197dc2 Map RdSAP-Schema-19.0 certs to EpcPropertyData 🟥
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-12 12:19:16 +00:00
Jun-te Kim
cfc337f04a Dispatch and map RdSAP-Schema-18.0 certs end-to-end 🟥
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-11 11:12:53 +00:00