diff --git a/.claude/skills/epc-to-elmhurst-rdsap-inputs/reference/mapping.md b/.claude/skills/epc-to-elmhurst-rdsap-inputs/reference/mapping.md index 1aab0298..245b357a 100644 --- a/.claude/skills/epc-to-elmhurst-rdsap-inputs/reference/mapping.md +++ b/.claude/skills/epc-to-elmhurst-rdsap-inputs/reference/mapping.md @@ -180,12 +180,23 @@ Table 32 unit costs, p/kWh (`domain/sap10_calculator/tables/table_32.py`): **`main_fuel_type` / `water_heating_fuel` 29 = off-peak (7-hour) electricity** → Elmhurst Electricity meter type = **Dual-rate / Economy 7 (7-hour)**, NOT Single. -⚠️ **Known over-rating bug:** the engine prices **100% of off-peak space heating -AND hot water at the 5.50p low rate** (`inputs.space_heating_fuel_cost_gbp_per_kwh` -= 0.055), instead of the SAP **Table 12a high/low split** (a portion at the 15.29p -high rate). This under-costs all-electric Economy-7 dwellings and inflates the SAP -score. Always surface this in the output's "Known divergences". Canonical case: -UPRN 10002468137 — lodged 55, engine 62. +✅ **Economy-7 high/low split — FIXED (PR #1217).** The engine now applies the SAP +**Table 12a Grid 1** (space) + **Table 13** (immersion DHW) high/low split rather +than pricing 100% at the 5.50p low rate. Electric STORAGE heaters legitimately get +a 0.00 SH high-rate fraction (100% low — spec value, not a bug); immersion HW takes +the cylinder-volume/occupancy/single-dual Table 13 blend (applied in +`cert_to_inputs._hot_water_fuel_cost_gbp_per_kwh` when volume + occupancy + +single/dual are all resolved; absent any of them it still falls back to 100% low — +a rarer edge). Verified: canonical UPRN 10002468137 engine 60.92 = Elmhurst 61 to +the penny (its lodged 55 is the OLD SAP-2012 schema, not comparable); UPRN +10022893721 engine 79 = lodged 79, Elmhurst (Dual meter) 81. 
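The high/low split described above comes down to a weighted average of the two tariff rates. A minimal sketch of that arithmetic follows, assuming the 15.29p high rate and 5.50p low rate quoted in this note. The function name `blended_price_gbp_per_kwh` is hypothetical and is not the engine's `_hot_water_fuel_cost_gbp_per_kwh`, and the 0.20 fraction is an illustrative placeholder rather than a Table 12a or Table 13 value.

```python
# Rates quoted in the note above for the 7-hour off-peak tariff (p/kWh).
HIGH_RATE_P = 15.29
LOW_RATE_P = 5.50


def blended_price_gbp_per_kwh(high_rate_fraction: float) -> float:
    """Weighted unit price in GBP/kWh for a given high-rate fraction.

    Hypothetical helper for illustration only: the real engine looks the
    fraction up from the SAP tables, which this sketch does not reproduce.
    """
    if not 0.0 <= high_rate_fraction <= 1.0:
        raise ValueError("high_rate_fraction must be between 0 and 1")
    pence = (
        high_rate_fraction * HIGH_RATE_P
        + (1.0 - high_rate_fraction) * LOW_RATE_P
    )
    return pence / 100.0  # pence -> pounds


# Storage heaters: the note gives a 0.00 high-rate fraction, so 100% low rate.
storage_heater_price = blended_price_gbp_per_kwh(0.0)

# Immersion hot water: 0.20 is a made-up fraction, NOT a Table 13 value.
illustrative_immersion_price = blended_price_gbp_per_kwh(0.20)

print(f"storage heaters: {storage_heater_price:.5f} GBP/kWh")
print(f"immersion (illustrative): {illustrative_immersion_price:.5f} GBP/kWh")
```

With a zero high-rate fraction the result collapses to the low rate, which is why the storage-heater figure matches the old 0.055 value even though the split is now applied. Any non-zero fraction raises the unit cost above it.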
+ +⚠️ **When building in Elmhurst you MUST set the Economy-7 meter** (`main_fuel_type` +/ `water_heating_fuel` 29 = off-peak 7-hour → Electricity meter type **Dual**, NOT +Single). Elmhurst silently defaults to Single/Standard and prices at the 13.19p +standard rate, collapsing the worksheet SAP ~13 points — which can masquerade as an +engine "over-rating". The control is a hidden Meters sub-tab on the SpaceHeating +page (`TabPanelMeters_RadioButtonListElectricityType`). ## Water Heating diff --git a/.claude/skills/expand-sap-accuracy-corpus/SKILL.md b/.claude/skills/expand-sap-accuracy-corpus/SKILL.md index 9c2d4062..1d12ed5b 100644 --- a/.claude/skills/expand-sap-accuracy-corpus/SKILL.md +++ b/.claude/skills/expand-sap-accuracy-corpus/SKILL.md @@ -19,6 +19,74 @@ score the property would get unchanged. Elmhurst is the accredited ground truth; its Input Summary (parsed back to `EpcPropertyData`) exposes mapper holes, and its worksheet exposes calculator holes. +## ⏩ Resume in a fresh context (autonomous run) + +If the user says "continue" / "next" / "keep going through the worklist", run this +loop **continuously without asking between certs** — report only `eng X / elm Y` +per cert and tick the worklist line as each is pinned. The build automation is +fully working (see `scripts/hyde/elmhurst_lib.py` helpers + the `build_.py` +templates); most certs build unattended end-to-end. + +**State (2026-06-19):** pinned this campaign — SAP-17.1 cohort + RdSAP-17.1/18.0 +(older) plus: **16.1** `100021943298` (76/75), **19.1.0** `10096028301` (82/82), +**16.3** `44012843` (79/78), **17.0** `10023444324` (80/80) + `10023444320` +(81/82), **RdSAP-20.0.0** `10090844932` (78/77), **16.2** `100090182288` (69/71, +semi house). 
Latest run (2026-06-19): **16.2** `100021985993` (74/72, end-terrace +bungalow), **17.1** `10091568921` (82/80, full-SAP end-terrace house combi 17615) ++ `10093718424` (81/80, semi sibling), **RdSAP-18.0** `10022893721` (79/81, first +NON-BOILER cert — electric storage heaters + immersion; storage-heater automation +now SOLVED incl. the Economy-7 Dual-meter step, see banked findings; engine 79 = +lodged, NO bug), **RdSAP-21.0.1** `10023443426` (76/79, native schema, combi house; +engine 76 = lodged EXACTLY; Elmhurst +3 = omitted-secondary build gap). Next `[ ]`: +`10093412452` (SAP-17.1), then `10090343335` (17.0) / `10093115480` (17.1) / +`68151071` (RdSAP-17.0). Skip `100020933699` (user said skip), `[⛔]` (NOT +MAPPABLE), `[⚠]` (flagged engine bugs: MVHR / heat-pump fuel-39). + +**Per-UPRN recipe** (all commands `DISPLAY=:99`, cwd `scripts/hyde`; run +`bash scripts/hyde/start_viewer.sh` once; creds in `.elmhurst-creds.json`; shared +assessment GUID `B44A0DB4-4C08-4241-B818-86F060172105`): +1. `PYTHONPATH=/workspaces/model python scripts/fetch_real_life_epc_sample.py ` + then scope: dwelling_type/built_form, age band, walls/roof/floor descriptions, + heating `main_heating_index_number`/category/`has_hot_water_cylinder`, window + total area (`sum(sap_windows w*h)`), party_wall_length, lighting %, MEV/AP50. +2. **Copy the closest `build_.py` template** and adjust values: + - combi flat → `build_100021943298.py`; regular-boiler+cylinder flat → + `build_44012843.py`; full-SAP combi flat (MEV+AP50) → `build_10096028301.py` + / `build_10023444324.py` (+party wall) / `build_10023444320.py` (mid-floor); + - combi house → `build_10090844932.py` (end-terrace, party wall) / + `build_100090182288.py` (semi, no party wall). + Adjust: property type/built-form, band (`_pick` by year, e.g. "1950"/"2012"/ + "2023"), two-floor dims + party wall, wall insulation, roof, floor, window m², + doors, lighting, boiler PCDB ref + search query, MEV/AP50 if present. +3. 
Run pages: `for p in property_description [flats] dimensions walls roofs floors + openings ventilation; do … build_.py $p; done` (one Save&Close each, + ~1 min/page; flats only for Flat property type). Then a window-verify/fix + snippet (re-add the combined window if the grid shows 0.00), then + `build_.py space_heating` and `water_heating`. +4. Heating uses the `elmhurst_lib` helpers: `E.select_boiler(page, "", + "")` (look up id/type in `domain/elmhurst/pcdb_gas_oil_boiler_codes.csv`; + the lodged `main_heating_index_number` IS the id); control + `E.set_heating_dialog(page, "…ButtonMainHeatingControls", "^Boilers", + "^Standard", "CBE Programmer, room thermostat and TRVs")` (=2106); water + `E.set_heating_dialog(page, "…ButtonWaterHeatingCode", "From Space Heating", + "From the primary heating system")`; combi → `E.clear_hot_water_cylinder(page)`. +5. Download: edit `elmhurst_download.py` `SAMPLE_DIR` to the cert's + `/uprn_` dir; first confirm Recommendations is clean (parse + `[id*=ContentPlaceHolder1] a` link text — Summary silently redirects to Address + until zero errors); `python scripts/hyde/elmhurst_download.py` (retry once; the + nav goes Address→Recommendations→Summary). +6. `PYTHONPATH=/workspaces/model python scripts/compare_epc_paths.py ` → + read **"gov-API inputs → SAP"** (engine) and **"Elmhurst's OWN engine + (worksheet …)"** (Elmhurst ground truth). Target ≤0.5–1. +7. **Pin** the engine value: add a `RealCertExpectation(schema, sample=uprn_, + cert_num, sap_score=)` in `tests/.../test_real_cert_sap_accuracy.py`; + run `…::test_real_cert_sap_score`. +8. Tick the worklist line `[x] … · eng X / elm Y (lodged Z) · PINNED …`. Next cert. + +See **Banked findings** below for the modal-dialog mechanics (all already encoded +in the helpers). New schema not mappable → add a dedicated `from_*_schema_*` +mapper first (per-schema convention) + guard with the RdSAP-21.0.1 corpus gauge. + ## The loop (one UPRN) 1. 
**Pick** the first `[ ]` UPRN in [worklist.md](worklist.md). @@ -93,6 +161,112 @@ Pattern: `with E.session() as (ctx,page): E.goto(...); E.set_text/set_select(... lines 17/18). This drove the first campaign mapper fix — see Banked findings. ## Banked findings (fold new ones in here as the corpus grows) +- **MAPPER GAP — cylinder insulation thickness dropped (RdSAP-17.0+):** the mapper + carries `cylinder_size` + `cylinder_insulation_type` but NOT + `cylinder_insulation_thickness` → `EpcPropertyData.sap_heating.cylinder_ + insulation_thickness_mm` stays None even when the cert lodges it (e.g. 50 mm). + The engine then assumes a poorly-insulated cylinder → over-counts HW cylinder + loss → under-rates. Confirmed uprn_68151071 (raw 50 mm → mapped None; engine HW + 3446 vs Elmhurst 2911 kWh; engine 68 vs lodged 70 / Elmhurst 71). FIX: map the + thickness in the RdSAP per-schema mapper; check blast radius (any pinned cylinder + cert may shift) + regress the RdSAP-21.0.1 corpus gauge. Leverage point — likely + improves every cylinder-with-lodged-thickness cert. Flagged, not yet fixed. +- **Party-wall type `_pick` gotcha:** matching `"filled"` ALSO matches "Cavity + masonry **UN**filled" (CU, U≈0.5) — the wrong type for a cert whose party wall is + U≈0. Match `"masonry filled"` to hit CF (Cavity masonry filled, U≈0). Affects + cavity builds (10090844932 / 10091568921 / 10093718424 / 10093412452 used the + loose `"filled"` and may have got CU in Elmhurst — only the Elmhurst cross-check, + not the pinned engine value which is validated against lodged). Fixed for + uprn_10093115480. +- **Shared-assessment reset: storage/electric → boiler cert.** The shared + assessment carries the PRIOR cert's heating system. Going storage→combi, the + boiler search dialog won't open while a SAP-table `MainHeatingCode` (e.g. SEB) is + set. Fix: JS-clear `MH1.TextBoxMainHeatingCode` to `''` + dispatch change, Save & + Close, then `select_boiler` works. 
Also RESET the electricity meter on the Meters + sub-tab (a prior off-peak cert leaves it Dual; gas certs want Single) and the + SECONDARY heating (a prior cert's secondary persists even when the calc shows + presence=No — set it explicitly or it pollutes the worksheet). +- **Secondary heating must be built in Elmhurst when lodged.** Certs lodge a + secondary (`sap_heating.secondary_heating_type`, e.g. 612 = mains-gas room heater + @ Table 4a seasonal efficiency 0.20 — a low-eff decorative/old gas fire). The + engine models it (fraction 0.1 from Table 11 ÷ the secondary efficiency → e.g. + 3065 kWh for code 612); omitting it in Elmhurst inflates the worksheet (uprn_ + 10023443426: omitted → 79 vs engine/lodged 76). Build via Secondary present=Yes → + `ButtonSecondaryHeatingCode` cascade (title "Select secondary heating"): fuel → + sub-fuel → appliance → type, e.g. Gas → Mains gas → Room Heaters → RGx (pick the + RGx whose efficiency matches the lodged code, ~0.20 = decorative/old gas fire). + ⚠ For a NATIVE RdSAP-21.0.1 cert, engine = lodged (exact, all components) is the + authoritative validation — if the Elmhurst rebuild diverges, suspect an omitted + lodged feature (secondary / meter), confirm via engine-on-Elmhurst-inputs ≈ + worksheet, and pin the engine = lodged value. + ⚠ **`present=No` does NOT clear the shared assessment's leftover secondary** — the + prior cert's secondary SYSTEM (fuel + SAP code) persists server-side even when the + UI dropdown reads "No", and it silently re-enters the worksheet. On a storage-heater + cert (uprn_100020665611, RdSAP-20.0.0 end-terrace house) a leftover House-Coal + closed-room-heater (SAP 633, cheap solid fuel) inflated the Elmhurst worksheet to + 44 vs engine 36 / lodged 37 until OVERWRITTEN. 
Always set the secondary EXPLICITLY + to the lodged appliance: storage-heater certs lodge "Portable electric heaters + (assumed)" → present=Yes → `ButtonSecondaryHeatingCode` cascade Electric → Electric + → Room Heaters → "REA Panel, convector or radiant heaters" (= SAP 691, eff 100%). + That dropped the worksheet 44 → 35 ≡ engine-on-Elmhurst-inputs 35 (faithful). +- **Openings: standard double/triple-glazing bands REQUIRE Frame Type + Glazing Gap.** + The window grid's `DropDownListExtFrameType` (PVC/Wood/Metal) and + `DropDownListExtGlazingGap` (6 mm / 12 mm / 16 mm or more) are required for any + band other than the new-build "Double post or during 2022" default — leaving them + empty fails the Recommendations gate ("Openings: Frame Type / Glazing Gap must be + entered"). Set both in `_add_window` BEFORE clicking Add (cert `pvc_window_frames` + → PVC; `glazing_gap` mm → the band). Glazing bands available: Single / Double|Triple + {pre 2002, between 2002 and 2021, post or during 2022, with unknown install date} / + Secondary / *…known data*. For RdSAP double-glazing with no lodged install year + (engine U 2.8) pick "Double pre 2002" (~U3.1, sub-1-SAP diff) — the "…known data" + options demand per-row U/g values. +- **Non-boiler (storage-heater) main heating — SOLVED** (uprn_10022893721, RdSAP- + 18.0 GF flat, electric storage heaters + immersion). The SpaceHeating page has NO + inline system-type selector and a `ButtonMainHeatingCode` button only APPEARS + once the bound PCDF boiler is cleared. Two-pass recipe (one Save & Close each): + 1. **Clear the leftover PCDB boiler**: set `MH1.TextBoxPCDFBoilerReference` to + `"0"` via JS + dispatch `change`, then `save_close`. (It doesn't AutoPostBack; + the Save commits it. After reload, boilerRef="0", the boiler fuel/flue fields + vanish, and `MH1.ButtonMainHeatingCode` is now present.) + 2. 
**Select the SAP-table system** via `set_heating_dialog(..ButtonMainHeating + Code, "^Electric","^Electric","Storage","SEB Modern slimline")` (title "Select + heating code"; L1 Gas/Oil/Solid Fuel/Electric/Community/No heating; storage L4 + = SEA old large-volume / SEB modern slimline / SED fan / SEJ integrated / SEK + high-heat-retention). **Match the cert's `sap_main_heating_code`**: 402 = SEB. + Then the CONTROL via `set_heating_dialog(..ButtonMainHeatingControls, + "Storage Radiator","CSA Manual charge control")` (= SAP 2401). Secondary No. + Water: `set_heating_dialog(..ButtonWaterHeatingCode, "Water Heater","^Electric", + "Immersion")` (→ code HEI) — then the immersion code REQUIRES a cylinder: CHECK + `WH.CheckBoxHotWaterCylinder` (JS click+change, AutoPostBacks) or Recommendations + errors "must have a Hot Water Cylinder"; set `WH.DropDownListCylinderSize` + (size 2→Normal/110L, 3→Medium/160L, 4→Large/210L), `WH.DropDownListInsulated` + (Foam/Jacket), `WH.DropDownListInsulationThickness`, and `WH.RadioButton + ListImmersionHeater` (off-peak meter → Dual). See `build_10022893721.py` as the + template. ⚠ **CRITICAL — set the electricity meter type.** All-electric off-peak + certs MUST set the Economy-7 meter or Elmhurst silently defaults to Single / + Standard and prices everything at the 13.19p standard rate (worksheet collapses + ~13 SAP). The control is a HIDDEN sub-tab on the SpaceHeating page: + `E.click_tab(page,"TabContainer_TabPanelMeters")` then `E.set_select(page, + "TabContainer_TabPanelMeters_RadioButtonListElectricityType","Dual")` (options + Single/Dual/18 Hour/24 Hour/Unknown; cert `meter_type` 1→Dual=7-hour off-peak). + Verify in Elmhurst summary §14.2 "Electricity meter type" / worksheet + "Electricity Tariff" before trusting the worksheet SAP. +- **Economy-7 off-peak pricing is CORRECT — do NOT "fix" it** (was a real bug, FIXED + in PR #1217 via the Table 13 off-peak water-heating split + window-U fix). 
The + engine applies SAP Table 12a Grid 1 (space) + Table 13 (immersion DHW) high/low + splits properly: storage heaters' SH high-rate fraction is legitimately 0.00 + (100% low rate), immersion HW takes the volume/occupancy/single-dual blend. Proof: + uprn_10002468137 (canonical) engine 60.92 = Elmhurst 61 to the penny; + uprn_10022893721 engine 79 = lodged 79, Elmhurst (Dual meter) 81. ⚠ If you see a + big engine-over-Elmhurst gap on an all-electric off-peak cert, SUSPECT THE BUILD + (Elmhurst meter left on Single — see meter step above), not the calculator. The + `reference/mapping.md` "known over-rating bug (engine 62)" note is STALE (pre-PR + #1217). `cert_to_inputs.py` `_hot_water_fuel_cost_gbp_per_kwh` applies the immersion + Table 13 blend when cylinder volume + occupancy resolve; when `immersion_heating_ + type` is UNLODGED on an off-peak meter it now defaults to DUAL per RdSAP §10.5 + (was a 100%-low-rate fallback that under-costs — fixed, +regression test + `test_off_peak_immersion_unlodged_type_defaults_to_dual_table13_blend`). Only an + unresolvable cylinder volume / occupancy still falls back to 100% low. - **Air-permeability AP50 fix** (uprn_10093116528, the first campaign cert): the full-SAP mapper was routing the lodged q50 to the engine's AP4/Pulse formula (`0.263×AP4^0.924`) instead of the AP50 `/20` path — a big infiltration @@ -105,13 +279,78 @@ Pattern: `with E.session() as (ctx,page): E.goto(...); E.set_text/set_select(... default. Pin the engine's observed value; document the Elmhurst delta (don't tune). - **FGHRS** (full-SAP `has_fghrs`/`fghrs_index_number`) is dropped by the mapper and not yet modelled — omit it on BOTH sides so the comparison stays clean. +- **Validation errors live on the Recommendations page as LINKS** (red ✗ anchors), + not coloured spans — parse `[id*=ContentPlaceHolder1] a` text, not CSS colour. 
+ Until Recommendations shows ZERO errors the **Energy Report Summary nav silently + redirects to the Address page** and the worksheet/Results PDFs won't generate + (Results renders empty). Two errors that bit the 16.1/19.1.0 builds: *"Walls + (Main): Insulation Thickness must be entered"* (set the insulation-thickness + dropdown — "Unknown" is fine for a reduced cert) and *"Incorrect Controls (NNNN + ctrl-group-X) for Heating System (idx ctrl-group-Y)"* (the heating control's + system-type must match the boiler's — see the controls dialog below). +- **All Elmhurst modal dialogs sit above a `modalBackground` that defeats element + clicks** (Playwright sees it "intercepting pointer events", even with force). + The cracked pattern is now in `elmhurst_lib.py`: open via JS click, set cascade + ` at FP+suffix — lets a + sweep enumerate the *real* Elmhurst options instead of hard-coding label + strings that drift between RdSAP releases. Skips the blank placeholder.""" + return page.evaluate( + """(id)=>{const s=document.getElementById(id); if(!s)return []; + return [...s.options].map(o=>[o.value,(o.text||'').trim()]) + .filter(([v,t])=>v!=='' && t!=='');}""", + f"{FP}{suffix}", + ) + + +def select_by_contains(page: Page, suffix: str, needle: str, autopostback: bool = True) -> Optional[str]: + """Select the option whose visible text contains `needle` (case-insensitive), + by its value. Returns the chosen text, or None if no option matched — robust + to label drift where the exact string ("CA Cavity" vs "Cavity (CA)") is + uncertain until seen live.""" + for value, text in select_options(page, suffix): + if needle.lower() in text.lower(): + set_select(page, suffix, value, autopostback) + return text + return None + + def save_close(page: Page) -> None: """Save & Close — commits the whole server session to the DB (so the data survives the next fresh-login run). 
The body button id-suffix avoids the @@ -226,3 +251,173 @@ def delete_first_window(page: Page) -> None: page.wait_for_timeout(200) if window_row_count(page) < before: break + + +# --- Heating dialogs (boiler search + cascade) --------------------------- +# Hard-won (campaign certs 100021943298 / 10096028301): every Elmhurst modal +# dialog (PCDF boiler search, the SelectHeatingDialog cascade for water-heating +# method and main-heating controls) sits ABOVE a `modalBackground` overlay that +# makes Playwright's actionability hit-test see the background "intercepting +# pointer events" — so element .click()/.fill()/.select_option() all fail even +# with force. The reliable pattern is: +# * open the dialog with a JS .click() on its button, +# * set