diff --git a/docs/HANDOVER_API_PROFILING.md b/docs/HANDOVER_API_PROFILING.md
index c31f2915..930bd71b 100644
--- a/docs/HANDOVER_API_PROFILING.md
+++ b/docs/HANDOVER_API_PROFILING.md
@@ -381,6 +381,39 @@ bug, not noise):
 on unsupported schema 19.1.0; type 4 already accurate). `shower_wwhrs` 1/2/3/4 = none / inst-
 WWHRS-1 / inst-WWHRS-2 / storage. Low headline value — not worth pursuing.
 
+### SESSION-9 — profile sweep, six candidates ruled out (no shipped fix)
+Re-ran `profile_api_error.py --min-n 10` at HEAD `da094feb` and chased every
+biased/error-carrying bucket field-by-field. All resolved to proxy / already-deproven /
+already-fixed — **no clean systematic bug found at this resolution**:
+- **roof_construction code 5/8 (vaulted/sloping) under-rate** (gas: `5`−0.67, `4,5`−0.59,
+  `4,8`−0.85): a PROXY. `u_roof` returns the SAME value for code 4 and code 5 at a given
+  age/thickness (band G ND → 0.40 both; verified in `rdsap_uvalues.u_roof`). Code-5 certs
+  correlate with electric HW / flats / multi-part dwellings — those are the cause, not the roof.
+- **age_band dropped on the roof path:** NOT a bug. Doc-level `construction_age_band` is None for
+  ALL 909 certs (age lives per-building-part); the mapper reads the bp-level band correctly
+  (`heat_transmission.py:735 part.construction_age_band`; cert 2270 → bp0 age=G mapped fine).
+- **roof description "(same dwelling above)"** (a heated dwelling above → should be zero-loss
+  internal element): only 5 certs carry it and 4 are already within 0.5. Cert 0700 (−6.37) is a
+  messy 4-building-part data-fidelity cert (mixed Flat-no-insulation + Roof-room parts), not a
+  "same dwelling above" bug. Not systematic.
+- **index-less MEV gas residual (the post-e6dda705 "next lead"):** RESOLVED by e6dda705. The 17
+  index-less-MEV gas certs are now signed +0.09 / median +0.43 / 41% within-0.5 — centred scatter,
+  not a systematic over-rate. The ~6 certs at +1.0..+1.9 are per-cert, no common cause.
+- **wit=4 (as-built cavity) gas −0.25 over 478 certs:** tail-driven, not a uniform Table-6 shift.
+  Splitting by age band, the worst bands (B −0.54, F −0.60, G −1.00) still hold 70–77% within-0.5
+  — a few big-negative outliers drag the mean; correcting a Table-6 value would over-shift the
+  majority that are already accurate.
+- **community-main + whc=903 electric-immersion HW under-cost (−6.3, n=3):** = the deproven
+  **meter_type=3** data-fidelity issue. All three (0380/2270/2673) carry meter_type=3; the HW
+  electricity tariff ambiguity (lodged HW ≈17.3 p/kWh vs our standard ≈22.36 p) is exactly the
+  Unknown-meter artifact already on this list.
+- **The biggest +32 outlier 2958** (fuel=0 / sapcode=699): per-cert, NOT a class bug — cert 3420
+  has an identical heating profile and sits at +0.18. 2958's error is fabric/geometry-specific.
+Method note: `decompose_api_cost_error.py` cluster table = heat:high 311 (47% within) /
+heat:low 206 / hw:low 173 / hw:high 125 / balanced 94 — i.e. the residual is a broad per-cert
+fabric+HW tail, not one lever. The clean systematic wins (sheltered walls, MEV, fuel collisions)
+are harvested; remaining headway is per-cert worksheet grind or the 100-cert schema big-ticket.
+
 ## THE 100 unsupported_schema CERTS (deferred — bigger ticket)
 SAP-Schema-19.1.0 (and other pre-21). The user is planning a separate big piece: map old schemas
 → new + **predict missing fields from similar-looking properties** (needs an EPC-prediction