docs: session-9 sweep — six API-error candidates ruled out (no shipped fix)

Profile-driven re-sweep at HEAD da094feb. Every biased/error-carrying bucket
chased field-by-field resolved to proxy / already-deproven / already-fixed:
roof code 5/8 (same u_roof as code 4), per-bp age mapping (correct),
'(same dwelling above)' roofs (5 certs, 4 fine), index-less MEV gas
(centred by e6dda705 to signed +0.09), wit=4 cavity -0.25 (tail-driven),
community whc=903 HW (= deproven meter_type=3). The +32 outlier 2958 is
per-cert (twin 3420 is +0.18). Residual is a broad per-cert fabric+HW tail.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Khalim Conn-Kowlessar 2026-06-09 09:24:40 +00:00
parent da094feb62
commit ddb9fdbec5

@@ -381,6 +381,39 @@ bug, not noise):
on unsupported schema 19.1.0; type 4 already accurate). `shower_wwhrs` 1/2/3/4 = none / inst-
WWHRS-1 / inst-WWHRS-2 / storage. Low headline value — not worth pursuing.
### SESSION-9 — profile sweep, six candidates ruled out (no shipped fix)
Re-ran `profile_api_error.py --min-n 10` at HEAD `da094feb` and chased every
biased/error-carrying bucket field-by-field. All resolved to proxy / already-deproven /
already-fixed — **no clean systematic bug found at this resolution**:
- **roof_construction code 5/8 (vaulted/sloping) under-rate** (gas: `5` -0.67, `4,5` -0.59,
`4,8` -0.85): a PROXY. `u_roof` returns the SAME value for code 4 and code 5 at a given
age/thickness (band G ND → 0.40 both; verified in `rdsap_uvalues.u_roof`). Code-5 certs
correlate with electric HW / flats / multi-part dwellings — those are the cause, not the roof.
- **age_band dropped on the roof path:** NOT a bug. Doc-level `construction_age_band` is None for
ALL 909 certs (age lives per-building-part); the mapper reads the bp-level band correctly
(`heat_transmission.py:735 part.construction_age_band`; cert 2270 → bp0 age=G mapped fine).
- **roof description "(same dwelling above)"** (a heated dwelling above → should be zero-loss
internal element): only 5 certs carry it and 4 are already within 0.5. Cert 0700 (6.37) is a
messy 4-building-part data-fidelity cert (mixed Flat-no-insulation + Roof-room parts), not a
"same dwelling above" bug. Not systematic.
- **index-less MEV gas residual (the post-e6dda705 "next lead"):** RESOLVED by e6dda705. The 17
index-less-MEV gas certs are now signed +0.09 / median +0.43 / 41% within-0.5 — centred scatter,
not a systematic over-rate. The ~6 certs at +1.0..+1.9 are per-cert, no common cause.
- **wit=4 (as-built cavity) gas -0.25 over 478 certs:** tail-driven, not a uniform Table-6 shift.
Splitting by age band, the worst bands (B -0.54, F -0.60, G -1.00) still hold 70-77% within-0.5
— a few big-negative outliers drag the mean; correcting a Table-6 value would over-shift the
majority that are already accurate.
- **community-main + whc=903 electric-immersion HW under-cost (6.3, n=3):** = the deproven
**meter_type=3** data-fidelity issue. All three (0380/2270/2673) carry meter_type=3; the HW
electricity tariff ambiguity (lodged HW ≈17.3 p/kWh vs our standard ≈22.36 p) is exactly the
Unknown-meter artifact already on this list.
- **The biggest +32 outlier 2958** (fuel=0 / sapcode=699): per-cert, NOT a class bug — cert 3420
has an identical heating profile and sits at +0.18. 2958's error is fabric/geometry-specific.
Method note: `decompose_api_cost_error.py` cluster table = heat:high 311 (47% within) /
heat:low 206 / hw:low 173 / hw:high 125 / balanced 94 — i.e. the residual is a broad per-cert
fabric+HW tail, not one lever. The clean systematic wins (sheltered walls, MEV, fuel collisions)
are harvested; remaining headway is per-cert worksheet grind or the 100-cert schema big-ticket.
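
The "tail-driven, not a uniform shift" reading used for the wit=4 bucket can be illustrated with a minimal sketch. This is not code from `profile_api_error.py`: the helper name `summarise_bucket`, the 0.5 tolerance default, and the sample errors are all hypothetical and synthetic. It only shows how a clearly negative signed mean can coexist with a median near zero and a high within-0.5 share.

```python
from statistics import mean, median


def summarise_bucket(errors, tol=0.5):
    """Summarise signed per-cert errors for one bucket (hypothetical helper).

    Returns the count, signed mean, median, and the share of certs whose
    absolute error is within `tol`.
    """
    if not errors:
        raise ValueError("bucket is empty")
    within = sum(1 for e in errors if abs(e) <= tol)
    return {
        "n": len(errors),
        "signed_mean": mean(errors),
        "median": median(errors),
        "within_share": within / len(errors),
    }


# Synthetic errors, NOT real cert data: eight accurate certs plus two
# big-negative outliers that drag the mean.
synthetic = [0.1, -0.2, 0.0, 0.3, -0.1, 0.2, -0.3, 0.1, -4.0, -3.5]
stats = summarise_bucket(synthetic)
print(stats)
```

On this synthetic bucket the signed mean is about -0.74 while the median is about -0.05 and 80% of entries sit within 0.5. A uniform correction sized to the mean would push most of the already-accurate entries out of tolerance, which is the argument made against shifting a Table-6 value.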
## THE 100 unsupported_schema CERTS (deferred — bigger ticket)
SAP-Schema-19.1.0 (and other pre-21). The user is planning a separate big piece: map old schemas
→ new + **predict missing fields from similar-looking properties** (needs an EPC-prediction