mirror of https://github.com/Hestia-Homes/Model.git
synced 2026-06-30 13:10:47 +00:00
docs: session-5 handover — as-built cavity-U fix (48.6 → 52.1%)
Adds the cavity wall-U slice to the SESSION-5 block + headline table; records the by-age-band re-split method that surfaced the G/H spike.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
parent 2e466ed1e6
commit 898dcfda18
1 changed file with 34 additions and 14 deletions
@@ -13,18 +13,35 @@ deproven approaches + the meter/shower data-fidelity findings), and the earlier
 `energy_rating_current`. Headline gauge:
 `PYTHONPATH=/workspaces/model python scripts/eval_api_sap_accuracy.py`.
 
-| metric | session-3 (`a8e5563a`) | session-4 (`faf29942`) | **session-5 (`43d4c67d`)** |
+| metric | session-3 (`a8e5563a`) | session-4 (`faf29942`) | **session-5 (`2e466ed1`)** |
 |--------|------------------|------------------|------------------|
-| **% \|err\| < 0.5** | 45.1% | 47.6% | **48.6%** |
-| % \|err\| < 1.0 | 59.4% | 62.6% | **63.8%** |
-| % \|err\| < 2.0 | 77.7% | 79.6% | **79.9%** |
-| mean \|err\| | 1.702 | 1.586 | **1.561** |
+| **% \|err\| < 0.5** | 45.1% | 47.6% | **52.1%** |
+| % \|err\| < 1.0 | 59.4% | 62.6% | **67.2%** |
+| % \|err\| < 2.0 | 77.7% | 79.6% | **80.7%** |
+| mean \|err\| | 1.702 | 1.586 | **1.497** |
+| median \|err\| | — | — | **0.475** |
 | computed / raises | 909 / 0 | 909 / 0 | **909 / 0** |
 | unsupported_schema | 100 (deferred) | 100 (deferred) | 100 (deferred) |
 
-## SESSION-5 UPDATE (HEAD `43d4c67d`) — whc=903 immersion off-peak HW closed
+## SESSION-5 UPDATE (HEAD `2e466ed1`) — whc=903 immersion HW + as-built cavity-U both closed
 
-**Shipped (47.6 → 48.6%):** one spec-grounded fix, the session-4 robust-sweep `whc=903`
+**Shipped (47.6 → 52.1%, two spec-grounded fixes):**
+
+**(2) `2e466ed1` as-built "insulated (assumed)" cavity → Cavity-as-built row, not Filled cavity
+(48.6 → 52.1%, the bigger win).** Robust-sweep lead: `wall_desc="Cavity wall, as built, insulated
+(assumed)"` median +0.26, n=145, but split by age band it was a CLEAN G/H signal (G +1.38 n37,
+H +1.61 n18; I-L neutral). RdSAP 10 Table 6 (England, p.41) "Filled cavity" row carries the † footnote
+("assumed as built") ONLY at bands I-M, where it equals "Cavity as built"; at A-H the filled row is a
+GENUINE fill. An as-built cavity (type 4) must use "Cavity as built" at all bands (G/H 0.60 not 0.35).
+This was the SAME latent A-H bug slice S0380.210 fixed for "partial insulation (assumed)" but left for
+"insulated (assumed)" by a legacy convention. Retired `_cavity_described_as_filled`; genuine fills
+(wall_insulation_type=2) still hit the filled row. Per-band confirmation: I-M unchanged, G/H corrected
+exactly. Bucket within-0.5 47% → 66%; eval +32 net (36 improved, 4 regressed — offsetting-error
+electric-storage flats). 3 tests updated to the corrected behaviour (the legacy tests literally said
+"we follow the legacy convention for parity").
+
+**(1) `43d4c67d` WHC-903 electric immersion off-peak HW → SAP 10.2 Table 13 high-rate fraction
+(47.6 → 48.6%).** The session-4 robust-sweep `whc=903`
 lead (median +0.87, n=84).
 - `43d4c67d` **WHC-903 electric immersion off-peak HW → SAP 10.2 Table 13 high-rate fraction.**
 Was billing 100% at the off-peak low rate; Table 12a "Immersion water heater" row (p.191) routes
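The row-selection rule described in fix (2) above can be sketched in Python as follows. This is a minimal illustration, not the repository's code: the function name `cavity_wall_u`, the constant names, and the two lookup dicts are hypothetical, and the tables hold only the G/H values quoted in the handover text (0.60 as built, 0.35 filled). Every other age band is deliberately left out rather than guessed.

```python
# Hypothetical sketch of the corrected row selection for cavity walls.
# Only the G/H U-values quoted in the handover text are included; the
# real RdSAP 10 Table 6 covers every age band and is not reproduced here.
CAVITY_AS_BUILT_U = {"G": 0.60, "H": 0.60}
FILLED_CAVITY_U = {"G": 0.35, "H": 0.35}

FILLED = 2    # wall_insulation_type for a genuine fill (per the text)
AS_BUILT = 4  # wall_insulation_type for an as-built cavity (per the text)


def cavity_wall_u(age_band: str, wall_insulation_type: int) -> float:
    """Pick the table row from the insulation type, never from the description.

    An as-built cavity always reads the "Cavity as built" row, even when the
    description says "insulated (assumed)". Only a genuine fill reads the
    "Filled cavity" row.
    """
    if wall_insulation_type == FILLED:
        row = FILLED_CAVITY_U
    elif wall_insulation_type == AS_BUILT:
        row = CAVITY_AS_BUILT_U
    else:
        raise ValueError(f"unhandled wall_insulation_type: {wall_insulation_type}")
    if age_band not in row:
        # Bands outside the illustrative subset are not guessed.
        raise KeyError(f"age band {age_band!r} is not in this illustrative subset")
    return row[age_band]


if __name__ == "__main__":
    print(cavity_wall_u("G", AS_BUILT))  # as-built cavity at band G
    print(cavity_wall_u("G", FILLED))    # genuine fill at band G
```

Keying the lookup on the insulation type rather than on the free-text description is the point of the sketch: the retired `_cavity_described_as_filled` path is what let "insulated (assumed)" reach the filled row at bands A-H.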
@@ -40,13 +57,16 @@ lead (median +0.87, n=84).
 carries Table 13's small fraction → matches the over-rating direction; the single mapping
 overshot in a prototype (cohort within-0.5 16% → 14%). The description-vs-code-audit lesson
 again: skeptical of unverified handover code-semantics claims.
-- **Next robust leads (post-fix sweep, ranked by net directional skew + MEDIAN):** all now
-UNDER-rate clusters (negative median = fabric/flat scatter, per-cert not one-bug): `property_type=2`
-flats −0.31 (n=283), `wall_construction=3` −0.28 (n=221), roof "(another dwelling above)" −0.32
-(n=182), floor "(another dwelling below)" −0.35 (n=185). The remaining OVER-rate buckets are small:
-"Cavity wall, as built, insulated" +0.26 (n=145), "Solid, no insulation" +0.13 (n=304). whc=903 has
-dropped off the top of the sweep. `main_heat_cat=7` electric-storage (median +1.05, n=41) is still
-open (tariff/cost; partly artifact) — was the session-4 #2 lead, untouched this session.
+- **Next robust leads (post-BOTH-fixes sweep, ranked by net directional skew + MEDIAN):** every
+top bucket is now an UNDER-rate cluster (negative median = fabric/flat scatter, per-cert not one-bug):
+`property_type=2` flats med −0.39 (n=283, netDir +75), roof "(another dwelling above)" −0.46 (n=182),
+`wall_desc="Solid brick, as built, no insulation"` −0.22 (n=114). No clean OVER-rate single-cause
+bucket remains (cavity-insulated dropped to −0.13, main_heat_cat=7 to −0.31, whc=903 off the top —
+all addressed). The flats under-rate is the biggest front but DIFFUSE (fabric/tariff per-cert) — likely
+needs worksheets, not one rule. The 100 unsupported-schema certs remain the deferred big ticket.
+METHOD NOTE: the cavity win came from splitting the +0.26 bucket BY AGE BAND — the mild median hid a
+sharp G/H spike. When a description bucket has a modest median but a plausible single mechanism,
+re-split by age band / sub-field before dismissing it as scatter.
 
 **SESSION-4 shipped (45.1 → 47.6%):** four spec-grounded fixes + closed one false lead.
 See the `## SESSION-4 …` blocks below and the auto-memory for full detail. The systematic bias
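The re-split described in the METHOD NOTE can be sketched as a small grouping helper. This is a hedged illustration using only the standard library: the helper name `resplit`, the row layout (`age_band`, `err`), and the sample rows are all invented for the example and are not data from the eval. The rows are synthetic and chosen only to show how a modest overall median can hide a concentrated spike in a few bands.

```python
from collections import defaultdict
from statistics import median


def resplit(rows, field):
    """Group signed errors by `field` and report (count, median) per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[field]].append(row["err"])
    return {key: (len(errs), median(errs)) for key, errs in sorted(groups.items())}


# Synthetic rows for one description bucket (NOT real eval output).
rows = [
    {"age_band": "G", "err": 1.5},
    {"age_band": "G", "err": 1.25},
    {"age_band": "H", "err": 1.75},
    {"age_band": "J", "err": 0.25},
    {"age_band": "J", "err": 0.5},
    {"age_band": "K", "err": 0.0},
    {"age_band": "K", "err": -0.25},
    {"age_band": "L", "err": 0.25},
    {"age_band": "L", "err": 0.0},
]

overall = median(row["err"] for row in rows)
by_band = resplit(rows, "age_band")

if __name__ == "__main__":
    print(f"overall median: {overall:+.2f} (n={len(rows)})")
    for band, (n, med) in by_band.items():
        print(f"  band {band}: median {med:+.2f} (n={n})")
```

The same helper works for any sub-field (for example a hypothetical `wall_insulation_type` key), which matches the note's advice to re-split by age band or sub-field before writing a bucket off as scatter.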