mirror of
https://github.com/Hestia-Homes/Model.git
synced 2026-06-08 11:17:27 +00:00
docs: refresh handover + cert 0240 notes after Slice 95
Status: Slice 95 closed Layer 4 (API → cascade SAP) on cert 001479 at < 1e-4 vs worksheet 69.0094. Production goal MET; the `test_api_001479_full_chain_sap_matches_worksheet_pdf_exactly` test formalises this gate.

Updates to keep the next agent honest:

- NEXT_AGENT_PROMPT: header + status table + cumulative SAP delta table + "First action" + epilogue all reflect Slice 95's close-out.
- NEXT_AGENT_PROMPT §4 (Outlier golden cert investigations): rewrote the cert 0240 entry. The earlier "Type-1 RR gable_wall_lengths not extracted" claim is stale: mapper.py:1349-1369 already extracts them (Slices 71-86). The -15 SAP residual is a mix, dominated by the windows subsystem (11 windows × 18.28 m² with default U≈2.27 because Slice 93's `_API_GLAZING_TYPE_TO_TRANSMISSION` only covers glazing codes 3 and 13; cert 0240 lodges code 2). Surfacing glazing_type=2 (and likely other unmapped codes) is the biggest single-slice leverage point, and would touch 6035 too.
- test_golden_fixtures.py cert 0240 `notes:` field: replaced the stale RR hypothesis with the actual cascade subsystem breakdown and the glazing_type-2 surfacing recommendation.

No production code changed; docs and a `_GoldenExpectation.notes` string only. test_golden_fixtures.py stays GREEN (14 passed).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
parent
f502db8c74
commit
b2c6a57247
2 changed files with 87 additions and 51 deletions
|
|
@ -1,9 +1,10 @@
|
|||
# Handover — API mapper at 1e-3 on cert 001479, closing to 1e-4
|
||||
# Handover — API mapper at 1e-4 on cert 001479; investigating goldens
|
||||
|
||||
You are picking up branch `ara-backend-design-prd`. The cert 001479 API
|
||||
path is at SAP delta **+0.0006** (was +3.08); fabric heat loss is
|
||||
EXACT. The remaining work is closing the sub-1e-3 gap and validating
|
||||
against more cert pairs.
|
||||
path now hits the worksheet's continuous SAP 69.0094 **at < 1e-4**
|
||||
(Slice 95). Layer 4 production goal is MET. Remaining work: investigate
|
||||
golden cert residual outliers (especially cert 0240's -15 SAP) and
|
||||
process any new (Summary + API) cert pairs the user sources.
|
||||
|
||||
## The end goal (re-confirmed by the user)
|
||||
|
||||
|
|
@ -36,8 +37,8 @@ Layer 4: API mapper cascade SAP = worksheet SAP at 1e-4 (production goal)
|
|||
|---|---|
|
||||
| **1 — hand-built cascade pin** | ✅ 6 cohort certs (000474, 000477, 000480, 000487, 000490, 000516) GREEN at 1e-4; cert 001479 hand-built skeleton (Slice 62) still RED (2 of 11 pins green, hand-built has its own bugs — orthogonal to the production path) |
|
||||
| **2 — Elmhurst-mapped path** | ✅ **Cert 001479 GREEN at 1e-4** (Slice 89); cohort: 2 GREEN (000477, 000516), 4 RED (000474, 000480, 000487, 000490 — Elmhurst U985 worksheets violate the RdSAP 10 §5 (12) spec; orthogonal to the production goal) |
|
||||
| **3 — API-mapped ≡ Elmhurst-mapped (field-level)** | 🟡 Cascade outputs match within 1e-3 SAP; field-level diff test not yet written |
|
||||
| **4 — API path cascade SAP** | 🟡 **Cert 001479 at +0.0006 SAP delta from worksheet** (was +3.08); 9 other golden certs pinned at residual-from-integer at tolerance 0 |
|
||||
| **3 — API-mapped ≡ Elmhurst-mapped (field-level)** | 🟡 Cascade outputs match at 1e-4 (Slice 95); field-level diff test not yet written but lower priority since cascade-output gate exists |
|
||||
| **4 — API path cascade SAP** | ✅ **Cert 001479 GREEN at 1e-4** (Slice 95). `test_api_001479_full_chain_sap_matches_worksheet_pdf_exactly` formalises the gate. 8 other golden certs pinned at residual-from-integer at tolerance 0 |
|
||||
|
||||
## Cumulative API SAP delta progression (cert 001479)
|
||||
|
||||
|
|
@ -57,7 +58,8 @@ spec rule's premise to be met. Each slice closed one gap:
|
|||
| 91 | descriptive strings via int→str lookups (`floor_construction_type`, `roof_construction_type`) + pre-1950 PS sloping → thickness=0 + per-bp roof description fix | +1.0970 |
|
||||
| 92 | upper-floor `room_height_m += 0.25` + `is_exposed_floor` from `floor_heat_loss==1` + `floor_insulation_thickness="NI"→None` | +1.0022 |
|
||||
| 93 | `window_transmission_details` from `glazing_type` int (code 3 → U=2.8/g=0.76, code 13 → U=1.4/g=0.72) | +1.1846 |
|
||||
| 94 | `sheltered_sides` from API `built_form` + `floor_type` from `floor_heat_loss==7` | **+0.0006** |
|
||||
| 94 | `sheltered_sides` from API `built_form` + `floor_type` from `floor_heat_loss==7` | +0.0006 |
|
||||
| 95 | API mapper `total_floor_area_m2` = Σ per-bp dims (worksheet-precise 68.51 not lodged-rounded 69) + RdSAP 10 §15 p.66 window 2dp area rounding in solar_gains/internal_gains | **< 1e-4** |
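
As a rough illustration of the Slice 93 row above, the sketch below shows how an integer `glazing_type` might map to window transmission values, and what happens for a code that has no entry. Only the figures for codes 3 and 13 come from this handover; the dataclass, the dictionary name, the function `transmission_for` and the `None` fallback are assumptions made for illustration and are not the real mapper code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WindowTransmission:
    u_value: float  # W/m²K
    g_value: float  # solar transmittance


# Values for codes 3 and 13 are the ones quoted for Slice 93.
# The real table lives in the mapper; this copy is illustrative only.
GLAZING_TYPE_TO_TRANSMISSION = {
    3: WindowTransmission(u_value=2.8, g_value=0.76),
    13: WindowTransmission(u_value=1.4, g_value=0.72),
}


def transmission_for(glazing_type: int) -> Optional[WindowTransmission]:
    """Return explicit transmission details, or None for an unmapped code.

    None stands in for "let the cascade fall back to its default U",
    which is the behaviour described for cert 0240 (glazing_type=2).
    """
    return GLAZING_TYPE_TO_TRANSMISSION.get(glazing_type)


# Codes that would still hit the default-U path in this sketch.
unmapped = [code for code in (2, 3, 13) if transmission_for(code) is None]
print(unmapped)
```

Returning `None` rather than raising keeps unmapped codes visible to a caller that wants to log them, which may be useful when deciding which glazing codes to surface next.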
|
||||
|
||||
Fabric breakdown for cert 001479 API path is now COMPLETELY EXACT
|
||||
(all 6 components match worksheet to 4 d.p.):
|
||||
|
|
@ -160,20 +162,47 @@ Each new pair lands as a 1e-4 cascade-pin test. Pattern: ~3-5 new
|
|||
mapper bugs per cert pair (similar to Slice 87-94 on 001479). Each
|
||||
becomes its own slice. Stage by name; one slice = one commit.
|
||||
|
||||
### 4. Investigate goldens with shifted residuals after Slices 87-94
|
||||
### 4. Investigate goldens with shifted residuals after Slices 87-95
|
||||
|
||||
The Slice 87-94 fixes shifted residuals on 7 of 10 API-only golden
|
||||
certs. The new residuals are pinned. Outliers that need attention:
|
||||
Slices 87-94 shifted residuals on 7 of 10 API-only golden certs;
|
||||
Slice 95 (precise TFA + window 2dp area rounding) shifted 5 more
|
||||
(0240, 6035, 8135, 2130, 0390-2954). All residuals are re-pinned.
|
||||
Current outliers and what we now know:
|
||||
|
||||
- **0240** (-14): documented RR mapper gap (`'Roof room(s),
|
||||
insulated (assumed)'` description not parsed; Type-1 RR
|
||||
gable_wall_lengths not extracted)
|
||||
- **0390-2954** (-6): large detached, age F, oil — likely a heating
|
||||
efficiency cascade gap
|
||||
- **6035** (-6): mid-terrace age A — possibly party wall config or
|
||||
ventilation issue
|
||||
|
||||
These are tractable once you have a worksheet for any of them.
|
||||
- **0240** (-15 SAP, +17.8 PE): Detached age J + RR + 11 windows. The
|
||||
earlier handover claim of "RR mapper gap" is **partly stale**:
|
||||
- `room_in_roof_type_1.gable_wall_length_1/2` ARE extracted by the
|
||||
21.0.1 mapper (see mapper.py:1349-1369 — must have landed in
|
||||
Slices 71-86). Cert 0240's RR cascades through with floor_area=
|
||||
83.2, gables 6.4 + 6.4, age J → U_RR = 0.30 W/m²K.
|
||||
- `'Roof room(s), insulated (assumed)'` description NOT parsed —
|
||||
but the spec basis for parsing it is unclear: age J's Table 18
|
||||
col(4) default already models insulation (U=0.30), and unlike
|
||||
the regular-roof "insulated (assumed)" → 50 mm bucket rule
|
||||
(RdSAP §5.11.4), no equivalent rule for RR has been identified.
|
||||
- The -15 SAP residual is a mix, not a single RR gap. Subsystem
|
||||
breakdown for cert 0240 (via cert_to_inputs cascade):
|
||||
- walls 22.95, party 0, roof 76.93 (incl RR ~18.5), floor 29.43,
|
||||
windows 41.55, doors 11.10, bridging 39.64; total HLC 221.6 W/K
|
||||
- **windows_w_per_k = 41.55 is the most leverageable**: 11 windows
totalling 18.28 m² at U_default ≈ 2.27 W/m²K (18.28 × 2.27 ≈ 41.5).
Cert lodges `glazing_type=2` for all windows but Slice 93's
`_API_GLAZING_TYPE_TO_TRANSMISSION` only covers codes 3 and 13;
surfacing code 2 would land an explicit U (likely ~1.8-2.0)
and close several W/K of fabric loss.
|
||||
- Other potential gains: BP[0] non-RR ceiling lodges "Pitched,
|
||||
400+ mm loft insulation" (should U ~0.10); verify cascade
|
||||
gives it that.
|
||||
- **Net**: cert 0240 is not a single-slice fix; it's 3-5
|
||||
progressive mapper improvements (glazing_type 2 surfacing,
|
||||
possibly more glazing codes, possibly RR description nuance).
|
||||
- **0390-2954** (-6 SAP, -26.5 PE): large detached F (TFA 360), oil
|
||||
PCDB-listed. Undocumented. PE going more negative than SAP suggests
|
||||
the cost cascade is hitting harder than energy — possibly oil
|
||||
price/efficiency interaction.
|
||||
- **6035** (-6 SAP, +49.5 PE): mid-terrace age A + RR. Probably has
|
||||
the same glazing_type-default-U issue as 0240 plus an age-A-
|
||||
specific gap.
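
The cert 0240 figures above can be cross-checked with a few lines of arithmetic. This is a minimal sketch: the subsystem values are copied from the breakdown, while reading 18.28 m² as the combined area of all 11 windows is an assumption inferred from the numbers (41.55 / 18.28 ≈ 2.27), and the candidate U-values are the "likely ~1.8-2.0" range rather than confirmed figures.

```python
# Subsystem heat-loss figures for cert 0240, copied from the breakdown above (W/K).
subsystems_w_per_k = {
    "walls": 22.95,
    "party": 0.0,
    "roof": 76.93,
    "floor": 29.43,
    "windows": 41.55,
    "doors": 11.10,
    "bridging": 39.64,
}

total_hlc = sum(subsystems_w_per_k.values())

# Implied window U-value, assuming 18.28 m² is the combined area of all 11 windows.
window_area_m2 = 18.28
implied_u = subsystems_w_per_k["windows"] / window_area_m2

print(f"total HLC {total_hlc:.2f} W/K, implied window U {implied_u:.3f} W/m²K")

# Hypothetical what-if: window loss if glazing code 2 surfaced a U in the 1.8-2.0 range.
for candidate_u in (1.8, 2.0):
    windows_w_per_k = candidate_u * window_area_m2
    saving = subsystems_w_per_k["windows"] - windows_w_per_k
    print(f"U={candidate_u}: windows {windows_w_per_k:.2f} W/K, saving {saving:.2f} W/K")
```

Under these assumptions the saving is roughly 5 to 9 W/K out of a 221.6 W/K total, which supports treating glazing as one of several contributions to the residual.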
|
||||
|
||||
### 5. (deferred) Cohort chain test RED triage
|
||||
|
||||
|
|
@ -216,9 +245,10 @@ override field, (c) wait for more cert pairs to confirm pattern.
|
|||
worksheets (000474, 000477, 000480, 000487, 000490, 000516).
|
||||
- `sap worksheets/U985-0001-NNNNNN.txt` × 6 — text exports of above.
|
||||
|
||||
## Recent slice history (Slices 87-94, current branch)
|
||||
## Recent slice history (Slices 87-95, current branch)
|
||||
|
||||
```
|
||||
f502db8c Slice 95: API mapper TFA from per-bp dims + window area 2dp rounding — cert 001479 to 1e-4
|
||||
03203418 Slice 94: API mapper sheltered_sides + floor_type — cert 001479 to 1e-3
|
||||
7281b7b3 Slice 93: API mapper window_transmission_details from glazing_type
|
||||
8e752e57 Slice 92: API mapper floor dimensions (SAP +0.25m + exposed-floor + NI→None)
|
||||
|
|
@ -237,8 +267,8 @@ before this rewrite).
|
|||
|
||||
## First action
|
||||
|
||||
1. Confirm branch state matches `git log --oneline -1` →
|
||||
`03203418` Slice 94.
|
||||
1. Confirm branch state — Slice 95 (`f502db8c`) closed cert 001479 to
|
||||
< 1e-4 (was +0.0006 after Slice 94). Layer 4 is GREEN.
|
||||
2. Run the full sweep:
|
||||
```bash
|
||||
PYTHONPATH=/workspaces/model:/workspaces/model/packages/domain/src \
|
||||
|
|
@ -247,23 +277,25 @@ before this rewrite).
|
|||
packages/domain/src/domain/sap/rdsap/tests/test_golden_fixtures.py \
|
||||
--no-cov -q
|
||||
```
|
||||
Expect ~75 passed / ~16 failed. The 9 failures on
|
||||
`test_sap_result_pin[001479-*]` (cohort cascade for the hand-built
|
||||
skeleton) and 4 cohort chain RED + 3 cohort diff RED are
|
||||
pre-existing.
|
||||
3. Run the API → Summary diff probe (script in §1 above) to surface
|
||||
the remaining sub-1e-3 SAP gap. Likely candidates ranked by impact:
|
||||
- Infiltration (-2 ACH/yr) → check `ventilation_from_cert()`
|
||||
intermediate outputs for both paths
|
||||
- HW kWh (+6.7) → check shower outlet count + Appendix J §1a path
|
||||
- Internal gains (+25.7 W·months) → check pumps_fans + bulb counts
|
||||
Expect **99 passed / 19 failed**. All 19 failures pre-existing:
|
||||
9× hand-built 001479 skeleton (`test_sap_result_pin[001479-*]`),
|
||||
6× cohort diff (`test_from_elmhurst_site_notes_matches_hand_built_*`),
|
||||
4× cohort chain (000474/000480/000487/000490 — Elmhurst non-spec).
|
||||
3. Production goal is met for cert 001479. Next work focuses on the
|
||||
golden cert residual outliers (§4 above) and new (Summary + API)
|
||||
cert pairs from the user. The diff-probe methodology from Slice 95
|
||||
(cascade-component diff API vs Summary path; localise; fix mapper)
|
||||
works for any new (Summary + API) pair — worksheet not required
|
||||
when Summary path is established as canonical.
|
||||
4. Don't lose sight of Layer 4: **API → SAP within 1e-4 of worksheet
|
||||
continuous on cert 001479** is the production goal. Currently
|
||||
delta +0.0006.
|
||||
continuous on cert 001479** is the production goal. **MET as of
Slice 95**:
`test_api_001479_full_chain_sap_matches_worksheet_pdf_exactly`
formalises this gate.
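
The Layer 4 gate described in the list above amounts to an absolute-tolerance comparison against the worksheet's continuous SAP. The sketch below is not the body of the real test: the helper name `within_gate` is hypothetical, and the post-Slice-95 delta is a placeholder because the handover only states that it is below 1e-4.

```python
WORKSHEET_SAP = 69.0094  # continuous SAP from the cert 001479 worksheet
TOLERANCE = 1e-4


def within_gate(cascade_sap: float,
                worksheet_sap: float = WORKSHEET_SAP,
                tol: float = TOLERANCE) -> bool:
    """True when the cascade SAP is strictly within `tol` of the worksheet value."""
    return abs(cascade_sap - worksheet_sap) < tol


# Delta quoted in this handover after Slice 94: +0.0006.
after_slice_94 = WORKSHEET_SAP + 0.0006
# Placeholder delta for Slice 95; the exact value is not stated, only "< 1e-4".
after_slice_95 = WORKSHEET_SAP + 0.00005

print(within_gate(after_slice_94))  # False: 6e-4 exceeds the 1e-4 gate
print(within_gate(after_slice_95))  # True
```

A strict absolute tolerance is used here instead of a relative one because the target is a fixed worksheet figure, so the allowed error should not scale with the SAP value.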
|
||||
|
||||
Good luck. The user is sourcing more cert pairs in parallel; when
|
||||
they arrive, each one will surface 3-5 mapper bugs along the same
|
||||
pattern as Slices 87-94. The diagnostic methodology that worked here
|
||||
(diff Summary-mapper vs API-mapper; localise by cascade component;
|
||||
fix the API mapper to mirror the Summary's surfacing) will work
|
||||
again.
|
||||
The user is sourcing more cert pairs in parallel; when they arrive,
|
||||
each one will surface ~3-5 mapper bugs along the same pattern as
|
||||
Slices 87-95. The diagnostic methodology (diff Summary-mapper vs
|
||||
API-mapper; localise by cascade component; fix the API mapper to
|
||||
mirror the Summary's surfacing) works for any new (Summary + API)
|
||||
pair — worksheet not required when Summary path is canonical (cert
|
||||
001479 proves it is).
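
The diagnostic methodology described above (diff the two mapper paths, then localise by cascade component) could be sketched roughly as follows. The function name, the component keys and all of the numbers are made up for illustration; the real cascade outputs and their field names are not shown in this handover.

```python
def rank_component_gaps(summary: dict[str, float],
                        api: dict[str, float]) -> list[tuple[str, float]]:
    """Return (component, api - summary) pairs, largest absolute gap first."""
    shared = summary.keys() & api.keys()
    gaps = [(name, api[name] - summary[name]) for name in shared]
    return sorted(gaps, key=lambda pair: abs(pair[1]), reverse=True)


# Illustrative values only; not taken from any real cert.
summary_path = {"walls_w_per_k": 30.0, "windows_w_per_k": 25.0, "roof_w_per_k": 12.0}
api_path = {"walls_w_per_k": 30.0, "windows_w_per_k": 31.5, "roof_w_per_k": 11.0}

for name, gap in rank_component_gaps(summary_path, api_path):
    print(f"{name}: {gap:+.2f}")
```

Sorting by absolute gap puts the component most worth investigating first, which matches the "localise, then fix the mapper" order of work.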
|
||||
|
|
|
|||
|
|
@ -78,17 +78,21 @@ _EXPECTATIONS: tuple[_GoldenExpectation, ...] = (
|
|||
expected_pe_resid_kwh_per_m2=+17.8450,
|
||||
expected_co2_resid_tonnes_per_yr=+1.0097,
|
||||
notes=(
|
||||
"Detached house, TFA 202, age J, oil boiler, Table 4b code 130. "
|
||||
"API response lodges sap_room_in_roof.room_in_roof_type_1 with "
|
||||
"gable_wall_length_1/2 + 'Roof room(s), insulated (assumed)' "
|
||||
"description; our mapper doesn't yet extract these. Until it "
|
||||
"does, the Simplified Type 1 RR fallback at U_RR_default ages "
|
||||
"J = 0.30 W/m²K + ΣA_RR_gable/other = 0 over-counts the RR's "
|
||||
"real heat loss (the cert has retrofit insulation). Pre-RR-fix "
|
||||
"(commits b01164a2..1928e5a2) this cert coincidentally landed "
|
||||
"at Δ=0 because RR contribution was missing entirely. Returns "
|
||||
"to Δ≈0 once the mapper extracts gable lengths + parses the "
|
||||
"description's '50mm retrofit' signal (handover ticket)."
|
||||
"Detached house, TFA 118, age J, oil boiler PCDB-listed + PV + "
|
||||
"RR on BP[0]. Mapper DOES extract sap_room_in_roof.room_in_roof_"
|
||||
"type_1.gable_wall_length_1/2 (mapper.py:1349) and applies "
|
||||
"U_RR_J=0.30 via u_rr_default_all_elements — the earlier "
|
||||
"handover claim of 'gable_wall_lengths not extracted' is stale. "
|
||||
"Subsystem diff against the cascade: walls 22.95 / roof 76.93 / "
|
||||
"floor 29.43 / windows 41.55 / doors 11.10 / bridging 39.64 "
|
||||
"(total HLC 221.6 W/K). Biggest leverage is windows: 11 windows "
|
||||
"× 18.28 m² × U_default≈2.27 because cert lodges glazing_type=2 "
|
||||
"and Slice 93's _API_GLAZING_TYPE_TO_TRANSMISSION only covers "
|
||||
"codes 3 and 13. Surfacing code 2 → measurable U≈1.8-2.0 would "
|
||||
"close several W/K. Other candidates: BP[0] non-RR ceiling lodges "
|
||||
"'Pitched, 400+ mm loft insulation' — verify cascade U; possibly "
|
||||
"RR description-implied insulation nuance (spec basis unclear "
|
||||
"for RR — unlike regular roofs which have the §5.11.4 50mm rule)."
|
||||
),
|
||||
),
|
||||
_GoldenExpectation(
|
||||
|
|
|
|||