mirror of
https://github.com/Hestia-Homes/Model.git
synced 2026-06-08 11:17:27 +00:00
docs: handover for cohort-2 closure + precision-floor next steps
Captures 5 slices shipped this session (S0380.21..25):
- Table 3a rows 1+4 + PCDB keep-hot dispatch
- Per-BP roof exposure (Ext1 flat roof on flats)
- RdSAP §11.1 b) % of roof area PV synthesis
- SAP code 631 → house coal secondary fuel
- SAP codes 2111/2113 → control type 2
Cohort-2 outcome: 22/38 exact (<1e-4), max residual ±0.55 SAP,
0 RAISES, 0 big-gaps. All structural cascade gaps closed.
Open threads diagnosed in detail:
1. Cert 7700 -0.44 SAP — wall U code conflict
(_WALL_INSULATION_NONE=4 vs Elmhurst "As Built"=4). Wider than
a single slice; needs regression testing.
2. Cert 9796 +0.55 SAP — MIT precision floor (Mid-Terrace
bungalow + HP, +0.06°C across all months). Same mechanism as
cohort-1 HP-COP residuals.
3. API-path closure for all 38 certs (deferred).
4. Tighten cohort-1 chain tests to 1e-4 once thread 2 closes.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
parent
474052d303
commit
9547fa1f5f
1 changed files with 313 additions and 0 deletions
|
|
@ -0,0 +1,313 @@
|
|||
# Handover — cohort-2 closure (5 slices shipped) + precision-floor next steps
|
||||
|
||||
Branch `feature/per-cert-mapper-validation`. This session shipped
|
||||
**5 slices** (S0380.21 → S0380.25) closing the bulk of the cohort-2
|
||||
residuals. All RAISES are gone, all ±5+ big-gaps closed. Picks up
|
||||
from `HANDOVER_TABLE_3A_NO_KEEP_HOT.md`.
|
||||
|
||||
**HEAD at handover start:** `36a3219d` (Slice S0380.25: SAP codes
|
||||
2111/2113 are control type 2, not type 3 — closes certs 0652 + 6835).
|
||||
|
||||
## User's stated goal (carried forward verbatim)
|
||||
|
||||
> I've added some more test cases, in the same format, in here:
|
||||
> `sap worksheets/additional with api 2`
|
||||
> We should check that the Elmhurst mapping works and then the api
|
||||
|
||||
Target: **1e-4 across the board** for every cert per
|
||||
[[feedback-one-e-minus-4-across-the-board]] — HPs included.
|
||||
|
||||
API-path closure (cohort-2 API JSON fetch + chain tests + cross-mapper
|
||||
EPC parity) is **still deferred** — Summary path is shippable and
|
||||
well-instrumented; the API path is fetchable but not yet mirrored.
|
||||
|
||||
## Slices shipped this session
|
||||
|
||||
| Slice | Commit | What |
|
||||
|---|---|---|
|
||||
| S0380.21 | `0d3fb980` | Table 3a row 1 + row 4 + PCDB keep-hot dispatch. Closes 9 of 11 cohort-2 RAISES exactly. Re-adds cert `0390-2954-3640-2196-4175` to the golden cohort. |
|
||||
| S0380.22 | `1a25ea67` | Per-BP roof exposure — `roof_construction_type` containing "another dwelling above" suppresses that BP's roof regardless of dwelling-level flag. Closes cert `0036-6325-1100-0063-1226` Ext1 flat roof (+0.30 → -6e-6). |
|
||||
| S0380.23 | `8dee1918` | RdSAP 10 §11.1 b) "% of roof area" PV synthesis — kWp = 0.12 × roof_area_for_heat_loss × pct / cos(35° for pitched). Closes cert `6835-3920-2509-0933-5226` -13.37 → +0.72. |
|
||||
| S0380.24 | `c145953f` | SAP code 631 ("Open fire in grate") → house coal secondary fuel (Table 12 code 11, 3.67 p/kWh). Closes cert `2102-3018-0205-7886-5204` -15.81 → +5e-5. Also narrows gas range to 601-613 per spec. |
|
||||
| S0380.25 | `36a3219d` | SAP codes 2111 ("TRVs and bypass") and 2113 ("Room thermostat and TRVs") are **control type 2** per SAP 10.2 spec page 171 Table 4e, not type 3. Closes certs `0652-3022-1205-2826-1200` (+1.93 → -1e-5) and `6835-3920-2509-0933-5226` (+0.72 → +0.015). |
|
||||
|
||||
All on branch `feature/per-cert-mapper-validation`. Each slice
|
||||
includes unit tests, pyright net-zero on touched files.
|
||||
|
||||
## Cohort-2 distribution at HEAD
|
||||
|
||||
Cohort-2 (38-cert dataset) Summary-path probe:
|
||||
|
||||
| Bucket (\|Δ\|) | Pre-session | Now | Δ |
|
||||
|---|---|---|---|
|
||||
| exact (<1e-4) | 10 | **22** | **+12** |
|
||||
| 1e-4..0.07 | 13 | **14** | +1 |
|
||||
| 0.07..0.5 | 2 | **1** | -1 |
|
||||
| 0.5..1 | 1 | **1** | = |
|
||||
| 1..5 | 0 | **0** | = |
|
||||
| >5 | 1 | **0** | -1 |
|
||||
| **RAISES (PCDB)** | 11 | **0** | **-11** |
|
||||
|
||||
Cohort-1 (7-ASHP + 2 newer) untouched: all still at ±0.04 SAP. No
|
||||
regressions from any slice.
|
||||
|
||||
## ★ Open threads with diagnoses (priority order)
|
||||
|
||||
### 1. Cert 7700-3362-0922-7022-3563 (-0.44 SAP, gas PCDF 17741)
|
||||
|
||||
**Diagnosed root cause — code conflict:**
|
||||
|
||||
`heat_transmission.py:88` defines `_WALL_INSULATION_NONE = 4` —
|
||||
heat_transmission treats `wall_insulation_type = 4` as "no insulation
|
||||
present" (cascade routes through `u_wall` uninsulated branch).
|
||||
|
||||
But `mapper.py:2064-2073` maps Elmhurst `"A As Built"` insulation code
|
||||
to SAP10 enum value **4** ("As built / assumed (default cascade)") —
|
||||
the mapper's intent is "use cascade defaults for age-band +
|
||||
construction" (which for an OLD cavity wall means uninsulated → U=1.50
|
||||
age C). The two interpretations happen to agree for cavity walls but
|
||||
disagree for solid + other constructions.
|
||||
|
||||
For cert 7700's alt wall (cavity + "As Built"):
|
||||
- Mapper sets `wall_insulation_type = 4` (intent: use defaults)
|
||||
- Cascade interprets 4 as "no insulation" → `u_wall` returns 1.50
|
||||
- Worksheet uses U=1.20 for the same wall (Table 16 cavity intermediate
|
||||
thickness OR an Elmhurst-specific midpoint)
|
||||
|
||||
Cascade walls = 75.62 W/K; worksheet (29a) sum = 71.29 W/K; Δ +4.33.
|
||||
That's almost the entire fabric (33) gap (148.72 - 144.38 = +4.34).
|
||||
And the entire +0.44 SAP residual.
|
||||
|
||||
**Why this is wider than a single slice:**
|
||||
|
||||
`_WALL_INSULATION_NONE = 4` is also used at line 568 for the MAIN BP
|
||||
walls path (not just alt). Changing the enum mapping touches both the
|
||||
main + alt wall paths. Cohort-1 + cohort-2 certs may rely on the
|
||||
current behavior (e.g. cert 0036 closes exactly with the current
|
||||
mapping, so its main wall + alt wall both happen to fall in the
|
||||
right branches).
|
||||
|
||||
**Suggested approach:**
|
||||
- Audit Table 6 / Table 16 for cavity walls — what's the spec-correct
|
||||
U for "As Built, age C, no measured thickness"? Worksheet's 1.20
|
||||
isn't an obvious Table 16 row.
|
||||
- Consider adding a separate `is_as_built: bool` flag on
|
||||
`SapAlternativeWall` rather than overloading
|
||||
`wall_insulation_type=4` for two meanings.
|
||||
- Or: rename the constant to `_WALL_INSULATION_AS_BUILT = 4` and
|
||||
verify cohort 1 + cohort 2 regressions.
|
||||
- Cert 7700's main wall U (cascade 0.53 vs worksheet 0.70) is ALSO
|
||||
off — same root cause likely.
|
||||
|
||||
### 2. Cert 9796-3058-6205-0346-9200 (+0.55 SAP, ASHP PCDF 104568)
|
||||
|
||||
**Diagnosed — no single bug:**
|
||||
|
||||
Cascade matches worksheet exactly on:
|
||||
- Fabric heat loss (33) = 62.03 W/K ✓
|
||||
- Ventilation (38) = 47.87 W/K Jan ✓
|
||||
- Internal gains (73) = 429.85 W Jan ✓ (full cert_to_inputs path)
|
||||
- Solar gains (83) = 65.44 W Jan ✓
|
||||
- PV generation = 1493.88 vs worksheet 1492.33 (Δ <0.1%)
|
||||
|
||||
But MIT (92) Jan: cascade **18.51** vs worksheet **18.45** → Δ
|
||||
+0.06°C. Consistent +0.05..+0.09°C offset across all months.
|
||||
|
||||
This is the "Appendix N3.6 PSR-precision floor" residual the older
|
||||
handover described — except the user rejects that framing per
|
||||
[[feedback-one-e-minus-4-across-the-board]]. Cohort-1 ASHP certs hit
|
||||
+0.001..+0.04 SAP with similar mechanism; cert 9796 is at +0.55.
|
||||
|
||||
**Why cert 9796 is an outlier:**
|
||||
|
||||
It's the only **Mid-Terrace bungalow** with PCDF 104568 in the cohort.
|
||||
Other PCDF 104568 certs (4800, 2800, 3336) are End-Terrace bungalows
|
||||
and close to <0.04 SAP. Possibly the residual scales with party-wall
|
||||
count or some interaction with extended-heating allocation. Worth
|
||||
checking whether the cascade's `_zone_mean_temp_with_per_zone_eta` η
|
||||
calculation drifts at this particular HLC/PSR/storey combination.
|
||||
|
||||
**Suggested next step:** Pin η for cert 9796 line-by-line against
|
||||
worksheet (86)/(89) — η_living + η_elsewhere — and trace where the
|
||||
~0.005 difference enters.
|
||||
|
||||
### 3. HP-COP residual on 10 triple-glazed HP certs (+0.001..+0.04 SAP)
|
||||
|
||||
Same precision-floor mechanism as cert 9796 but smaller. Cohort-1 ASHP
|
||||
chain tests are currently pinned at `_ASHP_COHORT_CHAIN_TOLERANCE
|
||||
= 0.07`. Tightening to 1e-4 requires closing the MIT precision floor.
|
||||
|
||||
**Suggested approach:** Once cert 9796 root cause is found, the same
|
||||
fix likely tightens these.
|
||||
|
||||
### 4. API-path closure for all 38 cohort-2 certs
|
||||
|
||||
User's longstanding goal. Process:
|
||||
1. Fetch + persist JSON via `EpcClientService._fetch_certificate` (token in
|
||||
`backend/.env` as `OPEN_EPC_API_TOKEN`).
|
||||
2. Mirror Summary chain tests on the API path
|
||||
(`backend/documents_parser/tests/test_summary_pdf_mapper_chain.py`
|
||||
pattern).
|
||||
3. Cross-mapper EPC parity (Summary EPC ≡ API EPC for load-bearing
|
||||
fields) — user's longstanding north star.
|
||||
|
||||
### 5. Tighten cohort-1 ASHP chain tests to 1e-4
|
||||
|
||||
Once thread 3 closes, drop the ±0.07 tolerance pin in
|
||||
`backend/documents_parser/tests/test_summary_pdf_mapper_chain.py
|
||||
::_ASHP_COHORT_CHAIN_TOLERANCE`.
|
||||
|
||||
## Methodology — preserved conventions
|
||||
|
||||
Carried forward unchanged from prior sessions:
|
||||
|
||||
- **1e-4 across the board** ([[feedback-one-e-minus-4-across-the-board]])
|
||||
— HP certs target the same precision as boilers; reject any
|
||||
"calculator precision floor" framing.
|
||||
- **Worksheet, not API, is the target** ([[feedback-worksheet-not-api-reference]]).
|
||||
- **One slice = one commit; stage by name** ([[feedback-commit-per-slice]]).
|
||||
- **AAA test convention** with literal `# Arrange / # Act / # Assert`
|
||||
([[feedback-aaa-test-convention]]).
|
||||
- **`abs(diff) <= tol`** not `pytest.approx` ([[feedback-abs-diff-over-pytest-approx]]).
|
||||
- **Spec citation in commit messages** ([[feedback-spec-citation-in-commits]]).
|
||||
- **Strict-enum raises on unmapped labels / unresolved cascade dispatch**
|
||||
(Slices S0380.15, S0380.17, S0380.20 established the pattern).
|
||||
- **Pyright net-zero per file**.
|
||||
|
||||
## Test baseline at HEAD
|
||||
|
||||
```bash
|
||||
PYTHONPATH=/workspaces/model python -m pytest \
|
||||
backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
|
||||
backend/documents_parser/tests/test_elmhurst_extractor.py \
|
||||
backend/documents_parser/tests/test_elmhurst_end_to_end.py \
|
||||
domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
|
||||
domain/sap10_calculator/worksheet/tests/test_water_heating.py \
|
||||
domain/sap10_calculator/worksheet/tests/test_mean_internal_temperature.py \
|
||||
domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py \
|
||||
domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
|
||||
domain/sap10_calculator/tests/test_pcdb_table_362_lookup.py \
|
||||
domain/sap10_ml/tests/test_rdsap_uvalues.py \
|
||||
datatypes/epc/schema/tests/test_schema_loading.py \
|
||||
--no-cov -q
|
||||
```
|
||||
|
||||
Expected: **704 pass + 10 pre-existing fails** (9 × cert 001479 Layer 1
|
||||
hand-built skeleton + 1 × pre-existing FEE round-trip).
|
||||
|
||||
Pyright per-file baselines (touched files; net-zero on each):
|
||||
- `datatypes/epc/domain/mapper.py`: 32
|
||||
- `datatypes/epc/surveys/elmhurst_site_notes.py`: 0
|
||||
- `backend/documents_parser/elmhurst_extractor.py`: 0
|
||||
- `backend/documents_parser/tests/test_summary_pdf_mapper_chain.py`: 0
|
||||
- `domain/sap10_calculator/rdsap/cert_to_inputs.py`: 35
|
||||
- `domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py`: 13
|
||||
- `domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py`: 1
|
||||
- `domain/sap10_calculator/worksheet/water_heating.py`: 1
|
||||
- `domain/sap10_calculator/worksheet/heat_transmission.py`: 13
|
||||
- `domain/sap10_calculator/worksheet/tests/test_water_heating.py`: 94
|
||||
- `domain/sap10_calculator/worksheet/tests/test_heat_transmission.py`: 71
|
||||
|
||||
## Diagnostic probe script (carried forward from prior handover)
|
||||
|
||||
```bash
|
||||
PYTHONPATH=/workspaces/model python <<'PY'
|
||||
import re, subprocess
|
||||
from collections import defaultdict
|
||||
from pathlib import Path
|
||||
from backend.documents_parser.tests.test_summary_pdf_mapper_chain import _summary_pdf_to_textract_style_pages
|
||||
from backend.documents_parser.elmhurst_extractor import ElmhurstSiteNotesExtractor
|
||||
from datatypes.epc.domain.mapper import EpcPropertyDataMapper, UnmappedElmhurstLabel
|
||||
from domain.sap10_calculator.rdsap.cert_to_inputs import (
|
||||
cert_to_inputs, SAP_10_2_SPEC_PRICES, UnresolvedPcdbCombiLoss,
|
||||
)
|
||||
from domain.sap10_calculator.calculator import calculate_sap_from_inputs
|
||||
|
||||
src_root = Path('/workspaces/model/sap worksheets/additional with api 2')
|
||||
buckets = defaultdict(list)
|
||||
def bucket(d):
|
||||
a = abs(d)
|
||||
if a < 1e-4: return "exact"
|
||||
if a < 0.07: return "<=0.07"
|
||||
if a < 0.5: return "0.07..0.5"
|
||||
if a < 1: return "0.5..1"
|
||||
if a < 5: return "1..5"
|
||||
return "5+"
|
||||
for cd in sorted(src_root.iterdir()):
|
||||
if not cd.is_dir() or cd.name.startswith('.'): continue
|
||||
sp = next(cd.glob("Summary_*.pdf"), None)
|
||||
ws_pdf = next(cd.glob("dr87-*.pdf"), None)
|
||||
if not (sp and ws_pdf): continue
|
||||
out = subprocess.run(["pdftotext", str(ws_pdf), "-"], capture_output=True, text=True).stdout
|
||||
m = re.search(r"SAP value\s*\n?\s*([\d.]+)", out)
|
||||
ws_sap = float(m.group(1)) if m else None
|
||||
try:
|
||||
sn = ElmhurstSiteNotesExtractor(_summary_pdf_to_textract_style_pages(sp)).extract()
|
||||
epc = EpcPropertyDataMapper.from_elmhurst_site_notes(sn)
|
||||
r = calculate_sap_from_inputs(cert_to_inputs(epc, prices=SAP_10_2_SPEC_PRICES))
|
||||
d = r.sap_score_continuous - ws_sap
|
||||
buckets[bucket(d)].append((cd.name, d))
|
||||
except UnresolvedPcdbCombiLoss as e:
|
||||
buckets["RAISES (Pcdb)"].append((cd.name, e.pcdf_index))
|
||||
except UnmappedElmhurstLabel as e:
|
||||
buckets["RAISES (Elm)"].append((cd.name, str(e)))
|
||||
|
||||
for b in ("exact", "<=0.07", "0.07..0.5", "0.5..1", "1..5", "5+", "RAISES (Pcdb)", "RAISES (Elm)"):
|
||||
if b in buckets:
|
||||
print(f"\n[{b}] {len(buckets[b])}:")
|
||||
for c, d in buckets[b]:
|
||||
print(f" {c} {d}")
|
||||
PY
|
||||
```
|
||||
|
||||
Mirror against `/workspaces/model/sap worksheets/Additional data with api`
|
||||
for cohort-1 cross-checks.
|
||||
|
||||
## Memory references
|
||||
|
||||
Cross-session memories load automatically. Key ones for this work:
|
||||
|
||||
- [[feedback-one-e-minus-4-across-the-board]] — user target is 1e-4 for HPs too.
|
||||
- [[project-instantaneous-shower-cascade-gap]] — closed by S0380.21.
|
||||
- [[project-summary-path-cohort-closure]] — original 7-cert ASHP cohort context.
|
||||
- [[feedback-worksheet-not-api-reference]] — Summary path pins to worksheet, not API.
|
||||
- [[feedback-cascade-pin-methodology]] — test the actual cascade against PDF line refs.
|
||||
- [[reference-sap10-spec-docs]] — full BRE technical paper set at
|
||||
`domain/sap10_calculator/docs/specs/`.
|
||||
- [[feedback-commit-per-slice]] / [[feedback-aaa-test-convention]] /
|
||||
[[feedback-abs-diff-over-pytest-approx]] / [[feedback-spec-citation-in-commits]] /
|
||||
[[feedback-worksheet-shape-fidelity]] / [[feedback-zero-error-strict]] —
|
||||
slicing + test conventions.
|
||||
|
||||
## First concrete actions for next agent
|
||||
|
||||
1. **Re-run the diagnostic probe** to confirm baseline reproduces
|
||||
(22 exact + 14 ≤±0.07 + 1 ±0.07..0.5 + 1 ±0.5..1 + 0 RAISES).
|
||||
|
||||
2. **Investigate cert 7700 wall-U code conflict** (thread 1).
|
||||
Concrete steps:
|
||||
- Read `heat_transmission.py:80-95` (constant block) +
|
||||
`heat_transmission.py:560-580` (main wall path) +
|
||||
`heat_transmission.py:878-905` (`_alt_wall_w_per_k`).
|
||||
- Read `mapper.py:2064-2073` (insulation enum) +
|
||||
`mapper.py:2866-2887` (`_map_elmhurst_alternative_wall`).
|
||||
- Probe the worksheet's U=1.20 for cert 7700 alt wall against
|
||||
RdSAP 10 spec Table 16 (cavity walls) — figure out which row
|
||||
matches and why the cascade picks 1.50.
|
||||
- Probe cert 7700 main wall U=0.70 (cascade) vs worksheet 0.70 — does
|
||||
the main path have a similar precision issue?
|
||||
- **Critically**: run the full diagnostic probe with any proposed
|
||||
fix to confirm cohort-1 + the 22 exact cohort-2 certs don't
|
||||
regress.
|
||||
|
||||
3. **Investigate cert 9796 MIT precision residual** (thread 2). Likely
|
||||
needs line-by-line η pinning at the Mid-Terrace-bungalow scale.
|
||||
|
||||
4. **API path** — fetch + persist the 38-cert JSON via
|
||||
`EpcClientService._fetch_certificate`. Pattern follows
|
||||
`domain/sap10_calculator/rdsap/tests/fixtures/golden/*.json`. Token
|
||||
in `backend/.env` as `OPEN_EPC_API_TOKEN`.
|
||||
|
||||
Good luck. The Summary-path cohort is in very strong shape (22/38
|
||||
exact; max residual ±0.55 SAP). The remaining residuals are
|
||||
precision-floor concerns rather than structural cascade bugs.
|
||||
Loading…
Add table
Reference in a new issue