diff --git a/domain/sap10_calculator/docs/HANDOVER_API_PATH_CLOSURE.md b/domain/sap10_calculator/docs/HANDOVER_API_PATH_CLOSURE.md
new file mode 100644
index 00000000..56cca7fd
--- /dev/null
+++ b/domain/sap10_calculator/docs/HANDOVER_API_PATH_CLOSURE.md
@@ -0,0 +1,488 @@
+# Handover — API-path closure for cohort-2 + golden-residuals → ~0
+
+Branch `feature/per-cert-mapper-validation`. This session shipped
+**8 slices** (S0380.31 → S0380.38) that closed the **entire cohort-2
+Summary-path cluster** and the last cohort-1 ASHP residual (cert 2636
+cantilever). The branch is now at **712 pass + 0 fail** — down from
+710 + 10 at the start of the session.
+
+**HEAD at handover start:** `883d66ac` (Slice S0380.38).
+
+## User's stated goal for the next phase (carried forward verbatim)
+
+> I want to dive into thread 4. Given the wealth of knowledge built up,
+> could you update the docs in prep for a handover to a new agent and
+> provide me with a prompt.
+>
+> For the API → EpcPropertyData → SAP calculator, I wonder if we can
+> tackle it in bigger slices since we can try and build equivalence by
+> doing API → EpcPropertyData = EpcPropertyData ← Elmhurst Site notes
+> and use the SAP calculator as a be all end all check which must pass
+> to validate the response.
+>
+> I also wonder if we can tackle bigger slices as well. A final note —
+> our golden tests have residuals much too high. We need them to be
+> basically zero.
+
+Three explicit directives:
+
+1. **Cross-mapper parity is the validation strategy.** For every cert
+   that has BOTH an Elmhurst Summary PDF and a GOV.UK EPB API JSON,
+   `from_api_response(json)` and `from_elmhurst_site_notes(summary)`
+   must produce EpcPropertyData that cascade to the same SAP at 1e-4.
+   The SAP cascade is the load-bearing equivalence check.
+
+2. **Bigger slices are now appropriate.** Per-cert-at-a-time was the
+   right cadence for residual-closing work where each cert had a
+   distinct bug.
+   The API-path closure is more uniform — fetch JSON,
+   parametrize tests, run cohort sweep, identify any failures. A
+   "fetch + parametrize all 38 cohort-2 certs" can land in one or two
+   slices.
+
+3. **Golden test residuals must drop to ~0.** [test_golden_fixtures.py](../rdsap/tests/test_golden_fixtures.py)
+   currently pins residuals like cert 0240 PE +12.49 / CO2 +0.70, cert
+   2225 PE -11.77 / CO2 +0.26, cert 2636 PE -9.65 / CO2 +0.22, etc.
+   These are mostly **mapper-coverage gaps** that the chain-test work
+   never touched — the pinned residual ≠ 0 is a real bug. Each cert
+   that closes its mapper gap should drop the residual into the ~1e-2
+   range or tighter.
+
+## Slices shipped this session (handover-doc → HEAD)
+
+| Slice | Commit | Closes | Spec citation |
+|---|---|---|---|
+| **S0380.31** | `86226ebd` | Cert 2636 cantilever -0.015 → -2.4e-6 (both paths) | SAP 10.2 Appendix K eqn (K2) p.84 — (31) is NET external area; alt-wall window opening must deduct |
+| **S0380.32** | `396907f4` | Cert 9380 +0.027 → -4.8e-6 | RdSAP10 §3 p.17 — per-BP window allocation; bare "Extension" routes to BP[1] |
+| **S0380.33** | `2c3eb17b` | Cert 6835 +0.015 → -4.3e-5 | RdSAP10 §15 p.66 — kWp for PV at 2 d.p. |
+| **S0380.34** | `a92a33a8` | Cert 2536 +0.0007 → -9e-8 | RdSAP10 §15 p.66 — living area at 2 d.p. (Decimal HALF_UP) |
+| **S0380.35** | `d61a27e0` | Certs 2800 + 4800 +0.0007 → <3e-5 | RdSAP10 §15 p.66 — gross/party wall areas at 2 d.p. (Decimal HALF_UP) |
+| **S0380.36** | `b0919e8d` | Tighten `_ASHP_COHORT_CHAIN_TOLERANCE` 0.04 → 1e-4 | (test-infra) cohort now ≤5e-5 on both paths |
+| **S0380.37** | `1cea73df` | Drop cert 001479 hand-built fixture | Production-path chain tests cover it strictly stronger at 1e-4 |
+| **S0380.38** | `883d66ac` | Loosen FEE round-trip tolerance 1e-9 → 1e-6 | (test-infra) two summation paths drift ~8e-8; invariant still fires loud at 1e-6 |
+
+All on branch `feature/per-cert-mapper-validation`. Each includes unit
+tests, pyright net-zero per touched file.
+
+## Lesson learned: RdSAP10 §15 Decimal HALF_UP boundaries
+
+Three of the five residual-closing slices (S0380.33 / S0380.34 /
+S0380.35) were the same class of bug: **a float-arithmetic 0.005
+boundary case dropping the product BELOW the spec's HALF_UP threshold.**
+
+```python
+# Float arithmetic loses precision at the .005 boundary
+>>> 0.30 * 45.65
+13.694999999999999       # cert 2536 living-area: drops to 13.69
+>>> 21.25 * 2.30
+48.87499999999999        # cert 2800 gross-wall: drops to 48.87
+>>> 0.12 * 18.0186
+2.16224                  # cert 6835 PV kWp: tail to 5 d.p.
+
+# Decimal arithmetic matches the spec
+>>> from decimal import Decimal, ROUND_HALF_UP
+>>> Decimal("0.30") * Decimal("45.65")
+Decimal('13.6950')       # → 13.70 HALF_UP at 2 d.p. ✓
+>>> Decimal("21.25") * Decimal("2.30")
+Decimal('48.8750')       # → 48.88 HALF_UP at 2 d.p. ✓
+```
+
+RdSAP10 §15 p.66 enumerates the 2-d.p. rule: U-values, gross element
+areas, internal floor areas, living area, storey heights, kWp. **Any
+future +0.0007-ish residual that traces to an area or kWp** is the
+same bug — use the [`_decimal_round_half_up_sum`](../worksheet/heat_transmission.py)
+helper or inline Decimal arithmetic.
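
The HALF_UP rule above can be sketched as a small helper. This is a minimal illustration only: `round_half_up_2dp` and `TWO_DP` are hypothetical names, not the repo's `_decimal_round_half_up_sum` helper (whose signature is not shown in this document). It assumes the factors arrive as decimal strings, so binary-float error never enters the product.

```python
from decimal import Decimal, ROUND_HALF_UP

TWO_DP = Decimal("0.01")  # quantum for the 2-d.p. rule


def round_half_up_2dp(*factors: str) -> Decimal:
    """Multiply decimal-string factors exactly, then round HALF_UP to 2 d.p."""
    product = Decimal("1")
    for factor in factors:
        # Build each Decimal from its string form, never from a float.
        product *= Decimal(factor)
    return product.quantize(TWO_DP, rounding=ROUND_HALF_UP)


# The two boundary cases quoted above (cert 2536 living area, cert 2800 gross wall).
living_area = round_half_up_2dp("0.30", "45.65")
gross_wall = round_half_up_2dp("21.25", "2.30")
print(living_area, gross_wall)
```

Constructing a `Decimal` from an existing float would carry the float's error along with it, which is why the sketch takes strings.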
+
+## Cohort distributions at HEAD `883d66ac`
+
+### Cohort-2 (38-cert dataset, Summary path)
+
+| Bucket (\|Δ\|) | Session start | Now | Δ |
+|---|---|---|---|
+| exact (<1e-4) | 33 | **38** | **+5** |
+| 1e-4..0.07 | 5 | **0** | -5 |
+| 0.07..0.5 | 0 | **0** | = |
+| 0.5..1 | 0 | **0** | = |
+| 1..5 | 0 | **0** | = |
+| >5 | 0 | **0** | = |
+| RAISES | 0 | **0** | = |
+
+### Cohort-1 ASHP cohort (9-cert dataset, Summary + API paths)
+
+All 9 certs hit < 1e-4 at HEAD on every path that has its own fixture
+(9/9 Summary, 7/9 API):
+
+| Cert | Summary Δ | API Δ |
+|---|---|---|
+| 0330 | -1.1e-5 | (same fixture as 0380 in current tests) |
+| 0350 | +2.2e-5 | +2.2e-5 |
+| 0380 | +1.0e-6 | +9.7e-7 |
+| 2225 | -4.8e-5 | -4.8e-5 (cohort worst residual) |
+| 2636 | -2.4e-6 | -2.4e-6 (closed by S0380.31, was -0.015) |
+| 3800 | -2.0e-5 | -2.0e-5 |
+| 9285 | -3.4e-5 | -3.4e-5 |
+| 9418 | -3.6e-7 | -3.6e-7 |
+| 9501 | -3.9e-5 | (no API fixture in tests) |
+
+`_ASHP_COHORT_CHAIN_TOLERANCE` is now **1e-4** (was 0.04 at session
+start, set in S0380.29 to size for the closed +0.03..+0.06 cluster).
+
+## ★ Thread 4: API-path closure for cohort-2 — concrete plan
+
+The user wants **cross-mapper parity** as the validation primitive:
+
+```
+                       API JSON ─────► from_api_response ─────► EpcPropertyData_A
+                                                                        │
+                                                                        ▼
+                                                                 cert_to_inputs ─► calc
+                                                                        │
+                                                                        ▼
+                                                              sap_score_continuous ≈ worksheet
+                                                                        │ (1e-4)
+Summary PDF ─► ElmhurstExtractor ─► from_elmhurst_site_notes ─► EpcPropertyData_B
+                                                                        │
+                                                                        ▼
+                                                                 cert_to_inputs ─► calc
+                                                                        │
+                                                                        ▼
+                                                              sap_score_continuous ≈ worksheet
+                                                                        │ (1e-4)
+```
+
+If both paths hit 1e-4 vs the worksheet, the **SAP cascade attests that
+the two EpcPropertyData instances are cascade-output-equivalent** for
+load-bearing fields. This is strictly stronger than a structural
+EpcPropertyData diff (which would fail noisily on
+cosmetic-but-cascade-irrelevant differences like ordering or unused fields).
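
One way to picture the parity check is the sketch below. It is an illustration under stated assumptions, not code from the repo: `cascade_parity`, `CHAIN_TOLERANCE`, and the worksheet value `70.0` are hypothetical. It checks each path against the worksheet and the two paths against each other, since two values that are each within 1e-4 of the worksheet can still be up to 2e-4 apart.

```python
CHAIN_TOLERANCE = 1e-4  # same 1e-4 bar the chain tests use


def cascade_parity(sap_api: float, sap_summary: float, ws_sap: float,
                   tol: float = CHAIN_TOLERANCE) -> dict[str, bool]:
    """Report each pairwise check separately so a failure names its pair."""
    return {
        "api_vs_worksheet": abs(sap_api - ws_sap) <= tol,
        "summary_vs_worksheet": abs(sap_summary - ws_sap) <= tol,
        "api_vs_summary": abs(sap_api - sap_summary) <= tol,
    }


# Illustrative worksheet SAP (not a real cert value).
WS_SAP = 70.0

# Both mappers land on the same side of the worksheet: all three checks hold.
same_side = cascade_parity(WS_SAP - 4.8e-5, WS_SAP - 4.8e-5, WS_SAP)

# Each mapper is within 1e-4 of the worksheet but on opposite sides, so the
# two mappers are ~1.8e-4 apart and the mapper-vs-mapper check fails.
opposite_sides = cascade_parity(WS_SAP + 9e-5, WS_SAP - 9e-5, WS_SAP)
print(same_side, opposite_sides)
```

Whether the mapper-vs-mapper comparison needs its own assertion, or is considered covered by the two worksheet comparisons, is a design choice this sketch leaves open.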
+
+### Suggested slice plan (the user explicitly authorised bigger slices)
+
+**Slice A — Bulk-fetch the 38 cohort-2 API JSONs (one slice)**
+
+Script: write a one-off `scripts/fetch_cohort2_api_jsons.py` that:
+- Reads `OPEN_EPC_API_TOKEN` from `backend/.env`
+- For each of the 38 cert refs in `sap worksheets/additional with api 2/`,
+  calls `EpcClientService._fetch_certificate(cert_num)` and persists
+  the JSON to `domain/sap10_calculator/rdsap/tests/fixtures/golden/.json`
+- Skips certs whose JSON already exists (cohort-1 + earlier golden fixtures)
+
+Stage + commit the 38 new JSON fixtures in one go. The script itself
+can be a throwaway (not part of the test suite).
+
+**Slice B — Parametrized cohort-2 API-path chain test (one slice)**
+
+Add ONE parametrized test in [test_summary_pdf_mapper_chain.py](../../backend/documents_parser/tests/test_summary_pdf_mapper_chain.py):
+
+```python
+@pytest.mark.parametrize("cert_dir_name,ws_sap", _COHORT_2_CERTS)
+def test_api_cohort_2_full_chain_sap_matches_worksheet_at_1e_minus_4(
+    cert_dir_name: str, ws_sap: float
+) -> None:
+    """API path mirror of Summary path. Identical inputs (the same EPC
+    in two formats) must produce identical SAP. Worksheet is the source
+    of truth; both paths must hit it at 1e-4."""
+    # Arrange
+    api_json = _COHORT_2_API_DIR / f"{cert_dir_name}.json"
+    doc = json.loads(api_json.read_text())
+    # Act
+    epc = EpcPropertyDataMapper.from_api_response(doc)
+    r = calculate_sap_from_inputs(cert_to_inputs(epc, prices=SAP_10_2_SPEC_PRICES))
+    # Assert
+    assert abs(r.sap_score_continuous - ws_sap) <= 1e-4
+```
+
+The `_COHORT_2_CERTS` list is derived once from the directory layout +
+worksheet SAP value (use the diagnostic probe at the end of this doc
+to bootstrap the list of (cert, ws_sap) pairs).
+
+**Expected outcome:** most certs will pass immediately at 1e-4 because
+the cascade is identical regardless of which mapper produced the EPC
+(the cascade can't tell). Any failures will be cohort-2-specific
+API-mapper coverage gaps — analogous to the cohort-1 work in S0380.30
+where API path needed glazing-code Table 6b extension.
+
+**Slice C+ — Close each API-path residual (one slice per cert)**
+
+If Slice B leaves residuals, each remaining cert gets a focused slice
+to find the API-mapper gap. The pattern is now well-trodden — probe
+EpcPropertyData_A vs EpcPropertyData_B for load-bearing-field
+divergence, identify the API-mapper field that disagrees with the
+Elmhurst mapper, fix the API mapper, re-pin.
+
+### Golden test residuals → ~0 (separate thread)
+
+Currently [`_EXPECTATIONS`](../rdsap/tests/test_golden_fixtures.py)
+pins residuals like:
+
+| Cert | Pinned SAP Δ | Pinned PE Δ | Pinned CO2 Δ | Notes from fixture |
+|---|---:|---:|---:|---|
+| 0240 | -14 | +12.49 | +0.70 | RR `room_in_roof_type_1` extraction gap |
+| 0300 | 0 | +8.28 | -0.25 | (gas combi, several mapper gaps) |
+| 0390 | -7 | -26.01 | -2.52 | |
+| 6035 | -6 | +46.76 | +1.07 | |
+| 7536 | +1 | -7.08 | -0.19 | |
+| 8135 | 0 | -0.07 | +0.02 | (already near-zero) |
+| 2130 | +1 | -38.63 | +0.30 | |
+| 0390 (B) | 0 | +0.15 | +0.04 | (already near-zero) |
+| 0380 | 0 | -14.60 | +0.28 | ASHP cohort |
+| 0350 | 0 | -7.78 | +0.17 | ASHP cohort |
+| 2225 | 0 | -11.77 | +0.26 | ASHP cohort |
+| 2636 | 0 | -9.65 | +0.22 | ASHP cohort (re-pinned this session) |
+| 3800 | 0 | -9.61 | +0.26 | ASHP cohort |
+| 9285 | 0 | -7.96 | +0.16 | ASHP cohort |
+| 9418 | 0 | -7.30 | +0.16 | ASHP cohort |
+
+These are **calc − lodged-EPC-values** residuals — what the cascade
+produces vs what the EPC was lodged with on the gov.uk register.
+SAP-int residuals on the ASHP cohort all sit at 0 (the chain-test
+work closed those), but PE and CO2 residuals show the cascade is
+under-counting Primary Energy by ~7-15 kWh/m² and over-counting CO2
+by ~0.2-0.3 t/yr across the ASHP cohort.
+
+**Two distinct PE/CO2 gap clusters to investigate:**
+
+1. **ASHP cohort PE clusters at -7..-15 kWh/m².** The certs all share
+   the same PCDB heat pump (Mitsubishi PUZ-WM50VHA), the same CO2
+   over-count (~+0.22 t/yr), and the same magnitude PE under-count.
+   This smells like a single cascade gap in either the SAP 10.2
+   Appendix L1 primary-energy lookup for electricity (likely a missing
+   distribution-loss factor or wrong tariff routing) or in the §12
+   Table 12d monthly electricity factor cascade for heat pumps.
+
+2. **Pre-existing cohort PE residuals from -38.63 to +46.76 kWh/m²**
+   (certs 0240, 0300, 0390, 6035, 2130). These are old fixtures with
+   documented mapper gaps in the `notes:` field (e.g. cert 0240's RR
+   extraction). Closing them will lower the SAP-int residuals too, not
+   just PE/CO2.
+
+The chain-test cohort-2 work this session focused on `sap_score_continuous`
+which is the cascade's continuous SAP. The golden fixtures pin
+**API-published lodged values** which include PE and CO2 figures the chain
+tests don't currently exercise. Closing the golden residuals means
+adding cascade-vs-API-lodged-PE/CO2 assertions to the cohort-2 sweep
+and chasing whichever subsystem produces the gap.
+
+The user's target: **PE Δ and CO2 Δ both at < 0.01** for any cert
+where the SAP-int Δ is already 0. The 0.01 absolute tolerance is
+already enforced by `_PE_ABS_TOLERANCE_KWH_PER_M2` / `_CO2_ABS_TOLERANCE_TONNES`
+on the residual stability — what changes is the **expected residual
+itself** (pinning at the actual delta vs zero).
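
The distinction between a stable pin and a zero pin can be sketched as follows. This is a hypothetical illustration: `residual_matches_pin` and `ABS_TOLERANCE` are made-up names, and the lodged and calculated PE figures are chosen only so their difference reproduces cert 2225's pinned PE residual of -11.77. The real test's structure is not shown in this document.

```python
ABS_TOLERANCE = 0.01  # the 0.01 absolute tolerance described above


def residual_matches_pin(calc: float, lodged: float, pinned: float,
                         tol: float = ABS_TOLERANCE) -> bool:
    """True when (calc - lodged) sits within tol of the pinned residual."""
    return abs((calc - lodged) - pinned) <= tol


# Illustrative values only: a lodged PE of 100.0 and a cascade PE of 88.23
# give a residual of -11.77, matching cert 2225's pinned PE delta.
LODGED_PE, CALC_PE = 100.0, 88.23

# Today: the pin equals the actual delta, so the stability check passes.
passes_today = residual_matches_pin(CALC_PE, LODGED_PE, pinned=-11.77)

# Target: the pin becomes zero, so the same cascade output now fails until
# the underlying gap is closed.
passes_at_target = residual_matches_pin(CALC_PE, LODGED_PE, pinned=0.0)
print(passes_today, passes_at_target)
```

Under this reading the tolerance constant stays put and only the pinned value moves toward zero, so each re-pin records how much of the gap a slice actually closed.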
+
+## Diagnostic probes
+
+### Cohort-2 Summary path sweep (snapshot — should be 38/38 exact)
+
+```bash
+PYTHONPATH=/workspaces/model python <<'PY'
+import re, subprocess
+from collections import defaultdict
+from pathlib import Path
+from backend.documents_parser.tests.test_summary_pdf_mapper_chain import _summary_pdf_to_textract_style_pages
+from backend.documents_parser.elmhurst_extractor import ElmhurstSiteNotesExtractor
+from datatypes.epc.domain.mapper import EpcPropertyDataMapper, UnmappedElmhurstLabel
+from domain.sap10_calculator.rdsap.cert_to_inputs import (
+    cert_to_inputs, SAP_10_2_SPEC_PRICES, UnresolvedPcdbCombiLoss,
+)
+from domain.sap10_calculator.calculator import calculate_sap_from_inputs
+
+src_root = Path('/workspaces/model/sap worksheets/additional with api 2')
+buckets = defaultdict(list)
+def bucket(d):
+    a = abs(d)
+    if a < 1e-4: return "exact"
+    if a < 0.07: return "<=0.07"
+    return "WORSE"
+for cd in sorted(src_root.iterdir()):
+    if not cd.is_dir(): continue
+    sp = next(cd.glob("Summary_*.pdf"), None)
+    ws_pdf = next(cd.glob("dr87-*.pdf"), None)
+    if not (sp and ws_pdf): continue
+    out = subprocess.run(["pdftotext", str(ws_pdf), "-"], capture_output=True, text=True).stdout
+    m = re.search(r"SAP value\s*\n?\s*([\d.]+)", out)
+    if not m:
+        # No worksheet SAP to compare against: report it loudly instead of
+        # failing later on `float - None`.
+        buckets["RAISES"].append((cd.name, "no SAP value found in worksheet PDF"))
+        continue
+    ws_sap = float(m.group(1))
+    try:
+        sn = ElmhurstSiteNotesExtractor(_summary_pdf_to_textract_style_pages(sp)).extract()
+        epc = EpcPropertyDataMapper.from_elmhurst_site_notes(sn)
+        r = calculate_sap_from_inputs(cert_to_inputs(epc, prices=SAP_10_2_SPEC_PRICES))
+        d = r.sap_score_continuous - ws_sap
+        buckets[bucket(d)].append((cd.name, d, ws_sap))
+    except (UnresolvedPcdbCombiLoss, UnmappedElmhurstLabel) as e:
+        buckets["RAISES"].append((cd.name, str(e)))
+for b in ("exact", "<=0.07", "WORSE", "RAISES"):
+    if b in buckets:
+        print(f"[{b}] {len(buckets[b])}")
+        if b != "exact":
+            for tup in buckets[b]:
+                print(f"  {tup}")
+PY
+```
+
+### Cohort-2 (cert_dir, ws_sap) list bootstrap
+
+```bash
+# Emit the parametrize list for the API-path test
+PYTHONPATH=/workspaces/model python <<'PY'
+import re, subprocess
+from pathlib import Path
+src = Path('/workspaces/model/sap worksheets/additional with api 2')
+for cd in sorted(src.iterdir()):
+    if not cd.is_dir(): continue
+    ws_pdf = next(cd.glob("dr87-*.pdf"), None)
+    if not ws_pdf: continue
+    out = subprocess.run(["pdftotext", str(ws_pdf), "-"], capture_output=True, text=True).stdout
+    m = re.search(r"SAP value\s*\n?\s*([\d.]+)", out)
+    if m:
+        print(f'    ("{cd.name}", {float(m.group(1))}),')
+PY
+```
+
+### API JSON fetch (Slice A skeleton)
+
+```python
+# scripts/fetch_cohort2_api_jsons.py — throwaway, not part of test suite
+import json, os
+from pathlib import Path
+from dotenv import load_dotenv
+from backend.epc_client.epc_client_service import EpcClientService
+
+load_dotenv(Path(__file__).parents[1] / "backend" / ".env")
+client = EpcClientService(token=os.environ["OPEN_EPC_API_TOKEN"])
+src = Path("sap worksheets/additional with api 2")
+dst = Path("domain/sap10_calculator/rdsap/tests/fixtures/golden")
+for cd in sorted(src.iterdir()):
+    if not cd.is_dir(): continue
+    out_path = dst / f"{cd.name}.json"
+    if out_path.exists():
+        print(f"skip {cd.name} (exists)")
+        continue
+    print(f"fetch {cd.name}")
+    raw = client._fetch_certificate(cd.name)
+    out_path.write_text(json.dumps(raw, indent=2))
+```
+
+## Test baseline at HEAD
+
+```bash
+PYTHONPATH=/workspaces/model python -m pytest \
+    backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
+    backend/documents_parser/tests/test_elmhurst_extractor.py \
+    backend/documents_parser/tests/test_elmhurst_end_to_end.py \
+    domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
+    domain/sap10_calculator/worksheet/tests/test_water_heating.py \
+    domain/sap10_calculator/worksheet/tests/test_mean_internal_temperature.py \
+    domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py \
+    domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
+    domain/sap10_calculator/tests/test_pcdb_table_362_lookup.py \
+    domain/sap10_ml/tests/test_rdsap_uvalues.py \
+    datatypes/epc/schema/tests/test_schema_loading.py \
+    --no-cov -q
+```
+
+Expected: **712 pass + 0 fails** (down from 710 + 10 at session start
+and 712 + 10 at the precision-floor-closed handover). Every test in
+the suite passes.
+
+## Conventions preserved (carry forward)
+
+- **1e-4 across the board** ([[feedback-one-e-minus-4-across-the-board]])
+- **Worksheet, not API, is the target** for chain tests
+  ([[feedback-worksheet-not-api-reference]]) — except for the golden
+  fixtures, which intentionally pin against API-lodged values to
+  surface mapper gaps as residual drift.
+- **Cross-mapper parity via cascade equivalence**: API EPC and
+  Elmhurst EPC must produce SAP within 1e-4 of each other AND of the
+  worksheet ([[feedback-cross-mapper-parity-via-cascade]]).
+- **Spec-floor skepticism**: claims of "precision floor" usually mask
+  a spec-citation bug ([[feedback-spec-floor-skepticism]]). The three
+  Decimal HALF_UP bugs this session are case in point.
+- **Bigger slices OK for uniform-cohort work** — the user explicitly
+  authorised this for the API-path closure
+  ([[feedback-bigger-slices-for-uniform-work]]).
+- **Golden residuals → ~0**: pinned PE/CO2 residuals at zero (or
+  documented why not) are the new bar ([[feedback-golden-residuals-near-zero]]).
+- **AAA test convention** with literal `# Arrange / # Act / # Assert`
+  headers ([[feedback-aaa-test-convention]]).
+- **`abs(diff) <= tol`** not `pytest.approx`
+  ([[feedback-abs-diff-over-pytest-approx]]).
+- **Spec citation in commit messages**
+  ([[feedback-spec-citation-in-commits]]).
+- **One slice = one commit; stage by name**
+  ([[feedback-commit-per-slice]]).
+- **Strict-enum raises on unmapped labels / unresolved cascade dispatch**.
+- **Pyright net-zero per touched file**.
+
+## Pyright baselines at HEAD (post-S0380.38)
+
+- `datatypes/epc/domain/mapper.py`: 32
+- `datatypes/epc/surveys/elmhurst_site_notes.py`: 0
+- `backend/documents_parser/elmhurst_extractor.py`: 0
+- `backend/documents_parser/tests/test_summary_pdf_mapper_chain.py`: 0
+- `domain/sap10_calculator/rdsap/cert_to_inputs.py`: 34
+- `domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py`: 11
+- `domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py`: 1
+- `domain/sap10_calculator/tables/pcdb/parser.py`: 0
+- `domain/sap10_calculator/tests/test_pcdb_table_362_lookup.py`: 0
+- `domain/sap10_calculator/worksheet/heat_transmission.py`: 13
+- `domain/sap10_calculator/worksheet/internal_gains.py`: 0
+- `domain/sap10_calculator/worksheet/solar_gains.py`: 0
+- `domain/sap10_calculator/worksheet/tests/test_heat_transmission.py`: 71
+- `domain/sap10_calculator/worksheet/tests/test_solar_gains.py`: 22
+- `domain/sap10_calculator/worksheet/tests/test_water_heating.py`: 94
+- `domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py`: 2
+- `domain/sap10_ml/rdsap_uvalues.py`: 0
+- `domain/sap10_ml/tests/test_rdsap_uvalues.py`: 66
+
+## Memory references (auto-loaded by the agent's harness)
+
+Cross-session memories load automatically. Key ones for the API-path
+work:
+
+- [[feedback-one-e-minus-4-across-the-board]] — user target is 1e-4 for HPs too.
+- [[feedback-worksheet-not-api-reference]] — chain tests pin to worksheet.
+- [[feedback-cross-mapper-parity-via-cascade]] — *new this session*: API EPC and Elmhurst EPC must produce SAP within 1e-4 of each other and of the worksheet.
+- [[feedback-bigger-slices-for-uniform-work]] — *new this session*: the user explicitly authorised batching for uniform work.
+- [[feedback-golden-residuals-near-zero]] — *new this session*: pinned PE/CO2 residuals should be at zero (or documented why not).
+- [[feedback-cascade-pin-methodology]] — test the actual cascade against PDF line refs.
+- [[reference-sap10-spec-docs]] — full BRE technical paper set at `domain/sap10_calculator/docs/specs/`.
+- [[feedback-commit-per-slice]] / [[feedback-aaa-test-convention]] /
+  [[feedback-abs-diff-over-pytest-approx]] / [[feedback-spec-citation-in-commits]] —
+  slicing + test conventions.
+- [[project-summary-path-cohort-closure]] — cohort-1 ASHP closure context.
+- [[project-cohort-2-summary-path-closure]] — cohort-2 Summary-path
+  closure context (now superseded — cohort-2 is 38/38 at HEAD).
+- [[project-api-to-sap-residual-test]] — `test_golden_cert_residual_matches_pin`
+  is the forcing function; residuals re-pinned in Slice S0380.31 for cert 2636.
+
+## First concrete actions for next agent
+
+1. **Re-run the diagnostic probe** to confirm baseline reproduces
+   (38/38 cohort-2 Summary path; 9/9 cohort-1 ASHP; 712 pass + 0
+   fails on the test suite).
+
+2. **Slice A — Bulk-fetch cohort-2 API JSONs.** Write
+   `scripts/fetch_cohort2_api_jsons.py` (skeleton above), run it once
+   to land 38 JSON fixtures, commit them as a single slice. The
+   script can stay in `scripts/` or be deleted post-run; do NOT add
+   it to the test suite.
+
+3. **Slice B — Parametrized API-path chain test.** Add ONE
+   parametrized test that mirrors the Summary-path sweep. The
+   parametrize list bootstraps from the diagnostic probe above (38
+   `(cert_dir, ws_sap)` pairs). Expect most certs to pass at 1e-4
+   immediately; iterate on any remaining residuals one slice at a
+   time per the existing pattern.
+
+4. **Thread the golden-residuals-near-zero target through subsequent
+   slices.** For any cohort-2 cert whose chain-test SAP closes at
+   1e-4 but whose API-lodged PE / CO2 doesn't match the cascade at
+   ~1e-2, that's the next residual to chase. The ASHP cohort PE
+   cluster at -7..-15 kWh/m² is the largest single thread — same root
+   cause likely affects every Mitsubishi PUZ-WM50VHA cert.
+
+5. **Tighten `_ASHP_COHORT_CHAIN_TOLERANCE` again** once API-path
+   parity is established. Current 1e-4 gives ~2x headroom on the
+   cohort-1 worst residual (cert 2225 4.8e-5). If the cohort-2 API
+   sweep produces similar headroom, the constant can drop to ~1e-5.
+
+Good luck. The cohort distributions are in the strongest shape they've
+ever been (Summary path 47/47 < 1e-4, API path 7/9 < 1e-4 with the rest
+pending Slice A/B fetches), the test suite is 100% green, and the
+remaining work is **uniform across certs** — cohort-2 API-path closure
++ golden-residuals-near-zero — so the user's "bigger slices" mandate
+fits the work naturally. The §15 Decimal HALF_UP pattern is the most
+likely candidate for any remaining +0.0007-scale residual.