From 743f77d54c3f4677d660249bca0149cabb2d7436 Mon Sep 17 00:00:00 2001
From: Khalim Conn-Kowlessar <kconnkowlessar@gmail.com>
Date: Mon, 18 May 2026 19:03:40 +0000
Subject: [PATCH] docs: handover for fresh-context SAP calculator review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Per user suggestion: the iteration history in this chat has likely
accreted blind spots that a long context window can't shed (e.g. I
spent slices comparing our delivered kWh to the cert's primary kWh
without noticing the apples-to-oranges error). A fresh agent reading
the SAP 10.2 + RdSAP 10 PDFs cold against the current calculator may
spot gaps faster.

HANDOVER_FRESH_REVIEW.md gives the fresh agent:
- Current state (MAE 5.34, primary-energy bias +51 kWh/m²)
- Repo layout pointer
- Priority-ordered dig list (PEUI mystery first)
- Validated truths
- Dead-end list (don't repeat S-B5 NI thickness switch etc.)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
---
 docs/sap-spec/HANDOVER_FRESH_REVIEW.md | 136 +++++++++++++++++++++++++
 1 file changed, 136 insertions(+)
 create mode 100644 docs/sap-spec/HANDOVER_FRESH_REVIEW.md

diff --git a/docs/sap-spec/HANDOVER_FRESH_REVIEW.md b/docs/sap-spec/HANDOVER_FRESH_REVIEW.md
new file mode 100644
index 00000000..6f9ca0b5
--- /dev/null
+++ b/docs/sap-spec/HANDOVER_FRESH_REVIEW.md
@@ -0,0 +1,136 @@
+# Handover: fresh-context review of the SAP 10.2 calculator
+
+Audience: a fresh agent in a new context window. Read this first, then the SAP 10.2 + RdSAP 10 spec PDFs, then the calculator code. Your job is to find spec-vs-implementation gaps that the previous (long-context) agent has missed or got wrong.
+
+## TL;DR — where we are
+
+- Deterministic SAP 10.2 calculator at `packages/domain/src/domain/sap/`.
+- 22 slices shipped under ADR-0009.
+- 300-cert parity probe: **SAP MAE 5.34, bias +0.29** (we're slightly over-predicting SAP score on average).
+- **Primary-energy bias +51.6 kWh/m²** ← biggest surprise; we over-predict primary energy by ~50%. This was discovered just before this handover, so previous slices did not account for it.
+- 17/300 (5.7%) certs match the cert's `energy_rating_current` exactly.
+
+Goal per ADR-0009: typical-subset SAP MAE ≤ 1.0.
+
+## Critical context
+
+1. **Two truth-sources collide.** `tables/table_12.py` carries the spec-correct SAP 10.2/10.3 prices (mains gas 3.64p, std elec 16.49p). `tables/table_12_cert_calibration.py` carries the empirical lower prices that match the cert assessor's actual output (3.48p, 13.19p). The parity probe uses the cert-calibration table; the engine's default is spec.
+2. **The cert assessor diverges from the published SAP 10.2 spec in several places** we've found:
+   - Unit prices: the cert's prices sit below published Table 12 (from the figures above, mains gas ~4% lower and std elec ~20% lower)
+   - Tariff routing: cert applies off-peak to electric room heaters (code 691) when meter_type=1 (Dual), even when Table 12a says these should bill at the high rate
+   - Unknown meter (RdSAP energy_tariff=3): cert defaults to Single (per Elmhurst test); our code matches this
+3. **PEUI bias was discovered right at handover time.** Our `primary_energy_kwh_per_m2` runs +51 kWh/m² over the cert's `energy_consumption_current`. This is the biggest clue and the most efficient next dig.
+
+## Repo layout
+
+```
+packages/domain/src/domain/sap/
+├── calculator.py                    # Sap10Calculator + calculate_sap_from_inputs
+├── tables/
+│   ├── table_12.py                  # SAP 10.2 spec prices, CO2, PEF
+│   └── table_12_cert_calibration.py # empirical cert prices
+├── worksheet/
+│   ├── dimensions.py                # §1
+│   ├── ventilation.py               # §2 (incl wind shelter S-B21)
+│   ├── heat_transmission.py         # §3 (incl DwellingExposure)
+│   ├── internal_gains.py            # §5 + Appendix L
+│   ├── solar_gains.py               # §6 + Appendix U §U3.2
+│   ├── utilisation_factor.py        # Table 9a
+│   ├── mean_internal_temperature.py # §7 + Table 9/9b/9c
+│   ├── space_heating.py             # §9
+│   └── rating.py                    # §13 (SAP rating equations)
+├── climate/
+│   └── appendix_u.py                # Tables U1/U2/U3 + solar declination
+├── rdsap/
+│   └── cert_to_inputs.py            # EpcPropertyData → CalculatorInputs mapping
+├── validation/
+│   └── parity_report.py             # ParityReport aggregator
+└── tests/                           # 103 unit tests
+
+services/ml_training_data/src/ml_training_data/
+└── sap_parity_probe.py              # runs calculator on N random certs from corpus
+
+docs/sap-spec/
+├── sap-10-2-full-specification-2025-03-14.pdf  (199pp) — primary spec
+├── sap-10-3-full-specification-2026-01-13.pdf  (201pp) — newer spec (Table 12 identical)
+├── rdsap-10-specification-2025-06-10.pdf       (114pp) — RdSAP rules (separate from SAP)
+├── SPEC_COVERAGE.md                            — our coverage map
+└── PARITY_FINDINGS.md                          — earlier probe findings
+
+docs/adr/0009-deterministic-sap-calculator.md   — accepted ADR
+```
+
+## How to run the parity probe
+
+```bash
+python -c "
+import sys
+sys.path.insert(0, 'packages/domain/src')
+sys.path.insert(0, '.')
+sys.path.insert(0, 'services/ml_training_data/src')
+from ml_training_data.sap_parity_probe import main
+main(['300','7'])  # 300 certs, seed=7
+"
+```
+
+## Where to dig (priority-ordered, by likely MAE impact)
+
+### Tier 1 — the PEUI mystery (50% over)
+
+Our `primary_energy_kwh_per_m2` runs +51 kWh/m² over the cert's `energy_consumption_current`. Possibilities:
+
+- **Wrong primary energy factors in `PRIMARY_ENERGY_FACTOR` (`tables/table_12.py`)**. I populated this from approximate spec values; verify each one against SAP 10.2 Table 12 (page 189). Check electricity (PEF=1.501) especially: ~30% of the corpus uses electricity for some end-use.
+- **HW demand over-counted.** Look at `domain.ml.demand.predicted_hot_water_kwh`. Cylinder loss + primary circuit loss may be over-stated. SAP 10.2 §4 + Appendix J give the exact formulas. We use bucket-rounded `_STORAGE_LOSS_FACTOR` instead of interpolation.
+- **Space heating demand over-counted.** Could come from:
+  - Living-area-fraction defaults (Table 27): we use {1:0.75, 2:0.50, 3:0.30, 4:0.25, ≥5:0.21}; double-check against the RdSAP 10 PDF.
+  - Control-temperature adjustment (Table 4e): we always pass 0; spec applies ~-0.7°C in some configurations.
+  - Thermal mass parameter: we use 250 kJ/m²K always; spec varies by construction type.
+- **Lighting/pumps over-counted.** Currently using Appendix L existing-dwelling fallback (no fixed lighting). Newer dwellings should use lower lighting energy.
+
+### Tier 2 — wall U-value cascade
+
+Worst-residual certs have `wall_construction=4` (cavity), `wall_insulation_type=2`, `wall_insulation_thickness="NI"`. We treat these as uninsulated cavity (column 0), but the cert assessor may know the wall is insulated (the type=2 code says so). See `domain.ml.rdsap_uvalues._insulation_bucket`: when `thickness=0` AND `present=True`, the spec says to use the 50mm row, but our parser converts "NI"→0, which short-circuits to "uninsulated".
+
+I tried switching "NI"→None in the S-B5 cycle, but it over-corrected aggregate MAE. Worth retrying with the new understanding (compare the PRIMARY energy delta on the affected certs specifically).
+
+### Tier 3 — cost-side residuals
+
+Per S-B17 hand-trace: cert 2389-4472 has correct delivered energy but our SAP is 10 points lower than the cert's, so the cert's implied blended unit-cost rate is lower than ours. Likely cause: the cert assessor applies different rate logic in edge cases (oil + off-peak meter, electricity-and-gas mix, etc.). Worth tracing more carefully.
+
+### Tier 4 — known unimplemented spec pieces
+
+(per `SPEC_COVERAGE.md`)
+- Cooling §10 (rare)
+- FEE §11 (new-build only)
+- Per-junction thermal bridging Table R2 (ADR says defer)
+- Multi-main heating Table 11 with non-zero secondary (we have this conditionally)
+- Standing charges (Table 12 note (a))
+
+## What's been validated
+
+- §13 SAP rating equations: 108.8 − 120.5 log10(ECF) for ECF ≥ 3.5, else 100 − 16.21·ECF. Verified against SAP 10.2 PDF page 38.
+- §12.2 fuel price rule: "Other prices must not be used". We have spec-correct prices + cert-calibration prices as separate tables.
+- Appendix U: tables verbatim.
+- Appendix U rating-uses-UK-average rule: applied (S-B18).
+- Solar gains §6.1 + Appendix U §U3.2 polynomial: implemented.
+
+## Suggested first session
+
+1. **Read SAP 10.2 §4 + Appendix J carefully** (hot water demand). Map every formula against our `domain.ml.demand.predicted_hot_water_kwh`. Note divergences. The PEUI bias is probably driven largely by HW + heating demand.
+2. **Read SAP 10.2 §14** (CO2 and primary energy). Compare to our `calculate_sap_from_inputs` primary_energy aggregation. Note especially: does the cert's `energy_consumption_current` use the same end-use list (space + HW + lighting + pumps/fans) or a different one?
+3. **Read RdSAP 10 §11 (Heating)**. Check our `domain.ml.sap_efficiencies.seasonal_efficiency` cascade against the RdSAP rules. Especially heat pump efficiency (we use 2.30 for category 4 fallback).
+4. Open issues in the parity-decomp data:
+   - 26 certs with correct energy but SAP MAE 4.12 → cost-side
+   - 51 kWh/m² primary-energy bias → demand-side
+
+## Don't repeat these dead-ends
+
+- ❌ Switching "NI" wall thickness to None — over-corrected in aggregate (S-B5)
+- ❌ Aggressive efficiency rescue for missing sap_main_heating_code — over-corrected (S-B5)
+- ❌ Using SAP 10.2 spec prices for parity validation — the cert assessor uses legacy lower prices despite reporting sap_version=10.2 (S-B9, S-B10)
+- ❌ Applying off-peak to electric main heating regardless of meter_type — the meter_type field is the truth (S-B15)
+- ❌ Always applying 10% secondary heating — should be conditional on cert lodging or main system being electric storage (S-B20)
+
+## Commit history
+
+The last 22 commits are S-B1..S-B22. Each commit message documents the slice's hypothesis, change, and measured impact. Worth reading 5-10 of the latest commit messages for context on what's been tried.