scripts?

2026-06-30 13:10:47 +00:00 · 2026-06-11 07:07:27 +00:00 · 2026-06-11 07:07:27 +00:00 · 362cd20f11
commit 362cd20f11
parent 8ad560dc48
3 changed files with 203 additions and 9 deletions
--- a/docs/grill-sessions/2026-06-10-pre-sap10-mapper-generalization.md
+++ b/docs/grill-sessions/2026-06-10-pre-sap10-mapper-generalization.md
@@ -0,0 +1,181 @@
+# Grill spec — generalise Reduced-Field Synthesis to the rest of the pre-SAP10 RdSAP family
+
+**Date:** 2026-06-10  ·  **Branch:** `feature/junte+khalim`  ·  **Status:** SPEC — READY TO GRILL.
+
+Grill this by running `/grill-me` and feeding it this file. Start at **Q1 (ROOT)**.
+
+---
+
+## Why this exists
+
+The RdSAP **20.0.0** mapper now works end-to-end: all 1000 corpus certs parse, map via
+**Reduced-Field Synthesis** (ADR-0027), and score through `Sap10Calculator` without crashing.
+`scripts/eon/find_epc_data.py` shows lodged-vs-our-calculated SAP side by side and the deltas are
+sane (mostly ±7, same band). The pattern is proven.
+
+The goal now: **apply the same playbook to the other pre-SAP10 RdSAP specs** so historical EPC data
+across more lodgement years can be Rebaselined. This is pure leverage — the hard design thinking
+(synthesis coefficients, Validation-Cohort rule, schema-fix mechanism) is already done; what remains
+is per-spec drift and a decision about how much to share vs copy.
+
+## What we already hold (the repeatable 20.0.0 playbook)
+
+Each step below is *proven* for 20.0.0. The grill is about which steps change per spec.
+
+1. **Harvest a corpus** — `scripts/eon/harvest_certs.py` streams a local bulk dump
+   (`downloads/certificates-YYYY.json`) for the year that spec dominates, caps at 1000, writes
+   `backend/epc_api/json_samples/<schema>/corpus.jsonl`. No API token needed.
+2. **Fix the placeholder schema** — every `rdsap_schema_*.py` was generated from ONE example so it
+   over-constrains. Make it `@dataclass(kw_only=True)` + data-driven required→optional (any field
+   present in <100% of the corpus gets a default; `[]` for lists, `None` otherwise) → all certs parse.
+3. **Synthesise the measured fields** the reduced schema only records categorically:
+   windows (`glazed_area` band × floor area, 4-way N/E/S/W split), lighting LEL, hot-water bath/mixer
+   counts, ventilation/chimneys/sheltered-sides, glazing cascade.
+4. **Leave calculator defaults to the calculator** — `cert_to_inputs` is the RdSAP Table-5 expansion
+   engine; the mapper supplies raw reduced data only.
+5. **Wire dispatch + flip a strict guard** — add the `schema_type` branch to
+   `from_api_response`, promote the corpus into the strict parse+map bucket in
+   `infrastructure/epc_client/tests/test_mapper_corpus.py`.
+6. **Record every synthesis assumption in code comments + test names** (Validation-Cohort rule: no
+   same-spec ground truth).
+
+## Ground truth about the targets (verified 2026-06-10)
+
+| Spec | Schema module | Mapper method | Dispatched? | Corpus? | Notes |
+|------|---------------|---------------|-------------|---------|-------|
+| 21.0.1 | ✅ | `from_rdsap_schema_21_0_1` | ✅ | ✅ 1000 | reference (rich, measured) |
+| 21.0.0 | ✅ | `from_rdsap_schema_21_0_0` | ✅ | ❌ | dispatched but unguarded |
+| **20.0.0** | ✅ | `from_rdsap_schema_20_0_0` | ✅ | ✅ 1000 | **DONE — the template** |
+| **19.0** | ✅ | `from_rdsap_schema_19_0` | ❌ | ❌ | orphaned; `sap_windows=[]` hardcoded |
+| **18.0** | ✅ | `from_rdsap_schema_18_0` | ❌ | ❌ | orphaned |
+| **17.1** | ✅ | `from_rdsap_schema_17_1` | ❌ | ❌ | orphaned |
+| **17.0** | ✅ | `from_rdsap_schema_17_0` | ❌ | ❌ | orphaned |
+
+- 19.0 confirmed same reduced-field shape as 20.0.0: `glazed_area: int` band + `multiple_glazing_type:
+  int`, and the mapper currently hardcodes `sap_windows=[]` — i.e. the exact windowless-corruption bug
+  that 20.0.0's synthesis fixed. 18.0/17.1/17.0 are almost certainly the same family.
+- The 17–19 mapper methods **exist** but are unreachable: `from_api_response` only branches 21.0.1 /
+  21.0.0 / 20.0.0; everything else hits `raise ValueError(f"Unsupported EPC schema")`.
+- **Corpora are harvestable.** `downloads/README.txt` schema-by-year:
+  `2020 → RdSAP-Schema-19.0 (1632)`, `2021–2024 → 20.0.0`, `2025–2026 → 21.0.1`. Older RdSAP (17.x/18.0)
+  live in the 2012–2019 dumps (all present locally). `SAP-Schema-1x` (full/design SAP) and `CEPC-*`
+  (commercial) are different families with no RdSAP mapper.
+
+---
+
+## Decision tree to grill (each has a recommended answer)
+
+### Q1 (ROOT) — Target set and order. What are we generalising to, and in what order?
+**Recommend:** the **pre-SAP10 RdSAP family only**, one spec at a time, **19.0 first** (dominant in the
+2020 dump, closest sibling to 20.0.0, mapper already stubbed), then 18.0 → 17.1 → 17.0 as their dumps
+confirm volume. **Exclude** `SAP-Schema-1x` (full/design SAP — new-build, not reduced; a separate
+mapper family and ADR) and `CEPC-*` (non-domestic). **Carve out** 21.0.0 as a quick win: it's already
+dispatched, it just needs a harvested corpus to join the strict guard.
+*Sub-question:* do we batch all four 17–19 in one branch sweep, or land 19.0 fully (corpus → schema →
+synthesis → dispatch → guard) before starting 18.0? Recommend: **land 19.0 end-to-end first** — it
+either confirms the playbook transfers cleanly (then 18.0/17.x are fast) or surfaces drift early.
+
+### Q2 — Coefficient reuse vs re-fit (the load-bearing, ADR-worthy one).
+20.0.0's glazing synthesis uses `0.148 × TFA × band_multiplier`, fit from the **21.0.1** corpus's
+glazing/floor ratio. For 19.0/18.0/17.x: reuse the same coefficients, or re-fit per spec?
+**DIRECTION (user, 2026-06-10): re-work the coefficients from each new corpus's own data — do not
+inherit the 21.0.1 fit by default.** Treat `0.148` + the band multipliers as a *starting hypothesis* to
+confirm or replace against what the new corpus actually shows, per spec. The empirical numbers lead; we
+only keep the 20.0.0 values if the new corpus reproduces them.
+
+**The constraint this hits (must resolve while grilling):** a reduced schema does **not** measure
+per-window area (that's the whole reason synthesis exists), so a 19.0/18.0/17.x corpus *cannot
+self-fit the glazing/floor ratio* — there's no measured glazing column in it to regress on. So
+"work it out from the new corpus" splits into two parts:
+- **What the reduced corpus *can* give us directly** → re-derive per spec: the `glazed_area` band
+  *distribution* (how many Normal/More/Less), `total_floor_area` distribution, and whether the band
+  codes/semantics match 20.0.0. This validates (or breaks) the band-multiplier assumption empirically.
+- **The base ratio itself (the `0.148`)** → needs a *measured* reference from the same stock/era.
+  Options to grill: (a) use the contemporaneous measured corpus if one exists for that year (e.g. a
+  rich-window spec lodged alongside), (b) fit from the handful of rich certs the reduced corpus *does*
+  carry (20.0.0 had 7/1000 with lodged `sap_windows` — check the count per spec), or (c) fall back to
+  the 21.0.1 fit *only* if (a)/(b) yield too little signal, and say so explicitly.
+
+This moves every rebaselined score for the spec, so the per-spec fit + its evidence wants an ADR
+(extends ADR-0027). Record the derivation (corpus, sample size, quartiles) the same way 20.0.0 did.
+
+### Q3 — Code-space drift across versions.
+Do 17–19 use the same integer code spaces as 20.0.0 (glazing_type, built_form, orientation, fuel,
+heat-emitter, party-wall, roof/floor construction)? 20.0.0's codes turned out **identical** to 21.0.1's,
+so we routed through the existing cascades verbatim. **Recommend:** assume identical within the RdSAP
+family; cross-check each version against `epc_codes.csv` during grilling and add a per-spec cascade
+override *only* where the corpus proves a code diverged. Don't pre-build translation layers.
+
+### Q4 — Schema-fix mechanism. Same `kw_only` + data-driven required→optional?
+**Recommend: yes, unchanged.** Each placeholder schema over-constrains identically (single-example
+generation). Run the one-pass corpus scan to enumerate all missing-required fields at once (not
+whack-a-mole), then default them. Mechanical, low-risk, proven.
+
+### Q5 — Shared synthesis helper vs per-mapper copy (the architecture fork).
+20.0.0's synthesis lives in `_synthesise_20_0_0_sap_windows` + inline mapper blocks. With 19.0 we'll
+have a **second instance** — the classic extract trigger. **Recommend:** once 19.0 is green, extract a
+single spec-parameterised `_synthesise_reduced_field_windows(glazed_area, tfa, glazing_type)` (and
+shared lighting/hot-water/ventilation helpers) so 18.0/17.x are near-free and the coefficients live in
+exactly one place. Defer the extraction until 19.0 confirms the shape (avoid abstracting from one
+example). This is the `/improve-codebase-architecture` hook — a deep module behind a small interface.
+
+### Q6 — Per-spec field availability.
+Do 17–19 actually lodge the synthesis *inputs* 20.0.0 relies on — `instantaneous_wwhrs` (bath room
+counts), `low_energy_fixed_lighting_outlets_count`, `percent_draughtproofed`, `open_fireplaces_count`,
+`multiple_glazing_type`? Older specs may omit or rename some. **Recommend:** profile each corpus up
+front (one-pass field-presence scan); where a 20.0.0 input is absent, degrade gracefully to the
+calculator's own default rather than fabricating — and record the gap in a test name.
+
+### Q7 — Dispatch wiring + acceptance bar.
+**Recommend:** per spec, add the `schema_type` branch to `from_api_response` (wrapped in
+`_clear_basement_flag_when_system_built` like the others) and promote its corpus into the strict
+parse+map bucket in `test_mapper_corpus.py`. Smoke-check with `scripts/eon/find_epc_data.py` (extend
+the UPRN list with that spec's certs) — our re-score should track the lodged figure within a sane band.
+The formal SAP-score *value* test stays deferred (same as 20.0.0) until we choose to land it.
+
+### Q8 — Validation-Cohort / is there ANY cross-check?
+Same rule as 20.0.0: a pre-SAP10 cert has no same-spec lodged figure to validate against, so synthesis
+assumptions go in code/test names. **But probe one opportunistic anchor:** a single UPRN re-lodged
+across spec versions (e.g. a dwelling with both a 19.0 and a 20.0.0 cert, unchanged between) — our
+re-score of both should roughly agree. **Recommend:** if dual-lodged UPRNs surface during harvest, keep
+a handful as a cross-spec regression anchor; don't block on it.
+
+---
+
+## How to reproduce / kick off (19.0 first)
+
+```bash
+# 1. Confirm 19.0 volume + reduced-field shape in the 2020 dump
+python - <<'EOF'
+import json
+from collections import Counter
+# stream the first N lines of certificates-2020.json, bucket by schema_type,
+# and dump one RdSAP-Schema-19.0 document to inspect glazed_area / sap_windows
+EOF
+
+# 2. Add a harvest source row and run it
+#    scripts/eon/harvest_certs.py SOURCES += ("certificates-2020.json","RdSAP-Schema-19.0",1000)
+
+# 3. Drive the (orphaned) 19.0 mapper against the new corpus to bucket parse failures
+python - <<'EOF'
+import json, collections
+from pathlib import Path
+from datatypes.epc.schema.rdsap_schema_19_0 import RdSapSchema19_0
+from datatypes.epc.schema.helpers import from_dict
+certs=[json.loads(l) for l in Path("backend/epc_api/json_samples/RdSAP-Schema-19.0/corpus.jsonl").read_text().splitlines() if l.strip()]
+b=collections.Counter()
+for c in certs:
+    try: from_dict(RdSapSchema19_0, c)
+    except Exception as e: b[f"{type(e).__name__}: {str(e)[:70]}"]+=1
+for k,v in b.most_common(): print(v,k)
+EOF
+```
+
+## References
+
+- **ADR-0027** (`docs/adr/0027-rdsap-20-0-0-reduced-field-synthesis.md`) — the synthesis decision,
+  coefficients, rejected alternatives. Extend (not replace) for the family-wide coefficient choice (Q2).
+- **ADR-0015** (mappers own cert normalization), **ADR-0004** (lodged-vs-effective pair).
+- **CONTEXT.md** — _Reduced-Field Synthesis_, _Rebaselining_, _Lodged / Effective Performance_,
+  _Validation Cohort_, _pre-SAP10_.
+- **20.0.0 resume doc** — `docs/grill-sessions/2026-06-09-rdsap-20-0-0-remapper.md` (the worked example).
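
Review note on step 2 / Q4 / Q6 of the spec above (not part of the commit): the "one-pass corpus scan" it describes could look roughly like the sketch below. It applies the rule as the spec states it (a field present in fewer than 100% of certs gets a default, `[]` for lists and `None` otherwise). The function names `field_presence` and `suggested_defaults` are hypothetical, the sample certs are made up for illustration, and only top-level keys are inspected; nested records would need the same pass applied recursively.

```python
from collections import Counter
from typing import Any, Iterable


def field_presence(certs: Iterable[dict[str, Any]]) -> tuple[int, Counter, set[str]]:
    """Count how many certs carry each top-level key, and note list-valued keys."""
    total = 0
    present: Counter = Counter()
    list_fields: set[str] = set()
    for cert in certs:
        total += 1
        for key, value in cert.items():
            present[key] += 1
            if isinstance(value, list):
                list_fields.add(key)
    return total, present, list_fields


def suggested_defaults(certs: Iterable[dict[str, Any]]) -> dict[str, Any]:
    """Map every field seen in fewer than 100% of certs to its proposed default."""
    total, present, list_fields = field_presence(certs)
    return {
        key: ([] if key in list_fields else None)
        for key, count in sorted(present.items())
        if count < total
    }


# Illustrative in-memory corpus (hypothetical values, not real certificates).
sample = [
    {"total_floor_area": 80, "glazed_area": 1, "sap_windows": []},
    {"total_floor_area": 65, "glazed_area": 2},
    {"total_floor_area": 92},
]
print(suggested_defaults(sample))
```

To run this against a harvested corpus, the same functions can be fed `json.loads(line)` for each non-empty line of `corpus.jsonl`. When the result is carried into a dataclass, a list default would normally be written as `field(default_factory=list)` rather than a literal `[]`.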
--- a/next_claude_prompt.txt
+++ b/next_claude_prompt.txt
@@ -0,0 +1 @@
+/grill-me docs/grill-sessions/2026-06-10-pre-sap10-mapper-generalization.md
--- a/scripts/run_modelling_e2e.py
+++ b/scripts/run_modelling_e2e.py
@@ -124,16 +124,16 @@ def _s3_parquet_reader(bucket: str) -> ParquetReader:
    return read


-def _spatial_for(
-    repo: GeospatialS3Repository, uprn: int
-) -> Optional[SpatialReference]:
+def _spatial_for(repo: GeospatialS3Repository, uprn: int) -> Optional[SpatialReference]:
    """The UPRN's spatial reference (coordinates + planning protections), or
    None when S3 doesn't cover it — a missing reference must not abort the run,
    so a lookup error degrades to None (unrestricted, no solar)."""
    try:
        return repo.spatial_for(uprn)
    except Exception as error:  # noqa: BLE001 — S3/parquet hiccup is non-fatal
-        print(f"  spatial lookup failed for uprn {uprn}: {type(error).__name__}: {error}")
+        print(
+            f"  spatial lookup failed for uprn {uprn}: {type(error).__name__}: {error}"
+        )
        return None


@@ -186,7 +186,9 @@ def _parse_measures(raw: Optional[str]) -> Optional[frozenset[MeasureType]]:
    (consider every modelled measure) when unset. Raises on an unknown type."""
    if raw is None:
        return None
-    return frozenset(MeasureType(token.strip()) for token in raw.split(",") if token.strip())
+    return frozenset(
+        MeasureType(token.strip()) for token in raw.split(",") if token.strip()
+    )


 def _context_summary(
@@ -252,8 +254,12 @@ def _persist(

 def main() -> None:
    parser = argparse.ArgumentParser(description=__doc__)
-    parser.add_argument("property_ids", type=int, nargs="+", help="Property ids to model")
-    parser.add_argument("--goal", default="C", help="target band when no --scenario-id (default C)")
+    parser.add_argument(
+        "property_ids", type=int, nargs="+", help="Property ids to model"
+    )
+    parser.add_argument(
+        "--goal", default="C", help="target band when no --scenario-id (default C)"
+    )
    parser.add_argument(
        "--scenario-id", type=int, default=None, help="model against this DB Scenario"
    )
@@ -263,12 +269,16 @@ def main() -> None:
        help="comma-separated measure types to consider (default: all)",
    )
    parser.add_argument(
-        "--portfolio-id", type=int, default=None, help="portfolio id (required for --persist)"
+        "--portfolio-id",
+        type=int,
+        default=None,
+        help="portfolio id (required for --persist)",
    )
    parser.add_argument(
        "--persist",
        action="store_true",
        help="WRITE the inputs + Plan to the DB (default: inspect only, no writes)",
+        default=False,
    )
    parser.add_argument(
        "--no-solar",
@@ -355,7 +365,9 @@ def main() -> None:
                    solar_insights=solar_insights,
                    plan=plan,
                )
-        except Exception as error:  # noqa: BLE001 — one bad property must not stop the run
+        except (
+            Exception
+        ) as error:  # noqa: BLE001 — one bad property must not stop the run
            line = f"property {property_id} (uprn {uprn}): ERROR — {type(error).__name__}: {error}"
            print(line + "\n")
            md_lines.append(f"## Property {property_id}\n\n`{line}`\n")