Jun-te Kim 2026-06-11 07:07:27 +00:00
parent 8ad560dc48
commit 362cd20f11
3 changed files with 203 additions and 9 deletions

@@ -0,0 +1,181 @@
# Grill spec — generalise Reduced-Field Synthesis to the rest of the pre-SAP10 RdSAP family
**Date:** 2026-06-10 · **Branch:** `feature/junte+khalim` · **Status:** SPEC — READY TO GRILL.
Grill this by running `/grill-me` and feeding it this file. Start at **Q1 (ROOT)**.
---
## Why this exists
The RdSAP **20.0.0** mapper now works end-to-end: all 1000 corpus certs parse, map via
**Reduced-Field Synthesis** (ADR-0027), and score through `Sap10Calculator` without crashing.
`scripts/eon/find_epc_data.py` shows lodged-vs-our-calculated SAP side by side and the deltas are
sane (mostly ±7, same band). The pattern is proven.
The goal now: **apply the same playbook to the other pre-SAP10 RdSAP specs** so historical EPC data
across more lodgement years can be Rebaselined. This is pure leverage — the hard design thinking
(synthesis coefficients, Validation-Cohort rule, schema-fix mechanism) is already done; what remains
is per-spec drift and a decision about how much to share vs copy.
## What we already hold (the repeatable 20.0.0 playbook)
Each step below is *proven* for 20.0.0. The grill is about which steps change per spec.
1. **Harvest a corpus** — `scripts/eon/harvest_certs.py` streams a local bulk dump
(`downloads/certificates-YYYY.json`) for the year that spec dominates, caps at 1000, writes
`backend/epc_api/json_samples/<schema>/corpus.jsonl`. No API token needed.
2. **Fix the placeholder schema** — every `rdsap_schema_*.py` was generated from ONE example so it
over-constrains. Make it `@dataclass(kw_only=True)` + data-driven required→optional (any field
present in <100% of the corpus gets a default; `[]` for lists, `None` otherwise) so that all certs parse.
3. **Synthesise the measured fields** the reduced schema only records categorically:
windows (`glazed_area` band × floor area, 4-way N/E/S/W split), lighting LEL, hot-water bath/mixer
counts, ventilation/chimneys/sheltered-sides, glazing cascade.
4. **Leave calculator defaults to the calculator** — `cert_to_inputs` is the RdSAP Table-5 expansion
engine; the mapper supplies raw reduced data only.
5. **Wire dispatch + flip a strict guard** — add the `schema_type` branch to
`from_api_response`, promote the corpus into the strict parse+map bucket in
`infrastructure/epc_client/tests/test_mapper_corpus.py`.
6. **Record every synthesis assumption in code comments + test names** (Validation-Cohort rule: no
same-spec ground truth).
## Ground truth about the targets (verified 2026-06-10)
| Spec | Schema module | Mapper method | Dispatched? | Corpus? | Notes |
|------|---------------|---------------|-------------|---------|-------|
| 21.0.1 | ✅ | `from_rdsap_schema_21_0_1` | ✅ | ✅ 1000 | reference (rich, measured) |
| 21.0.0 | ✅ | `from_rdsap_schema_21_0_0` | ✅ | ❌ | dispatched but unguarded |
| **20.0.0** | ✅ | `from_rdsap_schema_20_0_0` | ✅ | ✅ 1000 | **DONE — the template** |
| **19.0** | ✅ | `from_rdsap_schema_19_0` | ❌ | ❌ | orphaned; `sap_windows=[]` hardcoded |
| **18.0** | ✅ | `from_rdsap_schema_18_0` | ❌ | ❌ | orphaned |
| **17.1** | ✅ | `from_rdsap_schema_17_1` | ❌ | ❌ | orphaned |
| **17.0** | ✅ | `from_rdsap_schema_17_0` | ❌ | ❌ | orphaned |
- 19.0 confirmed same reduced-field shape as 20.0.0: `glazed_area: int` band + `multiple_glazing_type:
int`, and the mapper currently hardcodes `sap_windows=[]` — i.e. the exact windowless-corruption bug
that 20.0.0's synthesis fixed. 18.0/17.1/17.0 are almost certainly the same family.
- The 17–19 mapper methods **exist** but are unreachable: `from_api_response` only branches 21.0.1 /
21.0.0 / 20.0.0; everything else hits `raise ValueError(f"Unsupported EPC schema")`.
- **Corpora are harvestable.** `downloads/README.txt` schema-by-year:
`2020 → RdSAP-Schema-19.0 (1632)`, `2021–2024 → 20.0.0`, `2025–2026 → 21.0.1`. Older RdSAP specs (17.x/18.0)
live in the 2012–2019 dumps (all present locally). `SAP-Schema-1x` (full/design SAP) and `CEPC-*`
(commercial) are different families with no RdSAP mapper.
---
## Decision tree to grill (each has a recommended answer)
### Q1 (ROOT) — Target set and order. What are we generalising to, and in what order?
**Recommend:** the **pre-SAP10 RdSAP family only**, one spec at a time, **19.0 first** (dominant in the
2020 dump, closest sibling to 20.0.0, mapper already stubbed), then 18.0 → 17.1 → 17.0 as their dumps
confirm volume. **Exclude** `SAP-Schema-1x` (full/design SAP — new-build, not reduced; a separate
mapper family and ADR) and `CEPC-*` (non-domestic). **Carve out** 21.0.0 as a quick win: it's already
dispatched, it just needs a harvested corpus to join the strict guard.
*Sub-question:* do we batch all four 17–19 in one branch sweep, or land 19.0 fully (corpus → schema →
synthesis → dispatch → guard) before starting 18.0? Recommend: **land 19.0 end-to-end first** — it
either confirms the playbook transfers cleanly (then 18.0/17.x are fast) or surfaces drift early.
### Q2 — Coefficient reuse vs re-fit (the load-bearing, ADR-worthy one).
20.0.0's glazing synthesis uses `0.148 × TFA × band_multiplier`, fit from the **21.0.1** corpus's
glazing/floor ratio. For 19.0/18.0/17.x: reuse the same coefficients, or re-fit per spec?
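As a rough illustration of the formula above, the sketch below computes a total glazed area as `0.148 × TFA × band_multiplier` and splits it across four orientations. Only the `0.148` ratio and the 4-way N/E/S/W split come from this spec; the band codes, the multiplier values in `BAND_MULTIPLIER`, the equal split, the fallback for unknown bands, and the names `SynthWindow` / `synthesise_windows` are placeholders assumed for the example, not the values recorded in ADR-0027.

```python
from dataclasses import dataclass

BASE_GLAZING_RATIO = 0.148  # glazing/floor ratio quoted in this spec (fit from 21.0.1)

# HYPOTHETICAL band codes and multipliers, for illustration only.
BAND_MULTIPLIER = {1: 1.0, 2: 1.25, 3: 0.75}  # e.g. Normal / More / Less

ORIENTATIONS = ("N", "E", "S", "W")


@dataclass(frozen=True)
class SynthWindow:
    orientation: str
    area_m2: float


def synthesise_windows(glazed_area_band: int, tfa_m2: float) -> list[SynthWindow]:
    """Total glazing = ratio x TFA x band multiplier, split equally (assumed)."""
    if tfa_m2 <= 0:
        raise ValueError("total floor area must be positive")
    # Assumption: an unrecognised band is treated as Normal (multiplier 1.0).
    multiplier = BAND_MULTIPLIER.get(glazed_area_band, 1.0)
    total = BASE_GLAZING_RATIO * tfa_m2 * multiplier
    per_side = total / len(ORIENTATIONS)
    return [SynthWindow(orientation, per_side) for orientation in ORIENTATIONS]


if __name__ == "__main__":
    for window in synthesise_windows(glazed_area_band=1, tfa_m2=100.0):
        print(window.orientation, round(window.area_m2, 2))
```

Keeping the ratio and the multipliers as separate named constants makes it easy to swap either one independently once a spec's own evidence is in.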
**DIRECTION (user, 2026-06-10): re-work the coefficients from each new corpus's own data — do not
inherit the 21.0.1 fit by default.** Treat `0.148` + the band multipliers as a *starting hypothesis* to
confirm or replace against what the new corpus actually shows, per spec. The empirical numbers lead; we
only keep the 20.0.0 values if the new corpus reproduces them.
**The constraint this hits (must resolve while grilling):** a reduced schema does **not** measure
per-window area (that's the whole reason synthesis exists), so a 19.0/18.0/17.x corpus *cannot
self-fit the glazing/floor ratio* — there's no measured glazing column in it to regress on. So
"work it out from the new corpus" splits into two parts:
- **What the reduced corpus *can* give us directly** → re-derive per spec: the `glazed_area` band
*distribution* (how many Normal/More/Less), `total_floor_area` distribution, and whether the band
codes/semantics match 20.0.0. This validates (or breaks) the band-multiplier assumption empirically.
- **The base ratio itself (the `0.148`)** → needs a *measured* reference from the same stock/era.
Options to grill: (a) use the contemporaneous measured corpus if one exists for that year (e.g. a
rich-window spec lodged alongside), (b) fit from the handful of rich certs the reduced corpus *does*
carry (20.0.0 had 7/1000 with lodged `sap_windows` — check the count per spec), or (c) fall back to
the 21.0.1 fit *only* if (a)/(b) yield too little signal, and say so explicitly.
This moves every rebaselined score for the spec, so the per-spec fit + its evidence wants an ADR
(extends ADR-0027). Record the derivation (corpus, sample size, quartiles) the same way 20.0.0 did.
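The part a reduced corpus can answer directly (band distribution and floor-area quartiles) could be profiled along these lines. This is a minimal sketch that assumes `glazed_area` and `total_floor_area` are top-level keys on each cert dict; if they are nested in the real documents the lookups need adjusting. The function name `profile_reduced_corpus` is hypothetical.

```python
import statistics
from collections import Counter
from typing import Any, Iterable


def profile_reduced_corpus(certs: Iterable[dict[str, Any]]) -> dict[str, Any]:
    """Summarise glazed_area band counts and total_floor_area quartiles."""
    bands: Counter = Counter()
    areas: list[float] = []
    for cert in certs:
        # Assumption: both fields sit at the top level of the cert document.
        bands[cert.get("glazed_area")] += 1
        tfa = cert.get("total_floor_area")
        if isinstance(tfa, (int, float)) and not isinstance(tfa, bool):
            areas.append(float(tfa))
    # statistics.quantiles needs at least two data points.
    quartiles = statistics.quantiles(areas, n=4) if len(areas) >= 2 else []
    return {
        "n": sum(bands.values()),
        "bands": dict(bands),
        "tfa_quartiles": quartiles,
    }


if __name__ == "__main__":
    demo = [
        {"glazed_area": 1, "total_floor_area": 50},
        {"glazed_area": 1, "total_floor_area": 70},
        {"glazed_area": 2, "total_floor_area": 90},
        {"glazed_area": 3, "total_floor_area": 110},
    ]
    print(profile_reduced_corpus(demo))
```

The returned sample size and quartiles are the same kind of derivation evidence the paragraph above asks to record alongside each per-spec fit.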
### Q3 — Code-space drift across versions.
Do 17–19 use the same integer code spaces as 20.0.0 (glazing_type, built_form, orientation, fuel,
heat-emitter, party-wall, roof/floor construction)? 20.0.0's codes turned out **identical** to 21.0.1's,
so we routed through the existing cascades verbatim. **Recommend:** assume identical within the RdSAP
family; cross-check each version against `epc_codes.csv` during grilling and add a per-spec cascade
override *only* where the corpus proves a code diverged. Don't pre-build translation layers.
### Q4 — Schema-fix mechanism. Same `kw_only` + data-driven required→optional?
**Recommend: yes, unchanged.** Each placeholder schema over-constrains identically (single-example
generation). Run the one-pass corpus scan to enumerate all missing-required fields at once (not
whack-a-mole), then default them. Mechanical, low-risk, proven.
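The one-pass scan might look roughly like the sketch below: count how often each key appears across the corpus and report every key present in fewer than 100% of certs, together with the default the playbook prescribes (`[]` for lists, `None` otherwise). It only inspects top-level keys and treats an explicit `null` as absent; both are simplifying assumptions, and nested schema classes would need the same scan applied recursively. The name `fields_needing_defaults` is hypothetical.

```python
from collections import Counter
from typing import Any, Iterable


def fields_needing_defaults(certs: Iterable[dict[str, Any]]) -> dict[str, str]:
    """Map each not-always-present field to the default it should receive."""
    total = 0
    present: Counter = Counter()
    seen_as_list: set[str] = set()
    for cert in certs:
        total += 1
        for key, value in cert.items():
            if value is None:
                continue  # assumption: explicit null counts as missing
            present[key] += 1
            if isinstance(value, list):
                seen_as_list.add(key)
    return {
        key: "field(default_factory=list)" if key in seen_as_list else "None"
        for key, count in sorted(present.items())
        if count < total
    }


if __name__ == "__main__":
    corpus = [
        {"uprn": 1, "sap_windows": [{"area": 1.2}], "roof_type": "pitched"},
        {"uprn": 2, "roof_type": None},
        {"uprn": 3},
    ]
    for name, default in fields_needing_defaults(corpus).items():
        print(f"{name}: default {default}")
```

Running it once over the whole corpus yields the full list of fields to relax, rather than discovering them one parse failure at a time.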
### Q5 — Shared synthesis helper vs per-mapper copy (the architecture fork).
20.0.0's synthesis lives in `_synthesise_20_0_0_sap_windows` + inline mapper blocks. With 19.0 we'll
have a **second instance** — the classic extract trigger. **Recommend:** once 19.0 is green, extract a
single spec-parameterised `_synthesise_reduced_field_windows(glazed_area, tfa, glazing_type)` (and
shared lighting/hot-water/ventilation helpers) so 18.0/17.x are near-free and the coefficients live in
exactly one place. Defer the extraction until 19.0 confirms the shape (avoid abstracting from one
example). This is the `/improve-codebase-architecture` hook — a deep module behind a small interface.
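One possible shape for the shared helper is a small per-spec coefficient registry behind a single function, so a spec without its own recorded fit fails loudly instead of silently inheriting another spec's numbers. Everything named here (`GlazingCoefficients`, `COEFFICIENTS`, `coefficients_for`, `synthesised_glazing_area`, the `RdSAP-Schema-20.0.0` key, and the multiplier values) is hypothetical; only the `0.148` ratio and the ADR-0027 reference come from this spec.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class GlazingCoefficients:
    base_ratio: float                     # glazing area / total floor area
    band_multiplier: Mapping[int, float]  # keyed by glazed_area band code
    evidence: str                         # where the fit is documented


# Only specs with a recorded fit appear here. Multipliers are PLACEHOLDERS.
COEFFICIENTS: dict[str, GlazingCoefficients] = {
    "RdSAP-Schema-20.0.0": GlazingCoefficients(
        base_ratio=0.148,
        band_multiplier={1: 1.0, 2: 1.25, 3: 0.75},
        evidence="ADR-0027",
    ),
}


def coefficients_for(schema_type: str) -> GlazingCoefficients:
    """Look up a spec's fit; raise rather than borrow another spec's values."""
    try:
        return COEFFICIENTS[schema_type]
    except KeyError:
        raise KeyError(f"no glazing fit recorded for {schema_type}") from None


def synthesised_glazing_area(schema_type: str, glazed_area: int, tfa: float) -> float:
    """Total synthesised glazing area in square metres for one cert."""
    coeffs = coefficients_for(schema_type)
    if glazed_area not in coeffs.band_multiplier:
        raise ValueError(f"unknown glazed_area band {glazed_area!r}")
    return coeffs.base_ratio * tfa * coeffs.band_multiplier[glazed_area]


if __name__ == "__main__":
    print(round(synthesised_glazing_area("RdSAP-Schema-20.0.0", 1, 100.0), 2))
```

The `evidence` field keeps the pointer to the derivation next to the numbers it justifies, which fits the Q2 requirement that each per-spec fit carries its own record.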
### Q6 — Per-spec field availability.
Do 17–19 actually lodge the synthesis *inputs* 20.0.0 relies on — `instantaneous_wwhrs` (bath room
counts), `low_energy_fixed_lighting_outlets_count`, `percent_draughtproofed`, `open_fireplaces_count`,
`multiple_glazing_type`? Older specs may omit or rename some. **Recommend:** profile each corpus up
front (one-pass field-presence scan); where a 20.0.0 input is absent, degrade gracefully to the
calculator's own default rather than fabricating — and record the gap in a test name.
### Q7 — Dispatch wiring + acceptance bar.
**Recommend:** per spec, add the `schema_type` branch to `from_api_response` (wrapped in
`_clear_basement_flag_when_system_built` like the others) and promote its corpus into the strict
parse+map bucket in `test_mapper_corpus.py`. Smoke-check with `scripts/eon/find_epc_data.py` (extend
the UPRN list with that spec's certs) — our re-score should track the lodged figure within a sane band.
The formal SAP-score *value* test stays deferred (same as 20.0.0) until we choose to land it.
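For orientation, the dispatch step can be pictured as a lookup from `schema_type` to a mapper, with unknown schemas rejected. The sketch below is a table-driven stand-in, not the repository's `from_api_response`: the mapper functions are stubs, the schema-type strings other than `RdSAP-Schema-19.0` are assumed spellings, and the `_clear_basement_flag_when_system_built` wrapper is deliberately left out.

```python
from typing import Any, Callable

Cert = dict[str, Any]
Mapper = Callable[[Cert], Cert]


def _stub_mapper(label: str) -> Mapper:
    """Build a placeholder mapper that only tags which branch handled the cert."""
    def mapper(doc: Cert) -> Cert:
        return {"mapped_by": label, "source": doc}
    return mapper


# Hypothetical registry: adding a spec is one new row.
DISPATCH: dict[str, Mapper] = {
    "RdSAP-Schema-21.0.1": _stub_mapper("21.0.1"),
    "RdSAP-Schema-21.0.0": _stub_mapper("21.0.0"),
    "RdSAP-Schema-20.0.0": _stub_mapper("20.0.0"),
    "RdSAP-Schema-19.0": _stub_mapper("19.0"),  # the newly wired branch
}


def dispatch_cert(doc: Cert) -> Cert:
    """Route a cert to its mapper, rejecting schemas without a branch."""
    schema_type = doc.get("schema_type")
    mapper = DISPATCH.get(schema_type)
    if mapper is None:
        raise ValueError(f"Unsupported EPC schema: {schema_type!r}")
    return mapper(doc)


if __name__ == "__main__":
    print(dispatch_cert({"schema_type": "RdSAP-Schema-19.0"})["mapped_by"])
```

Including the offending `schema_type` in the error message makes it easier to see which specs are still unreachable when bucketing failures over a mixed dump.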
### Q8 — Validation-Cohort / is there ANY cross-check?
Same rule as 20.0.0: a pre-SAP10 cert has no same-spec lodged figure to validate against, so synthesis
assumptions go in code/test names. **But probe one opportunistic anchor:** a single UPRN re-lodged
across spec versions (e.g. a dwelling with both a 19.0 and a 20.0.0 cert, unchanged between) — our
re-score of both should roughly agree. **Recommend:** if dual-lodged UPRNs surface during harvest, keep
a handful as a cross-spec regression anchor; don't block on it.
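Spotting dual-lodged UPRNs during harvest could be as simple as the sketch below, which indexes every cert by UPRN and keeps those seen under more than one spec. It assumes each cert dict exposes a top-level `uprn` key, and the function name `dual_lodged_uprns` is hypothetical. It does not check that the dwelling was unchanged between lodgements; that still needs a manual look.

```python
from collections import defaultdict
from typing import Any, Mapping, Sequence


def dual_lodged_uprns(
    corpora: Mapping[str, Sequence[dict[str, Any]]],
) -> dict[Any, set[str]]:
    """Return {uprn: {schema, ...}} for UPRNs lodged under two or more specs."""
    seen: dict[Any, set[str]] = defaultdict(set)
    for schema, certs in corpora.items():
        for cert in certs:
            uprn = cert.get("uprn")  # assumption: top-level key
            if uprn is not None:
                seen[uprn].add(schema)
    return {uprn: schemas for uprn, schemas in seen.items() if len(schemas) >= 2}


if __name__ == "__main__":
    corpora = {
        "RdSAP-Schema-19.0": [{"uprn": 100}, {"uprn": 200}],
        "RdSAP-Schema-20.0.0": [{"uprn": 200}, {"uprn": 300}, {"uprn": None}],
    }
    print(dual_lodged_uprns(corpora))
```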
---
## How to reproduce / kick off (19.0 first)
```bash
# 1. Confirm 19.0 volume + reduced-field shape in the 2020 dump
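python - <<'EOF'
import json
from collections import Counter
from itertools import islice
# Stream the first N lines of certificates-2020.json, bucket by schema_type,
# and dump one RdSAP-Schema-19.0 document to inspect glazed_area / sap_windows.
# Assumes one JSON document per line with a top-level "schema_type" key;
# adjust if the dump is laid out differently. N is an arbitrary sample size.
N = 5000
counts, sample = Counter(), None
with open("downloads/certificates-2020.json") as dump:
    for line in islice(dump, N):
        try:
            doc = json.loads(line)
        except json.JSONDecodeError:
            continue  # blank or non-JSON line
        if not isinstance(doc, dict):
            continue
        schema = doc.get("schema_type")
        counts[schema] += 1
        if sample is None and schema == "RdSAP-Schema-19.0":
            sample = doc
for schema, n in counts.most_common():
    print(n, schema)
print(json.dumps(sample, indent=2)[:4000])
EOF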
# 2. Add a harvest source row and run it
# scripts/eon/harvest_certs.py SOURCES += ("certificates-2020.json","RdSAP-Schema-19.0",1000)
# 3. Drive the (orphaned) 19.0 mapper against the new corpus to bucket parse failures
python - <<'EOF'
import json, collections
from pathlib import Path
from datatypes.epc.schema.rdsap_schema_19_0 import RdSapSchema19_0
from datatypes.epc.schema.helpers import from_dict
certs = [json.loads(l) for l in Path("backend/epc_api/json_samples/RdSAP-Schema-19.0/corpus.jsonl").read_text().splitlines() if l.strip()]
b = collections.Counter()
for c in certs:
    try:
        from_dict(RdSapSchema19_0, c)
    except Exception as e:
        # bucket failures by exception type + truncated message
        b[f"{type(e).__name__}: {str(e)[:70]}"] += 1
for k, v in b.most_common():
    print(v, k)
EOF
```
## References
- **ADR-0027** (`docs/adr/0027-rdsap-20-0-0-reduced-field-synthesis.md`) — the synthesis decision,
coefficients, rejected alternatives. Extend (not replace) for the family-wide coefficient choice (Q2).
- **ADR-0015** (mappers own cert normalization), **ADR-0004** (lodged-vs-effective pair).
- **CONTEXT.md** — _Reduced-Field Synthesis_, _Rebaselining_, _Lodged / Effective Performance_,
_Validation Cohort_, _pre-SAP10_.
- **20.0.0 resume doc** — `docs/grill-sessions/2026-06-09-rdsap-20-0-0-remapper.md` (the worked example).

next_claude_prompt.txt Normal file
@@ -0,0 +1 @@
/grill-me docs/grill-sessions/2026-06-10-pre-sap10-mapper-generalization.md

@@ -124,16 +124,16 @@ def _s3_parquet_reader(bucket: str) -> ParquetReader:
     return read
-def _spatial_for(
-    repo: GeospatialS3Repository, uprn: int
-) -> Optional[SpatialReference]:
+def _spatial_for(repo: GeospatialS3Repository, uprn: int) -> Optional[SpatialReference]:
     """The UPRN's spatial reference (coordinates + planning protections), or
     None when S3 doesn't cover it — a missing reference must not abort the run,
     so a lookup error degrades to None (unrestricted, no solar)."""
     try:
         return repo.spatial_for(uprn)
     except Exception as error: # noqa: BLE001 — S3/parquet hiccup is non-fatal
-        print(f" spatial lookup failed for uprn {uprn}: {type(error).__name__}: {error}")
+        print(
+            f" spatial lookup failed for uprn {uprn}: {type(error).__name__}: {error}"
+        )
         return None
@@ -186,7 +186,9 @@ def _parse_measures(raw: Optional[str]) -> Optional[frozenset[MeasureType]]:
     (consider every modelled measure) when unset. Raises on an unknown type."""
     if raw is None:
         return None
-    return frozenset(MeasureType(token.strip()) for token in raw.split(",") if token.strip())
+    return frozenset(
+        MeasureType(token.strip()) for token in raw.split(",") if token.strip()
+    )
 def _context_summary(
@@ -252,8 +254,12 @@ def _persist(
 def main() -> None:
     parser = argparse.ArgumentParser(description=__doc__)
-    parser.add_argument("property_ids", type=int, nargs="+", help="Property ids to model")
-    parser.add_argument("--goal", default="C", help="target band when no --scenario-id (default C)")
+    parser.add_argument(
+        "property_ids", type=int, nargs="+", help="Property ids to model"
+    )
+    parser.add_argument(
+        "--goal", default="C", help="target band when no --scenario-id (default C)"
+    )
     parser.add_argument(
         "--scenario-id", type=int, default=None, help="model against this DB Scenario"
     )
@@ -263,12 +269,16 @@ def main() -> None:
         help="comma-separated measure types to consider (default: all)",
     )
     parser.add_argument(
-        "--portfolio-id", type=int, default=None, help="portfolio id (required for --persist)"
+        "--portfolio-id",
+        type=int,
+        default=None,
+        help="portfolio id (required for --persist)",
     )
     parser.add_argument(
         "--persist",
         action="store_true",
         help="WRITE the inputs + Plan to the DB (default: inspect only, no writes)",
+        default=False,
     )
     parser.add_argument(
         "--no-solar",
@@ -355,7 +365,9 @@ def main() -> None:
                 solar_insights=solar_insights,
                 plan=plan,
             )
-        except Exception as error: # noqa: BLE001 — one bad property must not stop the run
+        except (
+            Exception
+        ) as error: # noqa: BLE001 — one bad property must not stop the run
             line = f"property {property_id} (uprn {uprn}): ERROR — {type(error).__name__}: {error}"
             print(line + "\n")
             md_lines.append(f"## Property {property_id}\n\n`{line}`\n")