Adds the (API JSON + Summary PDF) fixtures for cert
0330-2249-8150-2326-4121 — the boiler pilot identified in the
handover. Property: 17 Summerfield Road, MANCHESTER M22 1AE
(mid-terrace house, mains gas boiler PCDB idx 10241, age D).
Source: API JSON fetched via EpcClientService from
https://api.get-energy-performance-data.communities.gov.uk
(OPEN_EPC_API_TOKEN). Summary PDF copied from
`sap worksheets/Additional data with api/0330-2249-8150-2326-4121/Summary_000897.pdf`
(where the user provided the triple).
Worksheet target: SAP 61.5993 (continuous), from
`dr87-0001-000897.pdf` in the same source directory.
Current state on these fixtures (uncommitted before this slice):
- Summary mapper cascade SAP: 62.0660 (Δ +0.4667 vs worksheet)
- API mapper cascade SAP: 63.7446 (Δ +2.1453 vs worksheet)
Both paths are RED at the 1e-4 tolerance. The handover identifies two
specific cascade-component gaps for follow-up slices:
1. Windows HLC +6.71 W/K (API vs Summary) — likely glazing_type=14
not in Slice 93's `_API_GLAZING_TYPE_TO_TRANSMISSION` (only
codes 3 and 13 mapped).
2. HW kWh +1060 (API 3172.65 vs Summary 2112.00) — §4 subsystem
gap; needs occupancy/shower/cylinder probe.
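A minimal sketch of how gap 1 could be probed, assuming each window
record exposes an integer `glazing_type`. `MAPPED_GLAZING_CODES` stands
in for the keys of `_API_GLAZING_TYPE_TO_TRANSMISSION` (codes 3 and 13,
per the handover); the helper name and the record shape are hypothetical,
and no transmission values are assumed.

```python
# Stand-in for the keys of _API_GLAZING_TYPE_TO_TRANSMISSION.
# Only the codes are taken from the handover; values are deliberately omitted.
MAPPED_GLAZING_CODES = frozenset({3, 13})


def unmapped_glazing_codes(window_records):
    """Return the sorted glazing_type codes that have no mapping entry."""
    seen = {w["glazing_type"] for w in window_records if "glazing_type" in w}
    return sorted(seen - MAPPED_GLAZING_CODES)


# Hypothetical window records, shaped like a fragment of the API JSON.
windows = [{"glazing_type": 14}, {"glazing_type": 3}, {"glazing_type": 14}]
print(unmapped_glazing_codes(windows))
```

If the probe reports 14 for this cert, that would support the
glazing_type=14 hypothesis before any mapper change is made.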
This commit stages the fixtures only — no tests added yet. The
follow-up slice should add a RED Layer 2 test (Summary path 1e-4
vs 61.5993) and proceed slice-by-slice.
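The deltas and the RED status quoted above can be reproduced with a
short check. This is only a sketch of the arithmetic: the function names
are hypothetical, and the real Layer 2 test would take its SAP values
from the mapper cascade rather than from literals.

```python
import math

WORKSHEET_SAP = 61.5993  # continuous SAP from the worksheet PDF
TOLERANCE = 1e-4         # the gate quoted in this message

# Cascade SAP values as reported in this message.
cascade_sap = {"summary": 62.0660, "api": 63.7446}


def delta_vs_worksheet(sap):
    """Signed difference from the worksheet target, rounded to 4 dp."""
    return round(sap - WORKSHEET_SAP, 4)


def is_green(sap):
    """True when the value sits within the absolute 1e-4 tolerance."""
    return math.isclose(sap, WORKSHEET_SAP, rel_tol=0.0, abs_tol=TOLERANCE)


for path, sap in cascade_sap.items():
    status = "GREEN" if is_green(sap) else "RED"
    print(f"{path}: {sap:.4f} (delta {delta_vs_worksheet(sap):+.4f}) {status}")
```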
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Rewrites the cert 001479 closure handover into a forward-looking
brief for the new workstream: validating the API EpcPropertyDataMapper
against 9 newly-staged (Summary + worksheet + API) cert triples.
Key contents:
- User's stated workflow (verbatim): Summary path proves itself
against the worksheet → becomes canonical reference for API parity.
- Folder-structure changes since the prior handover were written
(packages/domain/ removed; sap10_calculator + sap10_ml now at the
repo root under a PEP 420 namespace; docs/sap-spec/ moved into
domain/sap10_calculator/docs/; PCDB data into tables/pcdb/data/).
- New test data layout:
`sap worksheets/Additional data with api/<cert-ref>/{Summary_NNNNNN.pdf, dr87-0001-NNNNNN.pdf}`.
- Cert reference table with heating type, PCDB index, worksheet SAP,
TFA, bp count, dwelling type for all 9 triples.
- Major scope discovery: 7 of 9 are Air Source Heat Pumps (PCDB
104568 / 102421). The mapper has never been validated against HPs;
cert 0380 pilot showed catastrophic deltas (Summary -70 / API -18
SAP vs worksheet). Recommended deferring HP certs until boiler
workflow is proven.
- Cert 0330 (mid-terrace gas boiler) pilot status: fixtures staged
uncommitted; Summary path +0.47 SAP, API path +2.15 SAP vs
worksheet 61.5993. Cascade-component diff localises 2 specific
gaps (windows HLC +6.71 W/K likely from glazing_type=14 missing
from Slice 93's transmission map; HW kWh +1060 needs §4
subsystem probe).
- Tooling shortcut: use OPEN_EPC_API_TOKEN (not EPC_AUTH_TOKEN) in
backend/.env with EpcClientService._fetch_certificate(cert_ref)
to fetch raw JSON.
- First actions for next agent: confirm baseline, commit cert 0330
fixtures, add RED Layer 2 test, iterate.
Lesson preserved: cohort hand-builts encode non-spec quirks
(e.g. has_suspended_timber_floor=False to override §(12) spec
inference and match the non-spec worksheet). Cross-check against
spec-inferred mapper output before trusting hand-built fields.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Locality of reference — SAP-specific docs, specs, and runtime data
now live alongside the calculator that consumes them, mirroring the
prior packages→domain layout moves.
Move targets:
- Narrative MDs → domain/sap10_calculator/docs/
    NEXT_AGENT_PROMPT.md, HANDOVER_NEXT.md, SAP_CALCULATOR.md
- Spec PDFs → domain/sap10_calculator/docs/specs/
    RdSAP 10 Specification 10-06-2025.pdf
    PCDF_Spec_Rev-06b_12_May_2021.pdf
    sap-10-2-full-specification-2025-03-14.pdf
    sap-10-3-full-specification-2026-01-13.pdf
- PCDB runtime data → domain/sap10_calculator/tables/pcdb/data/
    pcdb10.dat (8.3MB) + 7× pcdb_table_*.jsonl (18MB total)
Path code rewrites (load-bearing):
- tables/pcdb/__init__.py: replaced parents[4]/'docs'/'sap-spec' with
Path(__file__).resolve().parent/'data' for Table 105 JSONL loading.
- tables/pcdb/postcode_weather.py: same rebase for the pcdb10.dat path
read by _postcode_climate_table().
- tables/pcdb/etl.py __main__: same rebase for the manual ETL invocation
(source + output_dir both now point inside the package).
- tests/test_pcdb_etl.py: _PCDB_DAT_PATH now derives from
parents[1]/'tables'/'pcdb'/'data' (was parents[3]/'docs'/'sap-spec').
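The practical difference between the two path styles can be sketched
with pure paths. `/repo` and the `vendor/` relocation are hypothetical;
the real code uses `Path(__file__).resolve()`, whereas `PurePosixPath`
is used here only so the example does not touch the filesystem.

```python
from pathlib import PurePosixPath

# Hypothetical checkout root; only the relative layout comes from this commit.
REPO_ROOT = PurePosixPath("/repo")
MODULE = REPO_ROOT / "domain/sap10_calculator/tables/pcdb/__init__.py"


def old_data_dir(module_file):
    """Pre-move style: climb a fixed number of levels to the repo root."""
    return module_file.parents[4] / "docs" / "sap-spec"


def new_data_dir(module_file):
    """Post-move style: the data directory sits next to the module."""
    return module_file.parent / "data"


print(old_data_dir(MODULE))
print(new_data_dir(MODULE))

# If the package were ever nested one level deeper, the fixed index would
# silently point at the wrong directory, while the sibling path still holds.
MOVED = REPO_ROOT / "vendor/domain/sap10_calculator/tables/pcdb/__init__.py"
print(old_data_dir(MOVED))
print(new_data_dir(MOVED))
```

The sibling-relative form does not depend on how deep the package sits,
so any future move should not require another `parents[N]` rebase for
these loaders.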
Citation rewrites:
- 12 .py docstrings and 4 .md docs (ADRs + READMEs + narrative docs)
had `docs/sap-spec/<file>` strings rewritten to their new locations.
- Two cases where the catch-all sed misfired (an ADR-0009 line about a
PCDB extract; the pcdb __init__.py docstring about ETL output) were
hand-corrected to point at tables/pcdb/data/ rather than docs/specs/.
docs/sap-spec/ is now empty (will be removed in a follow-up sweep or
left as a vestigial empty dir for future repurposing). ADRs 0009 and
0010 remain at docs/adr/ — they're part of the chronological
cross-cutting decision log, not calculator-specific narrative.
Verified:
- Calculator's 1e-4 production gate
(test_api_001479_full_chain_sap_matches_worksheet_pdf_exactly) GREEN.
- Wider sweep (domain/sap10_calculator/ + domain/sap10_ml/): 1654
passed / 20 failed — exact pre-move baseline. All 20 failures
pre-existing (10 hand-built skeleton + 4 cohort chain + 6 cohort
diff).
- Pyright net-zero on the 4 touched runtime/test files (0 errors)
and unchanged on heat_transmission.py (13) / cert_to_inputs.py (35) /
mapper.py (33).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sibling migration to the sap10_calculator move: `domain.ml` now lives
in the root-level layout (`domain/sap10_ml/`), matching the pattern
already used by `domain.addresses`, `domain.tasks`, `domain.postcode`,
and `domain.sap10_calculator`.
Changes:
- `git mv packages/domain/src/domain/ml → domain/sap10_ml` (19 files;
history preserved).
- Subpackage rename: `domain.ml` → `domain.sap10_ml`. 32 references
rewritten across .py and .md files: 11 internal + 21 external
(datatypes/epc/domain/mapper.py, 14 files in domain/sap10_calculator,
2 backend tests, 2 ADRs, 1 README, 1 design doc).
- Path-string updates: `pytest.ini` testpaths entry
`packages/domain/src/domain/ml/tests` → `domain/sap10_ml/tests` so
ML tests stay in the default auto-discovered sweep. `CONTEXT.md`
also updated.
`packages/domain/src/domain/` is now empty — the workspace `domain/`
tree has been fully migrated. Together with the `domain/__init__.py`
deletions from the sap10_calculator commit (29ac35cc), `domain` is
now a single root-level namespace package with subpackages
{addresses, sap10_calculator, sap10_ml, tasks} + the standalone
`postcode.py` module.
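For readers unfamiliar with PEP 420, the resulting shape can be
reproduced in a throwaway directory. This is a sketch only: `demo_domain`
and its contents are hypothetical stand-ins for `domain/`, one subpackage,
and the standalone `postcode.py` module.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    pkg = root / "demo_domain"  # deliberately has no __init__.py
    (pkg / "tasks").mkdir(parents=True)
    (pkg / "tasks" / "__init__.py").write_text("KIND = 'subpackage'\n")
    (pkg / "postcode.py").write_text("KIND = 'module'\n")

    sys.path.insert(0, str(root))
    try:
        importlib.invalidate_caches()
        demo_domain = importlib.import_module("demo_domain")
        tasks = importlib.import_module("demo_domain.tasks")
        postcode = importlib.import_module("demo_domain.postcode")
        # A namespace package lists every directory that contributes to it.
        portions = list(demo_domain.__path__)
    finally:
        sys.path.remove(str(root))

print(len(portions))                           # one contributing directory
print(getattr(demo_domain, "__file__", None))  # no __init__.py behind it
print(tasks.KIND, postcode.KIND)
```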
Verified:
- Focused sweep (backend mapper-chain + sap10_calculator worksheet
e2e + golden fixtures): 99 passed / 19 failed — identical baseline.
- Wider sweep (all sap10_calculator + sap10_ml): 1654 passed / 20
failed (same pre-existing failures).
- domain/sap10_ml/tests: 210/210 PASSED at new path.
- Pyright net-zero: heat_transmission.py 13, cert_to_inputs.py 35,
mapper.py 33, rdsap_uvalues.py 1 (all unchanged from baseline).
Note: `packages/domain/pyproject.toml` still declares
`packages = ["src/domain"]` for the hatchling wheel — that target
directory is now empty and the wheel build is effectively a no-op.
Retiring the workspace package or repointing the wheel is a follow-up.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Migration of the SAP 10.2 calculator package from the uv-workspace
src-layout (`packages/domain/src/domain/sap`) to the root-level layout
(`domain/sap10_calculator`), matching the pattern already used by
`domain.addresses` / `domain.tasks` / `domain.postcode`.
Changes:
- `git mv packages/domain/src/domain/sap → domain/sap10_calculator`
(92 files; git auto-detected all as renames so blame/history is
preserved).
- Subpackage rename: `domain.sap` → `domain.sap10_calculator`. 48
Python files rewritten (`from domain.sap.X` →
`from domain.sap10_calculator.X`); zero remaining `domain.sap` refs
after the sed pass.
- Path-string updates: 3 .py files (test fixtures + xlsx loader) +
6 markdown docs (CONTEXT.md, 2 ADRs, 3 sap-spec docs,
sap10_calculator/README.md) had hard-coded
`packages/domain/src/domain/sap/...` paths rewritten to
`domain/sap10_calculator/...`.
- `Path(__file__).parents[N]` rebasing: the old tree was 3 levels
deeper than the new one (`packages/domain/src/`), so 4× `parents[7]`
became `parents[4]` and 1× `parents[6]` became `parents[3]` across
`tables/pcdb/{__init__.py, postcode_weather.py, etl.py}`,
`worksheet/tests/_xlsx_loader.py`, and `tests/test_pcdb_etl.py`.
- PEP 420 namespace package: deleted both `domain/__init__.py` files
(root and workspace; each was empty or held only a docstring) so
Python combines `domain.sap10_calculator` (root) and `domain.ml`
(workspace) into one namespace package. Confirmed via
`domain.__path__ == ['/workspaces/model/domain',
'/workspaces/model/packages/domain/src/domain']`. Without this,
the root `domain/__init__.py` shadowed the workspace one and
`domain.ml` was unreachable.
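The `parents[N]` rebasing described in the list above can be checked
mechanically. In this sketch the repo root is taken from the
`domain.__path__` output quoted in this message; the helper name is
hypothetical, and the two file paths follow from the old and new
package locations.

```python
from pathlib import PurePosixPath

ROOT = PurePosixPath("/workspaces/model")
OLD_PKG = ROOT / "packages/domain/src/domain/sap"
NEW_PKG = ROOT / "domain/sap10_calculator"


def parents_index_of_root(file_path):
    """Index N such that file_path.parents[N] is the repo root."""
    return len(file_path.relative_to(ROOT).parts) - 1


for rel in ("tables/pcdb/__init__.py", "tests/test_pcdb_etl.py"):
    old_n = parents_index_of_root(OLD_PKG / rel)
    new_n = parents_index_of_root(NEW_PKG / rel)
    # Both indices must resolve to the same root directory.
    assert (OLD_PKG / rel).parents[old_n] == ROOT
    assert (NEW_PKG / rel).parents[new_n] == ROOT
    print(f"{rel}: parents[{old_n}] -> parents[{new_n}]")
```

Each index drops by three, matching the three directory levels
(`packages/domain/src/`) removed by the move.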
Verified:
- Focused sweep
(`backend/documents_parser/tests/test_summary_pdf_mapper_chain.py` +
`domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py` +
`domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py`):
99 passed / 19 failed, the exact same counts as pre-refactor. All 19
failures pre-existing (9 hand-built 001479 + 6 cohort diff + 4 cohort
chain non-spec).
- Wider sweep (all sap10_calculator + domain.ml): 1654 passed /
20 failed (the +1 vs the focused sweep is the pre-existing
`test_roof_insulated_assumed_with_ni_thickness_uses_50mm_per_section_5_11_4`,
which was already failing on the previous baseline).
- Pyright net-zero on the three load-bearing baselines:
`heat_transmission.py` 13, `cert_to_inputs.py` 35, `mapper.py` 33.
Lift-and-shift only — no semantic renames (`Sap10Calculator` stays
`Sap10Calculator`), no testpaths edits in pytest.ini (sap tests
continue to be invoked by explicit pytest paths).
Note: `domain.ml` still lives at `packages/domain/src/domain/ml/`.
Migrating it would close out the dual-`domain/` layout but is
out of scope for this commit.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Slice 1/6 of the postcode_splitter refactor (Hestia-Homes/Model#1100).
Introduces the pure-domain foundation under domain/, with no AWS, Postgres,
or pandas dependencies. UserAddress is a frozen dataclass that sanitises its postcode in
__post_init__ via the canonical sanitise_postcode helper, and
iter_postcode_grouped_batches preserves the legacy splitter's batching
invariants (group-by-postcode in insertion order, never split a group,
oversize single-postcode groups dispatched whole, final flush). Updates
UBIQUITOUS_LANGUAGE.md so the User Address term covers both the dataclass
sense (preferred in domain code) and the raw upstream-string sense.
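A minimal sketch of the two pieces described above, not the committed
implementation. The field names, the `max_batch_size` parameter, and the
body of `sanitise_postcode` are assumptions made for illustration; only
the frozen-dataclass pattern and the four batching invariants come from
this message.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, List


def sanitise_postcode(raw: str) -> str:
    """Hypothetical stand-in for the canonical helper."""
    return "".join(raw.split()).upper()


@dataclass(frozen=True)
class UserAddress:
    address: str   # assumed field name
    postcode: str  # assumed field name

    def __post_init__(self) -> None:
        # Frozen dataclasses block normal assignment, so bypass it once here.
        object.__setattr__(self, "postcode", sanitise_postcode(self.postcode))


def iter_postcode_grouped_batches(
    addresses: Iterable[UserAddress], max_batch_size: int
) -> Iterator[List[UserAddress]]:
    # Group by postcode; dicts keep first-seen (insertion) order.
    groups: Dict[str, List[UserAddress]] = {}
    for item in addresses:
        groups.setdefault(item.postcode, []).append(item)

    batch: List[UserAddress] = []
    for group in groups.values():
        # Never split a group: flush first if it would not fit.
        if batch and len(batch) + len(group) > max_batch_size:
            yield batch
            batch = []
        batch.extend(group)
        # A full batch, or an oversize single group, is dispatched whole.
        if len(batch) >= max_batch_size:
            yield batch
            batch = []
    if batch:  # final flush
        yield batch


rows = [("1", "m22 1ae"), ("2", "M221AE"), ("3", "ab1 2cd"),
        ("4", "M22 1AE"), ("5", "zz9 9zz")]
batches = list(iter_postcode_grouped_batches(
    [UserAddress(a, p) for a, p in rows], max_batch_size=2))
print([[u.address for u in b] for b in batches])
```

In this run the three M221AE addresses exceed the batch size and are
dispatched together, and the two remaining single-address groups share
the second batch.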
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>