mirror of
https://github.com/Hestia-Homes/Model.git
synced 2026-06-08 11:17:27 +00:00
User-driven pivot to the cohort-first validation strategy: the 6
existing hand-built `_elmhurst_worksheet_NNNNNN.build_epc()` fixtures
already cascade to their worksheet PDFs at 1e-4 — they ARE the
100%-correct calculator-input ground truth. Adding diff tests that
assert `from_elmhurst_site_notes(pdf) == hand_built()` surfaces every
silent divergence the existing chain tests miss (because chain tests
only check cascade output, not field-level EpcPropertyData equality).
Adds `test_from_elmhurst_site_notes_matches_hand_built_000474` as the
tracer-bullet first cohort case. The test:
1. Maps Summary_000474.pdf through the Elmhurst extractor + mapper.
2. Builds the hand-built EpcPropertyData via
`_elmhurst_worksheet_000474.build_epc()`.
3. Recursively diffs the two across a `_LOAD_BEARING_FIELDS`
allow-list (40 top-level fields driving the SAP cascade or
cross-mapper semantic equivalence; explicitly excludes cert
metadata, EnergyElement descriptive lists, registration dates,
and other fields that vary by mapper pathway without semantic
disagreement — these are noise per user decision).
RED status committed as the load-bearing TDD forcing function:
50 load-bearing divergences across 4 categories:
Cat A — encoding-only / cascade-equivalent (~30 diffs):
* Ventilation flue counts `0 vs None` (cascade defaults None to 0)
* Dual-encoded sub-fields (`floor_construction_type` str-side,
`roof_insulation_location` str-side, etc.)
* Mapper-surfaces-descriptive-only fields (`floor_type`,
`floor_u_value_known`)
Cat B — real cascade-affecting gaps (~10 diffs):
* `sap_heating.water_heating_fuel`: None vs 26 (mains gas)
* `sap_heating.shower_outlets`: extracted vs None
* `sap_heating.number_baths`: 1 vs None
* `country_code`: None vs 'ENG'
* `built_form`: 'Mid-Terrace' vs None
* `boiler_flue_type`, `central_heating_pump_age` dual-encoding
* `dwelling_type` casing 'Mid-Terrace house' vs 'Mid-terrace house'
* `wall_thickness_measured`: True vs False
Cat C — structural shape divergences (1 diff):
* `sap_windows: LEN 7 vs 5` — mapper extracts 1:1 with §11 table;
cohort hand-built collapsed entries by glazing-type group
(preserving total area, cascade-equivalent but not field-equal).
Cat D — Slice-54-style hand-built staleness (~5 diffs):
* `extensions_count: 2 vs 0` — Slice 54 fix landed on mapper;
hand-built still uses old hardcoded 0
* `party_wall_construction: None vs 0` — cohort convention sentinel
* Hand-built ages prior to current mapper conventions
Two RED forcing functions on the branch now:
- test_summary_001479_full_chain_sap_matches_worksheet_pdf_exactly
(delta 1.19 SAP vs 69.0094)
- test_from_elmhurst_site_notes_matches_hand_built_000474
(50 load-bearing field divergences)
Strict-pyright net-zero on the chain test file (0 errors); cohort
chain tests all still pass (13 green / 2 RED).
Next slices will chip away at the diff list — bulk-update cohort
hand-builts for Cat A/D (mechanical) then attack Cat B/C with
per-field design decisions. Once 000474 closes, parametrize over
the 5 other cohort certs, then API-mapper diff test, then cross-
mapper parity falls out.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
|
||
|---|---|---|
| .. | ||
| handler | ||
| tests | ||
| __init__.py | ||
| db_writer.py | ||
| elmhurst_extractor.py | ||
| extractor.py | ||
| local_runner.py | ||
| parser.py | ||
| pdf.py | ||