Model/backend/documents_parser
Khalim Conn-Kowlessar 7530ed3f4a Slice S0380.134: pin corpus PE against cascade demand-mode (apples-to-apples)
The SAP 10.2 worksheet computes each existing-dwelling metric in two
distinct blocks:

  1. "ENERGY RATING" block — uses Table 12 regulated prices + UK-
     average climate. Produces SAP score (Block 11a), total fuel
     cost (255), total CO2 (272).
  2. "EPC COSTS, EMISSIONS AND PRIMARY ENERGY" block — uses Table 32
     prices + postcode-specific climate. Produces total CO2 (272)
     again, with a different value, and total PE (286).

The two blocks use different space-heating demand figures, per
SAP 10.2 §13 (e.g. solid fuel 8: 21097 kWh in the rating block vs
16813 kWh in the EPC block for London W6).

The corpus regression test was extracting all four pins and asserting
against the cascade's rating-mode result (`cert_to_inputs`). That was
apples-to-apples for SAP/cost/CO2 (the first `(255)` and `(272)`
matches the regex finds ARE in the rating block) but apples-to-
oranges for PE: the `(286)` Total PE only exists in the EPC block,
so every PE pin was comparing rating-mode cascade output against
EPC-block worksheet output. The mismatch inflated every PE residual
by 10-15% of total PE.
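
The first-match behaviour can be illustrated with a minimal sketch. The
worksheet text, its placeholder values, the regex and the `first_match`
helper are all hypothetical stand-ins, not the corpus test's actual
extraction code; the only facts taken from above are that `(255)` and
`(272)` occur in both blocks while `(286)` occurs only in the EPC block.

```python
import re

# Hypothetical, heavily abbreviated worksheet text with placeholder values.
WORKSHEET = """\
ENERGY RATING
Total fuel cost 100.00 (255)
Total CO2 2000.00 (272)
EPC COSTS, EMISSIONS AND PRIMARY ENERGY
Total CO2 1800.00 (272)
Total PE 9000.00 (286)
"""


def first_match(box: int, text: str) -> tuple[float, str]:
    """Return (value, block heading) for the first line ending in `(box)`."""
    block = ""
    for line in text.splitlines():
        if line.isupper():  # assumption: all-caps lines are block headings
            block = line
        m = re.search(rf"(-?\d+(?:\.\d+)?)\s*\({box}\)\s*$", line)
        if m:
            return float(m.group(1)), block
    raise ValueError(f"box ({box}) not found")


for box in (255, 272, 286):
    value, block = first_match(box, WORKSHEET)
    print(f"({box}) first found in: {block}")
```

In this toy input the first `(255)` and `(272)` hits fall under the rating
heading, while the only `(286)` hit falls under the EPC heading, which is
the mismatch described above.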

The fix runs both cascade modes in the Act phase and assigns:

  - rating-mode result → SAP / cost / CO2 residuals
  - demand-mode result (`cert_to_demand_inputs`) → PE residual
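
A rough sketch of that assignment, under stated assumptions: `ModeResult`,
`Pins` and `residuals` are hypothetical names, the residual sign
convention (cascade minus worksheet) is assumed, and the two `ModeResult`
values stand in for whatever the cascade returns when fed
`cert_to_inputs` and `cert_to_demand_inputs` output (their real
signatures are not shown here).

```python
from dataclasses import dataclass


@dataclass
class ModeResult:
    """Hypothetical container for one cascade run."""
    sap: float
    cost: float
    co2: float
    pe: float


@dataclass
class Pins:
    """Hypothetical worksheet pins: SAP/cost/CO2 from the rating block,
    PE from the EPC block."""
    sap: float
    cost: float
    co2: float
    pe: float


def residuals(rating: ModeResult, demand: ModeResult, pins: Pins) -> dict[str, float]:
    """Assumed convention: cascade minus worksheet, per metric."""
    return {
        "sap": rating.sap - pins.sap,
        "cost": rating.cost - pins.cost,
        "co2": rating.co2 - pins.co2,
        "pe": demand.pe - pins.pe,  # PE is the only metric taken from demand mode
    }
```

Keeping both results in the Act phase means the Assert phase only has to
pick the right source per metric, rather than re-running the cascade.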

25 corpus `_CorpusExpectation` entries were re-pinned. Some closed
dramatically (the apples-to-apples comparison reveals the cascade was
actually correct):

  ashp         +1467.90 → -11.80  ← effectively closed
  oil pcdb 1/2 +2086.75 → -83.82
  oil pcdb 3   +1897.43 → -271.44
  electric 1   +2837.14 → +164.91
  electric 8   +2113.83 → -224.46
  solid fuel 5 +2359.85 → -330.84

Others surfaced larger demand-mode gaps that the block mismatch had
been hiding — these are real cascade gaps the next slices will
address:

  electric 3       -850.93 → -3189.22
  electric 5/6     +540/+568 → -1797.96 / -1769.84
  pcdb 1           -171.70 → -3135.30
  solid fuel 2/3   +440.75 / +1451.79 → -2292.47 / -2496.20
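
As a quick arithmetic check, the per-pin shift (new residual minus old
residual) can be computed from the two lists above. This is only a
sketch: the dictionary keys are informal labels, the `electric 5/6` row
is left out because its old values are quoted rounded, and the sign
convention of the residuals is not stated in this message.

```python
# (before, after) PE residuals copied from the two lists above.
PINS = {
    "ashp": (1467.90, -11.80),
    "oil pcdb 1/2": (2086.75, -83.82),
    "oil pcdb 3": (1897.43, -271.44),
    "electric 1": (2837.14, 164.91),
    "electric 8": (2113.83, -224.46),
    "solid fuel 5": (2359.85, -330.84),
    "electric 3": (-850.93, -3189.22),
    "pcdb 1": (-171.70, -3135.30),
    "solid fuel 2": (440.75, -2292.47),
    "solid fuel 3": (1451.79, -2496.20),
}

# Shift introduced by switching the PE pin to the demand-mode result.
shifts = {
    name: round(after - before, 2) for name, (before, after) in PINS.items()
}

for name, shift in shifts.items():
    print(f"{name:<14} {shift:+.2f}")
```

Every listed pin moves in the same direction, which is what one would
expect if the old comparison carried a systematic offset from the block
mismatch rather than pin-specific errors.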

The corpus test docstring + per-block-attribution comment now make
the rating-vs-EPC block distinction explicit so future reviewers
don't repeat the same conflation.

Extended handover suite at HEAD post-slice: 876 pass / 0 fail
(unchanged — no test count change, just per-pin value updates).

Pyright net-zero on touched file (0 → 0).

No cascade behaviour change. No golden / unit-test impact (the bug
was specific to the corpus test's pin-extraction logic).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-31 10:41:47 +00:00
handler                address JTK review comments                                                    2026-04-20 15:11:17 +00:00
tests                  Slice S0380.134: pin corpus PE against cascade demand-mode (apples-to-apples)  2026-05-31 10:41:47 +00:00
__init__.py            Map to RdSapSiteNotes from site notes JSON 🟥                                  2026-04-16 13:54:03 +00:00
db_writer.py           include updating epc_property_data to pashub to ara workflow                   2026-04-29 09:55:14 +00:00
elmhurst_extractor.py  Slice S0380.133: derive solid-fuel main fuel from §14.0 EES Code               2026-05-31 10:04:28 +00:00
extractor.py           Handle wall thickness "Unmeasurable" 🟩                                        2026-04-30 16:41:16 +00:00
local_runner.py        update local runner to work for elmhurst                                       2026-04-24 14:01:36 +00:00
parser.py              load ecmk site notes to db                                                     2026-04-29 11:20:47 +00:00
pdf.py                 update local runner to work for elmhurst                                       2026-04-24 14:01:36 +00:00