From 7530ed3f4a5d8ae41a8cc7d2e548f8d91cf666b2 Mon Sep 17 00:00:00 2001
From: Khalim Conn-Kowlessar
Date: Sun, 31 May 2026 10:41:47 +0000
Subject: [PATCH] Slice S0380.134: pin corpus PE against cascade demand-mode (apples-to-apples)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The SAP 10.2 worksheet computes each existing-dwelling metric in two
distinct blocks:

  1. "ENERGY RATING" block: uses Table 12 regulated prices +
     UK-average climate. Produces SAP score (Block 11a), total fuel
     cost (255), total CO2 (272).
  2. "EPC COSTS, EMISSIONS AND PRIMARY ENERGY" block: uses Table 32
     prices + postcode-specific climate. Produces total CO2 (272)
     again, with a different value, and total PE (286).

The two blocks operate on different space-heating demand kWh per SAP
10.2 §13 (e.g. solid fuel 8: 21097 kWh in rating block vs 16813 kWh
in EPC block for London W6).

The corpus regression test was extracting all four pins and asserting
against the cascade's rating-mode result (`cert_to_inputs`). That was
apples-to-apples for SAP/cost/CO2 (the first `(255)` and `(272)`
matches the regex finds ARE in the rating block) but
apples-to-oranges for PE: the `(286)` Total PE only exists in the EPC
block, so every PE pin was comparing rating-mode cascade output
against EPC-block worksheet output. The mismatch inflated every PE
residual by 10-15% of total PE.

The fix runs both cascade modes in the Act phase and assigns:

  - rating-mode result → SAP / cost / CO2 residuals
  - demand-mode result (`cert_to_demand_inputs`) → PE residual

25 corpus _CorpusExpectation entries re-pinned.
Some closed dramatically (apples-to-apples reveals the cascade was
actually correct):

  ashp            +1467.90 → -11.80   ← effectively closed
  oil pcdb 1/2    +2086.75 → -83.82
  oil pcdb 3      +1897.43 → -271.44
  electric 1      +2837.14 → +164.91
  electric 8      +2113.83 → -224.46
  solid fuel 5    +2359.85 → -330.84

Others surfaced larger demand-mode gaps that the block mismatch had
been hiding; these are real cascade gaps the next slices will
address:

  electric 3      -850.93 → -3189.22
  electric 5/6    +540/+568 → -1797.96 / -1769.84
  pcdb 1          -171.70 → -3135.30
  solid fuel 2/3  +440.75 / +1451.79 → -2292.47 / -2496.20

The corpus test docstring + per-block-attribution comment now make
the rating-vs-EPC block distinction explicit so future reviewers
don't repeat the same conflation.

Extended handover suite at HEAD post-slice: 876 pass / 0 fail
(unchanged: no test count change, just per-pin value updates).
Pyright net-zero on touched file (0 → 0).

No cascade behaviour change. No golden / unit-test impact (the bug
was specific to the corpus test's pin-extraction logic).

Co-Authored-By: Claude Opus 4.7
---
 .../tests/test_heating_systems_corpus.py      | 101 ++++++++++++------
 1 file changed, 67 insertions(+), 34 deletions(-)

diff --git a/backend/documents_parser/tests/test_heating_systems_corpus.py b/backend/documents_parser/tests/test_heating_systems_corpus.py
index 7be9cb03..314a12b1 100644
--- a/backend/documents_parser/tests/test_heating_systems_corpus.py
+++ b/backend/documents_parser/tests/test_heating_systems_corpus.py
@@ -10,9 +10,23 @@ controlled-variable signal this corpus was built to exercise.
 Per variant we extract Block 11a (individual heating) or Block 11b
 (community heating) pins from the P960 worksheet PDF, route the Summary
 PDF through `ElmhurstSiteNotesExtractor` → `from_elmhurst_site_notes` →
-`cert_to_inputs` → `calculate_sap_from_inputs`, and assert each of the
-four published outputs (continuous SAP, total fuel cost, CO2, PE)
-matches its pinned residual within a tight absolute tolerance.
+`cert_to_inputs` / `cert_to_demand_inputs` → `calculate_sap_from_inputs`,
+and assert each of the four published outputs matches its pinned
+residual within a tight absolute tolerance.
+
+The SAP 10.2 worksheet computes each existing-dwelling metric in two
+distinct blocks: the "ENERGY RATING" block (uses Table 12 regulated
+prices + UK-average climate; produces SAP score, total fuel cost,
+CO2) and the "EPC COSTS, EMISSIONS AND PRIMARY ENERGY" block (uses
+Table 32 prices + postcode-specific climate; produces Primary Energy).
+The two blocks operate on different space-heating demand kWh values.
+To compare apples-to-apples the corpus pins the worksheet's rating-
+block (SAP / cost / CO2) against the cascade's rating-mode result
+(`cert_to_inputs`) and the worksheet's EPC-block (PE) against the
+cascade's demand-mode result (`cert_to_demand_inputs`). Pre-S0380.134
+all four pins compared against rating-mode, which inflated every PE
+residual by ~10-15% of total PE because the worksheet (286) Total PE
+only appears in the EPC block.
 
 Residuals are non-zero today: the cascade overshoots most variants by
 +1..+30 SAP points (with `community heating 6` undershooting at −6.87,
@@ -41,6 +55,7 @@ from domain.sap10_calculator.calculator import calculate_sap_from_inputs
 from domain.sap10_calculator.exceptions import MissingMainFuelType
 from domain.sap10_calculator.rdsap.cert_to_inputs import (
     SAP_10_2_SPEC_PRICES,
+    cert_to_demand_inputs,
     cert_to_inputs,
 )
 
@@ -105,22 +120,34 @@ class _CorpusExpectation:
 # logs). All 10 close to ΔSAP ±7.4; solid fuel 5 +2.71 is the
 # smallest open. 16 variants remain blocked (community heating,
 # 4 electric storage codes, no system, oil non-Heating-oil, Bulk LPG).
+#
+# Slice S0380.134 fixed a measurement bug in the PE pin: the
+# worksheet (286) Total PE only exists in the EPC block (uses
+# postcode-specific climate + demand-mode space heating kWh), so
+# comparing it against the cascade's rating-mode PE inflated every
+# PE residual by 10-15% of total PE. The pin now compares the
+# worksheet (286) against the cascade's demand-mode PE
+# (`cert_to_demand_inputs`). Multiple variants closed dramatically
+# (ashp +1468 → -12; oil pcdb 1/2 +2087 → -84; electric 1 +2837 →
+# +165; electric 8 +2114 → -224); others surfaced larger demand-
+# mode residuals that were hidden by the block mismatch (electric
+# 3/5/6/7/9, pcdb 1, solid fuel 2/3).
 _EXPECTATIONS: tuple[_CorpusExpectation, ...] = (
-    _CorpusExpectation(variant='ashp', block='11a', expected_sap_resid=+5.6680, expected_cost_resid_gbp=-130.5995, expected_co2_resid_kg=-1.4283, expected_pe_resid_kwh=+1467.8983),
-    _CorpusExpectation(variant='electric 1', block='11a', expected_sap_resid=+9.6439, expected_cost_resid_gbp=-222.2109, expected_co2_resid_kg=+14.3441, expected_pe_resid_kwh=+2837.1414),
-    _CorpusExpectation(variant='electric 2', block='11a', expected_sap_resid=+5.8523, expected_cost_resid_gbp=-134.8455, expected_co2_resid_kg=+94.4364, expected_pe_resid_kwh=+2420.9013),
-    _CorpusExpectation(variant='electric 3', block='11a', expected_sap_resid=+14.6973, expected_cost_resid_gbp=-338.6485, expected_co2_resid_kg=-379.1296, expected_pe_resid_kwh=-850.9293),
-    _CorpusExpectation(variant='electric 5', block='11a', expected_sap_resid=+10.9720, expected_cost_resid_gbp=-252.8131, expected_co2_resid_kg=-218.5642, expected_pe_resid_kwh=+540.3309),
-    _CorpusExpectation(variant='electric 6', block='11a', expected_sap_resid=+10.9720, expected_cost_resid_gbp=-252.8131, expected_co2_resid_kg=-209.8689, expected_pe_resid_kwh=+568.4500),
-    _CorpusExpectation(variant='electric 7', block='11a', expected_sap_resid=+9.6834, expected_cost_resid_gbp=-223.1212, expected_co2_resid_kg=-137.9832, expected_pe_resid_kwh=+1061.3307),
-    _CorpusExpectation(variant='electric 8', block='11a', expected_sap_resid=+6.8875, expected_cost_resid_gbp=-158.6999, expected_co2_resid_kg=-34.9564, expected_pe_resid_kwh=+2113.8303),
-    _CorpusExpectation(variant='electric 9', block='11a', expected_sap_resid=+12.0340, expected_cost_resid_gbp=-277.2813, expected_co2_resid_kg=-255.6076, expected_pe_resid_kwh=+362.4518),
-    _CorpusExpectation(variant='gshp', block='11a', expected_sap_resid=+5.1598, expected_cost_resid_gbp=-118.8901, expected_co2_resid_kg=-41.4461, expected_pe_resid_kwh=+639.1890),
-    _CorpusExpectation(variant='oil 1', block='11a', expected_sap_resid=+2.6578, expected_cost_resid_gbp=-61.2402, expected_co2_resid_kg=-242.2677, expected_pe_resid_kwh=+1259.6587),
-    _CorpusExpectation(variant='oil pcdb 1', block='11a', expected_sap_resid=+0.4239, expected_cost_resid_gbp=-9.7668, expected_co2_resid_kg=-35.9551, expected_pe_resid_kwh=+2086.7505),
-    _CorpusExpectation(variant='oil pcdb 2', block='11a', expected_sap_resid=+0.4239, expected_cost_resid_gbp=-9.7668, expected_co2_resid_kg=-35.9551, expected_pe_resid_kwh=+2086.7505),
-    _CorpusExpectation(variant='oil pcdb 3', block='11a', expected_sap_resid=+1.1597, expected_cost_resid_gbp=-26.7204, expected_co2_resid_kg=-53.1709, expected_pe_resid_kwh=+1897.4341),
-    _CorpusExpectation(variant='pcdb 1', block='11a', expected_sap_resid=+6.9521, expected_cost_resid_gbp=-157.6055, expected_co2_resid_kg=-845.8065, expected_pe_resid_kwh=-171.6971),
+    _CorpusExpectation(variant='ashp', block='11a', expected_sap_resid=+5.6680, expected_cost_resid_gbp=-130.5995, expected_co2_resid_kg=-1.4283, expected_pe_resid_kwh=-11.8017),
+    _CorpusExpectation(variant='electric 1', block='11a', expected_sap_resid=+9.6439, expected_cost_resid_gbp=-222.2109, expected_co2_resid_kg=+14.3441, expected_pe_resid_kwh=+164.9052),
+    _CorpusExpectation(variant='electric 2', block='11a', expected_sap_resid=+5.8523, expected_cost_resid_gbp=-134.8455, expected_co2_resid_kg=+94.4364, expected_pe_resid_kwh=+970.7570),
+    _CorpusExpectation(variant='electric 3', block='11a', expected_sap_resid=+14.6973, expected_cost_resid_gbp=-338.6485, expected_co2_resid_kg=-379.1296, expected_pe_resid_kwh=-3189.2203),
+    _CorpusExpectation(variant='electric 5', block='11a', expected_sap_resid=+10.9720, expected_cost_resid_gbp=-252.8131, expected_co2_resid_kg=-218.5642, expected_pe_resid_kwh=-1797.9601),
+    _CorpusExpectation(variant='electric 6', block='11a', expected_sap_resid=+10.9720, expected_cost_resid_gbp=-252.8131, expected_co2_resid_kg=-209.8689, expected_pe_resid_kwh=-1769.8410),
+    _CorpusExpectation(variant='electric 7', block='11a', expected_sap_resid=+9.6834, expected_cost_resid_gbp=-223.1212, expected_co2_resid_kg=-137.9832, expected_pe_resid_kwh=-1276.9603),
+    _CorpusExpectation(variant='electric 8', block='11a', expected_sap_resid=+6.8875, expected_cost_resid_gbp=-158.6999, expected_co2_resid_kg=-34.9564, expected_pe_resid_kwh=-224.4607),
+    _CorpusExpectation(variant='electric 9', block='11a', expected_sap_resid=+12.0340, expected_cost_resid_gbp=-277.2813, expected_co2_resid_kg=-255.6076, expected_pe_resid_kwh=-1975.8392),
+    _CorpusExpectation(variant='gshp', block='11a', expected_sap_resid=+5.1598, expected_cost_resid_gbp=-118.8901, expected_co2_resid_kg=-41.4461, expected_pe_resid_kwh=-454.5023),
+    _CorpusExpectation(variant='oil 1', block='11a', expected_sap_resid=+2.6578, expected_cost_resid_gbp=-61.2402, expected_co2_resid_kg=-242.2677, expected_pe_resid_kwh=-1050.4919),
+    _CorpusExpectation(variant='oil pcdb 1', block='11a', expected_sap_resid=+0.4239, expected_cost_resid_gbp=-9.7668, expected_co2_resid_kg=-35.9551, expected_pe_resid_kwh=-83.8239),
+    _CorpusExpectation(variant='oil pcdb 2', block='11a', expected_sap_resid=+0.4239, expected_cost_resid_gbp=-9.7668, expected_co2_resid_kg=-35.9551, expected_pe_resid_kwh=-83.8239),
+    _CorpusExpectation(variant='oil pcdb 3', block='11a', expected_sap_resid=+1.1597, expected_cost_resid_gbp=-26.7204, expected_co2_resid_kg=-53.1709, expected_pe_resid_kwh=-271.4351),
+    _CorpusExpectation(variant='pcdb 1', block='11a', expected_sap_resid=+6.9521, expected_cost_resid_gbp=-157.6055, expected_co2_resid_kg=-845.8065, expected_pe_resid_kwh=-3135.2991),
     # Slice S0380.133 unblocked 10 solid-fuel variants by routing the
     # Elmhurst §14.0 "Main Heating EES Code" through the new
     # `_ELMHURST_MAIN_HEATING_EES_TO_FUEL_CODE` dict. Pre-slice the
@@ -128,16 +155,16 @@ _EXPECTATIONS: tuple[_CorpusExpectation, ...] = (
     # cost / CO2 / PE all route via the correct Table 32 fuel code.
     # Remaining residuals are likely heating-system efficiency or
     # control-type gaps — separate slices.
-    _CorpusExpectation(variant='solid fuel 2', block='11a', expected_sap_resid=+4.7910, expected_cost_resid_gbp=-110.3933, expected_co2_resid_kg=-484.3578, expected_pe_resid_kwh=+440.7506),
-    _CorpusExpectation(variant='solid fuel 3', block='11a', expected_sap_resid=+4.4310, expected_cost_resid_gbp=-102.0983, expected_co2_resid_kg=-1206.1483, expected_pe_resid_kwh=+1451.7872),
-    _CorpusExpectation(variant='solid fuel 4', block='11a', expected_sap_resid=+4.1283, expected_cost_resid_gbp=-95.1230, expected_co2_resid_kg=-714.4446, expected_pe_resid_kwh=+1655.3360),
-    _CorpusExpectation(variant='solid fuel 5', block='11a', expected_sap_resid=+2.7081, expected_cost_resid_gbp=-62.3977, expected_co2_resid_kg=-301.4166, expected_pe_resid_kwh=+2359.8540),
-    _CorpusExpectation(variant='solid fuel 6', block='11a', expected_sap_resid=-7.3846, expected_cost_resid_gbp=+168.2332, expected_co2_resid_kg=-153.6470, expected_pe_resid_kwh=+2519.2301),
-    _CorpusExpectation(variant='solid fuel 7', block='11a', expected_sap_resid=+5.8242, expected_cost_resid_gbp=-131.0462, expected_co2_resid_kg=-758.2093, expected_pe_resid_kwh=+2967.9919),
-    _CorpusExpectation(variant='solid fuel 8', block='11a', expected_sap_resid=+4.2391, expected_cost_resid_gbp=-97.6761, expected_co2_resid_kg=-14.9661, expected_pe_resid_kwh=+2512.8796),
-    _CorpusExpectation(variant='solid fuel 9', block='11a', expected_sap_resid=+3.4416, expected_cost_resid_gbp=-79.3010, expected_co2_resid_kg=-8.4751, expected_pe_resid_kwh=+2427.8078),
-    _CorpusExpectation(variant='solid fuel 10', block='11a', expected_sap_resid=+5.1366, expected_cost_resid_gbp=-118.3539, expected_co2_resid_kg=-52.9522, expected_pe_resid_kwh=+1848.8905),
-    _CorpusExpectation(variant='solid fuel 11', block='11a', expected_sap_resid=+4.3479, expected_cost_resid_gbp=-100.1809, expected_co2_resid_kg=-8.8428, expected_pe_resid_kwh=+1535.5344),
+    _CorpusExpectation(variant='solid fuel 2', block='11a', expected_sap_resid=+4.7910, expected_cost_resid_gbp=-110.3933, expected_co2_resid_kg=-484.3578, expected_pe_resid_kwh=-2292.4679),
+    _CorpusExpectation(variant='solid fuel 3', block='11a', expected_sap_resid=+4.4310, expected_cost_resid_gbp=-102.0983, expected_co2_resid_kg=-1206.1483, expected_pe_resid_kwh=-2496.1951),
+    _CorpusExpectation(variant='solid fuel 4', block='11a', expected_sap_resid=+4.1283, expected_cost_resid_gbp=-95.1230, expected_co2_resid_kg=-714.4446, expected_pe_resid_kwh=-1097.3549),
+    _CorpusExpectation(variant='solid fuel 5', block='11a', expected_sap_resid=+2.7081, expected_cost_resid_gbp=-62.3977, expected_co2_resid_kg=-301.4166, expected_pe_resid_kwh=-330.8371),
+    _CorpusExpectation(variant='solid fuel 6', block='11a', expected_sap_resid=-7.3846, expected_cost_resid_gbp=+168.2332, expected_co2_resid_kg=-153.6470, expected_pe_resid_kwh=-1312.5322),
+    _CorpusExpectation(variant='solid fuel 7', block='11a', expected_sap_resid=+5.8242, expected_cost_resid_gbp=-131.0462, expected_co2_resid_kg=-758.2093, expected_pe_resid_kwh=-1638.1589),
+    _CorpusExpectation(variant='solid fuel 8', block='11a', expected_sap_resid=+4.2391, expected_cost_resid_gbp=-97.6761, expected_co2_resid_kg=-14.9661, expected_pe_resid_kwh=-1307.9243),
+    _CorpusExpectation(variant='solid fuel 9', block='11a', expected_sap_resid=+3.4416, expected_cost_resid_gbp=-79.3010, expected_co2_resid_kg=-8.4751, expected_pe_resid_kwh=-510.4162),
+    _CorpusExpectation(variant='solid fuel 10', block='11a', expected_sap_resid=+5.1366, expected_cost_resid_gbp=-118.3539, expected_co2_resid_kg=-52.9522, expected_pe_resid_kwh=-1315.3508),
+    _CorpusExpectation(variant='solid fuel 11', block='11a', expected_sap_resid=+4.3479, expected_cost_resid_gbp=-100.1809, expected_co2_resid_kg=-8.8428, expected_pe_resid_kwh=-962.4251),
 )
 
 
@@ -303,15 +330,21 @@ def test_heating_systems_corpus_residual_matches_pin(
     site_notes = ElmhurstSiteNotesExtractor(pages).extract()
     epc = EpcPropertyDataMapper.from_elmhurst_site_notes(site_notes)
 
-    # Act
-    result = calculate_sap_from_inputs(
+    # Act — run both cascade modes so the comparison against the
+    # worksheet pins is apples-to-apples per block (see module
+    # docstring: rating block carries SAP / cost / CO2, EPC block
+    # carries PE).
+    rating = calculate_sap_from_inputs(
         cert_to_inputs(epc, prices=SAP_10_2_SPEC_PRICES),
     )
+    demand = calculate_sap_from_inputs(
+        cert_to_demand_inputs(epc, prices=SAP_10_2_SPEC_PRICES),
+    )
 
-    sap_resid = result.sap_score_continuous - worksheet['sap_c']
-    cost_resid = result.total_fuel_cost_gbp - worksheet['cost']
-    co2_resid = result.co2_kg_per_yr - worksheet['co2']
-    pe_resid = result.primary_energy_kwh_per_yr - worksheet['pe']
+    sap_resid = rating.sap_score_continuous - worksheet['sap_c']
+    cost_resid = rating.total_fuel_cost_gbp - worksheet['cost']
+    co2_resid = rating.co2_kg_per_yr - worksheet['co2']
+    pe_resid = demand.primary_energy_kwh_per_yr - worksheet['pe']
 
     # Assert — each residual sits within its absolute tolerance of the
     # pinned value. Drift beyond tolerance fires loudly; closures land