Model/backend/documents_parser
Khalim Conn-Kowlessar 509ef4fbbf Slice S0380.78: §1x.0 shower extractor + (247a) fallback cost close cert 000565 (45)m
Two coupled fixes that together close the +903 kWh (45)m
energy-content over-count on cert 000565. Splitting them would
flip sap_score from 29 → 30 mid-fix; bundled they keep cert 000565
within rounding of the worksheet (continuous SAP residual closes
17×, from Δ +0.60 to Δ −0.035).

## 1. Elmhurst extractor — §1x.0 section-bounded "Connected" lookup

`_extract_baths_and_showers` was anchoring on the FIRST "Connected"
substring in the document via `self._lines.index("Connected")`.
Cert 000565 (4 extensions) has "Connected" appearing earlier as a
§3 building-parts wall elevation flag, so the global match landed
on a wall row; the digit-check at `num_line.isdigit()` failed
immediately on the "0.00" wall length and the shower roster came
back empty.

Both `1x.0 Baths and Showers` and `18.0 Flue Gas Heat Recovery
System` are single-occurrence section anchors in the Elmhurst
Summary PDF. Routing the "Connected" lookup through
`_section_lines(...)` bounds the search to the §1x.0 block, so
multi-extension certs no longer lose the shower roster.
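
The difference between the global and the section-bounded lookup can be sketched as below. This is a minimal illustration, not the extractor's code: `section_lines` and `shower_count` are hypothetical stand-ins (the real `_section_lines` signature is not shown here), the sample lines are invented, and the sketch assumes the line after "Connected" holds the shower count.

```python
from typing import List, Optional

START = "1x.0 Baths and Showers"
END = "18.0 Flue Gas Heat Recovery System"


def section_lines(lines: List[str], start: str, end: str) -> List[str]:
    """Return the lines strictly between two single-occurrence anchors."""
    try:
        lo = lines.index(start)
        hi = lines.index(end, lo + 1)
    except ValueError:
        return []  # an anchor is missing: treat the section as empty
    return lines[lo + 1:hi]


def shower_count(lines: List[str]) -> Optional[int]:
    """Read the count that follows "Connected" inside the section only."""
    block = section_lines(lines, START, END)
    if "Connected" not in block:
        return None
    idx = block.index("Connected")
    if idx + 1 >= len(block):
        return None
    num_line = block[idx + 1]  # assumed layout: count on the next line
    return int(num_line) if num_line.isdigit() else None


# Invented sample: a wall row carries "Connected" before the shower section.
sample = [
    "3.0 Building Parts", "Connected", "0.00",
    "1x.0 Baths and Showers", "Connected", "1", "Electric shower",
    "18.0 Flue Gas Heat Recovery System",
]
```

On `sample`, a global `sample.index("Connected")` lands on the wall row, whose next line "0.00" fails `isdigit()`, while `shower_count(sample)` reads the count from inside the bounded block.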

## 2. SAP 10.2 §10a line (247a) — electric shower cost in fallback path

SAP 10.2 §10a (PDF p.145) worksheet line (247a):

    Energy for instantaneous electric shower(s)
                                       (64a)  × 0.01 = (247a)
    Total energy cost   (240)...(242) + (245)...(254) = (255)

Electric showers route their (64a) kWh through the "other fuel"
tariff (same column as pumps/fans (249) and lighting (250)) and
add to (255) total cost.
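
As a worked form of line (247a), the sketch below assumes the blank column in the worksheet row holds a fuel price in p/kWh, so that the `× 0.01` factor converts pence to pounds. The function name and the example figures are illustrative only and are not taken from cert 000565.

```python
def line_247a_cost_gbp(kwh_64a: float, price_pence_per_kwh: float) -> float:
    """(64a) kWh/yr times a p/kWh price, times 0.01 to convert pence to GBP."""
    return kwh_64a * price_pence_per_kwh * 0.01


# Illustrative figures only: 500 kWh/yr at 20 p/kWh is about 100 GBP/yr.
example_cost = line_247a_cost_gbp(500.0, 20.0)
```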

`calculator.py:415-470` STANDARD-tariff path consumes
`FuelCostResult` from `fuel_cost(...)` which already plumbs
`instant_shower_cost_gbp` (worksheet/fuel_cost.py:214). The
fallback scalar path at `calculator.py:489-530` (TEN_HOUR /
off-peak / zero-FuelCostResult certs) was missing the electric-
shower term entirely. Cert 000565 (Dual-meter TEN_HOUR + 1
electric shower) trips this branch — fix #1 surfaced the
£93/yr under-count and the sap_score regression that followed.

Fix: add
    electric_shower_cost = (inputs.electric_shower_kwh_per_yr
                            * inputs.other_fuel_cost_gbp_per_kwh)
into the `total_cost = max(0, ...)` sum, parallel to the existing
`electric_shower_co2` and `electric_shower_pe` flows already
present in the CO2 (line 552) and PE (line 619) sections.
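
A minimal sketch of the fallback sum with the new term follows. Only the two field names on `inputs` come from this commit; `FallbackInputs`, `fallback_total_cost`, and the lumped `other_terms_gbp` argument are hypothetical simplifications of whatever `calculator.py` actually sums.

```python
from dataclasses import dataclass


@dataclass
class FallbackInputs:
    # Hypothetical container; only these two field names appear in the commit.
    electric_shower_kwh_per_yr: float
    other_fuel_cost_gbp_per_kwh: float


def fallback_total_cost(other_terms_gbp: float, inputs: FallbackInputs) -> float:
    """Sum the pre-existing cost terms plus the (247a) electric-shower term."""
    electric_shower_cost = (inputs.electric_shower_kwh_per_yr
                            * inputs.other_fuel_cost_gbp_per_kwh)
    # Clamp at zero, mirroring the `total_cost = max(0, ...)` shape.
    return max(0.0, other_terms_gbp + electric_shower_cost)


with_shower = fallback_total_cost(1000.0, FallbackInputs(400.0, 0.25))
without_shower = fallback_total_cost(1000.0, FallbackInputs(0.0, 0.25))
```

With these illustrative inputs the total rises by exactly `kwh × other_fuel_cost`, which is the property the new calculator test pins.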

## Why bundled

SAP 10.2 Appendix J §J2 step 2a (PDF p.81) routes baths via
`N_bath = 0.13 N + 0.19` when a shower is present, `0.35 N + 0.50`
when no shower is present — a 2.67× swing in (42b)m that
compounds into (45)m energy content. The extractor fix closes
(45)m to EXACT (1286.3266 = 1286.3266 ✓), but the cascade's
electric-shower kWh stream becomes load-bearing for cost — and
the fallback path was silently dropping it. Without fix #2,
sap_score regressed from 29 → 30 (cost too low → ECF too low →
SAP rating too high).
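
The size of the bath-count swing can be checked with the two formulas quoted above. The occupancy value is an assumption for illustration (the cert's actual N is not stated here), and `n_bath` is a hypothetical helper name.

```python
def n_bath(occupancy: float, shower_present: bool) -> float:
    """Baths per day, using the two Appendix J forms quoted in this message."""
    if shower_present:
        return 0.13 * occupancy + 0.19
    return 0.35 * occupancy + 0.50


# Assumed occupancy N = 2 for illustration.
swing = n_bath(2.0, shower_present=False) / n_bath(2.0, shower_present=True)
```

For N = 2 the ratio is 1.20 / 0.45, about 2.67; it varies slightly with N, so the figure is indicative rather than exact.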

## Cert 000565 movements at HEAD (post-S0380.77 → post-this slice)

| Field                | Pre-slice |  Post-slice |  Worksheet | Pre-Δ   | Post-Δ  |
|----------------------|----------:|------------:|-----------:|--------:|--------:|
| sap_score            |        29 |          28 |         29 |       0 |      −1 |
| sap_score_continuous |   29.1090 |     28.4735 |    28.5087 |  +0.60  | **−0.035** |
| ecf                  |    5.3256 |      5.3904 |     5.3866 |  −0.06  | **+0.004** |
| total_fuel_cost_gbp  |   4627.10 |     4683.39 |    4680.26 | −53.16  | **+3.13** |
| co2_kg               |    6616.0 |      6480.6 |     6447.6 | +168.4  |  +32.94 |
| hot_water_kwh        |    5154.0 |      4014.6 |     3755.0 | +1399   |  +259.6 |
| space_heating_kwh    |   58725.8 |     58793.0 |    59008.4 | −282.6  | −215.4  |
| main_heating_fuel    |   34544.6 |     34584.1 |    34710.8 | −166.2  | −126.7  |
| (45)m sum            |  2189.38  |  **1286.33**|  1286.3266 |  +903   |    0    |

The integer sap_score = 28 vs worksheet = 29 is a rounding-
boundary artifact: continuous SAP at 28.4735 rounds DOWN, just
0.0265 below the 28.5 threshold (and 0.035 below the worksheet's
28.5087). The remaining +259 kWh HW pin over-count traces to the
still-open (56)m storage loss over-count plus the missing (57)m
solar-storage adjustment (slice C per the handover); closing that
should pull continuous SAP back above 28.5 and restore integer 29.
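
The rounding boundary can be checked directly from the table values. The sketch assumes the integer rating is the continuous value rounded half-up to the nearest integer; `integer_sap` is a hypothetical helper, and `Decimal` is used only to keep the arithmetic exact.

```python
from decimal import Decimal, ROUND_HALF_UP


def integer_sap(continuous: str) -> int:
    """Round a continuous SAP value (given as a string) half-up to an integer."""
    return int(Decimal(continuous).quantize(Decimal("1"), rounding=ROUND_HALF_UP))


post_slice = integer_sap("28.4735")  # value from the table above
worksheet = integer_sap("28.5087")   # value from the table above
gap_to_threshold = Decimal("28.5") - Decimal("28.4735")
gap_to_worksheet = Decimal("28.5087") - Decimal("28.4735")
```

The two gaps are different quantities: the distance to the 28.5 rounding threshold and the residual against the worksheet's continuous value.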

## Tests

- `test_summary_000565_extractor_finds_electric_shower_in_section_1x_0`
  (test_summary_pdf_mapper_chain.py) — pins extractor finds the
  Electric shower in §1x.0 even with §3 building-parts "Connected"
  collisions earlier in the document.
- `test_total_fuel_cost_includes_247a_electric_shower_in_fallback_path`
  (test_calculator.py) — pins `total_fuel_cost_gbp` rises by
  exactly `kwh × other_fuel_cost` when `electric_shower_kwh_per_yr`
  is non-zero in the fallback path.

Test baseline: 547 → 570 pass (+3 new tests across the 4 modified
files + indirect knock-ons in golden fixtures); 9 → 10 expected
`test_sap_result_pin[000565-*]` fails (now includes the integer
`sap_score` until slice C closes the remaining +259 kWh HW
residual). Pyright net-zero on all 4 touched files (50 baseline =
50 after).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-29 21:32:13 +00:00
handler address JTK review comments 2026-04-20 15:11:17 +00:00
tests Slice S0380.78: §1x.0 shower extractor + (247a) fallback cost close cert 000565 (45)m 2026-05-29 21:32:13 +00:00
__init__.py Map to RdSapSiteNotes from site notes JSON 🟥 2026-04-16 13:54:03 +00:00
db_writer.py include updating epc_property_data to pashub to ara workflow 2026-04-29 09:55:14 +00:00
elmhurst_extractor.py Slice S0380.78: §1x.0 shower extractor + (247a) fallback cost close cert 000565 (45)m 2026-05-29 21:32:13 +00:00
extractor.py Handle wall thickness "Unmeasurable" 🟩 2026-04-30 16:41:16 +00:00
local_runner.py update local runner to work for elmhurst 2026-04-24 14:01:36 +00:00
parser.py load ecmk site notes to db 2026-04-29 11:20:47 +00:00
pdf.py update local runner to work for elmhurst 2026-04-24 14:01:36 +00:00