Commit graph

6 commits

Author SHA1 Message Date
Khalim Conn-Kowlessar
58a9547210 Slice S0380.168: Bio-liquid mapper extensions + Table 32 FAME price flip
Mapper extensions (`_ELMHURST_MAIN_HEATING_EES_TO_FUEL_CODE`):

  "BFD": 71,  # HVO        — corpus variant oil 2 (SAP 127)
  "BXE": 73,  # FAME       — corpus variant oil 3 (SAP 128)
  "BXF": 73,  # FAME alt   — corpus variant oil 4 (SAP 129)
  "BZC": 76,  # Bioethanol — corpus variant oil 5 (SAP 126)
  "B3C": 75,  # B30K       — corpus variant oil 6 (SAP 126)

`_ELMHURST_MAIN_FUEL_TO_SAP10` water-side labels:

  "Bio-liquid HVO from used cooking oil": 71,
  "Bio-liquid FAME from animal/vegetable oils": 73,
  "Bioethanol": 76,
  "B30K": 75,

Values are direct Table 32 codes (the bio-liquid codes 71/73/75/76
don't collide with any API enum value so they pass through
`unit_price_p_per_kwh` etc. unchanged). Spec: SAP 10.2 Table 12
(PDF p.189) notes (d)/(e)/(f).

Pre-slice all 5 oil 2-6 variants raised `MissingMainFuelType` per
S0380.132. Post-mapper-extension cascade results:

  oil 2 (HVO):       SAP / cost / CO2 / PE all EXACT first try ✓
  oil 5 (Bioethanol): SAP / cost / CO2 / PE all EXACT first try ✓
  oil 3 (FAME):      SAP +17.34, cost −£398
  oil 4 (FAME alt):  SAP +16.06, cost −£367
  oil 6 (B30K):      SAP +3.05,  cost −£70

Slice S0380.131 had left a deferred TODO in `table_32.py` for FAME
code 73 ("worksheet 7.64 vs spec 5.44 — flipping has no measurable
cascade effect today, deferred until a cert that exercises it
surfaces"). Now exercised — flipping `73: 5.44 → 7.64` closes 85 %
of the oil 3/4 cost gap:

  oil 3 (FAME):      SAP +17.34 → +2.59,  cost −£398 → −£62
  oil 4 (FAME alt):  SAP +16.06 → +2.56,  cost −£367 → −£57

The Elmhurst-engine canonical 7.64 ↔ spec PDF 5.44 divergence is the
same pattern S0380.131 applied to heating oil (code 4: 7.64 → 5.44)
per [[feedback-software-no-special-handling]].

Remaining residuals on oil 3 / oil 4 / oil 6 are cascade-side
(HW kWh under by ~250-900, SH demand small diff, CO2/PE blend
artifacts) — pinned at observed values as forcing functions for
follow-up slices. Open fronts:
  - HW kWh discrepancy on FAME (cascade applies different efficiency
    path than Elmhurst for SAP codes 128/129)
  - B30K (oil 6) Δcost −£70 with prices matching: SH/HW kWh gap

Closures `oil 2` / `oil 5`: ±0.0000 on all 4 metrics. Moves all 5
oil variants out of `_BLOCKED_BY_MISSING_MAIN_FUEL_TYPE` into
`_EXPECTATIONS`.

Blocked tier now: 6 variants (community heating × 5, no system).
Cascade-OK tier: 32 variants (up from 30), 30 EXACT + 3 (oil 3/4/6)
pinned with non-zero residuals + 1 (pcdb 1 SH residual closed in
S0380.165).

Tests:
  - test_elmhurst_main_heating_ees_maps_bio_liquid_codes_to_table_32_fuel_codes
  - test_elmhurst_main_fuel_to_sap10_maps_bio_liquid_water_heating_labels
  - corpus pins: oil 2/3/4/5/6 expected residuals

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-02 10:14:10 +00:00
Khalim Conn-Kowlessar
1b1f45b679 Slice S0380.148: Table 4f — liquid fuel boiler flue fan and fuel pump (100 kWh/yr)
SAP 10.2 Table 4f (PDF p.174) "Electricity for fans, pumps and other
auxiliary uses" row:

  Liquid fuel boiler — flue fan and fuel pump   100 kWh/yr  c) d)

Note c): "Applies to all liquid fuel boilers that provide main heating,
but not if boiler provides hot water only. Where there are two main
heating systems include two figures from this table."

Pre-slice the cascade's `_table_4f_additive_components` only wired:
  - (230a) MEV / MVHR
  - (230e) Main 2 gas-boiler flue fan (45 kWh)
  - (230g) Solar HW pump

The liquid-fuel sibling row was missing — oil 1 worksheet (230d) and
oil pcdb 3 worksheet (230d) both lodge 100 kWh/yr "oil boiler pump"
that the cascade was silently skipping.

Implementation:

  - Add `_LIQUID_FUEL_CODES = frozenset({4, 71, 73, 75, 76})` and new
    `is_liquid_fuel_code(fuel_code)` helper in
    `domain/sap10_calculator/tables/table_32.py`. Mirror of
    `is_electric_fuel_code` — routes through `_to_table_32_code`
    normalisation so Elmhurst-derived Table 32 codes (e.g. code 23
    = bulk wood pellets, solid) don't collide with API enum codes
    (where 23 = B30D community).
  - Extend `_table_4f_additive_components` to add 100 kWh for Main 1
    when `is_liquid_fuel_code(main.main_fuel_type)` returns True
    (`isinstance(int)` guard for the `Union[int, str]` field). Mirror
    the same gate for Main 2 per Note c) "Where there are two main
    heating systems include two figures".
  - LPG is GAS (Table 4b/4f convention, Ecodesign classification) —
    `_LIQUID_FUEL_CODES` deliberately excludes 2/3/5/9 LPG codes.

Cascade impact across heating-systems corpus:

  | Variant   | SAP Δ       | Cost Δ      | PE Δ        |
  |-----------|-------------|-------------|-------------|
  | oil 1     | +1.18→+0.60 | -£27→-£14   | -276→-124   |
  | oil pcdb 1| +0.42→-0.15 |  -£10→+£3.4 |  -84→+67    |
  | oil pcdb 2| +0.42→-0.15 |  -£10→+£3.4 |  -84→+67    |
  | oil pcdb 3| +1.16→+0.59 | -£27→-£14   | -271→-120   |
  | pcdb 1    | +0.57→-0.03 | -£13→+£0.6  | -109→+42    |

Cohort closures: pcdb 1 EXACT (-0.03), oil pcdb 1/2 closed to -0.15.

Golden fixtures impact:

  - cert 0240 (dual-main oil combi 130): SAP integer 73→72 (resid
    +0→-1), PE +1.02→+2.52, CO2 +0.11→+0.14. Dual-main certs add
    2 × 100 = 200 kWh aux per Note c). Cert's published SAP 73
    suggests the dual-main Q_space split (main_heating_fraction)
    may also need wiring — slice candidate.
  - cert 0390 (Firebird PCDF 9005 oil combi): PE -28.50→-28.08
    (CLOSER to zero), CO2 -2.75→-2.73 (CLOSER to zero), SAP +7
    unchanged.

Test:
  test_sap_table_4f_liquid_fuel_boiler_flue_fan_and_fuel_pump_adds_
  100_kwh — asserts oil pcdb 3 inputs.pumps_fans_kwh_per_yr ≥ 230
  (130 base + 100 liquid fuel boiler aux).

Extended handover suite: 891 pass, 0 fail. Pyright net-zero (44=44).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-06-01 08:53:23 +00:00
Khalim Conn-Kowlessar
4d004790db Slice S0380.136: route _is_electric_main / _is_electric_water via the canonical T32-first normaliser (dual-fuel closure)
`_is_electric_main` and `_is_electric_water` hand-rolled a literal set
check `code in {10, 25, 29}` ∪ `{30..40}` to classify a fuel code as
electricity. The set conflated two enums:

  - {10, 25, 29}    — API enum codes (epc_codes.csv row main_fuel):
                        10 = electricity (backwards compat)
                        25 = electricity (community)
                        29 = electricity (not community)
  - {30, 31, ..., 40} — Table 32 codes (RdSAP 10 spec p.95):
                          30 = standard tariff
                          31/32 = 7-hour low/high
                          33/34 = 10-hour low/high
                          35 = 24-hour heating
                          38/40 = 18-hour high/low

API enum codes 1-29 collide with Table 32 codes 1-29 for unrelated
fuels — API 10 = "electricity" vs Table 32 10 = "dual fuel (mineral +
wood)". S0380.135's EES dispatch sets `main_fuel_type` to Table 32
codes (BDI → 10 for dual fuel), so a dual-fuel main was silently
mis-classified as electric. The `_space_heating_fuel_cost_gbp_per_kwh`
tariff branch then re-routed solid fuel 6's space heating cost through
the 18-hour-low electric rate (5.50 p/kWh) instead of dual-fuel 3.99
p/kWh — solid fuel 6 SAP residual −7.38 → −11.37 in S0380.135.

The fix promotes the existing `table_32._is_electric_code` to public
`is_electric_fuel_code` and routes both `_is_electric_main` and
`_is_electric_water` through it. The canonical helper normalises a
fuel code via T32-first then API-translate fallback (same convention
as `unit_price_p_per_kwh`), so a Table-32-code-10 dual-fuel main
classifies as non-electric correctly.

Subtle behaviour change: API enum code 25 ("electricity (community)")
maps via API_FUEL_TO_TABLE_32 to Table 32 code 41 ("heat from electric
heat pump (community)") which is a heat network billed at the heat-
network rate (4.24 p/kWh single rate), not at the off-peak electric
tariff. Pre-S0380.136 the literal-set check would have treated this
as direct electric and applied the Table 12a high/low-rate split —
that was wrong; community heat networks don't have an off-peak split.
The new canonical helper correctly excludes code 41 from
_ELECTRIC_FUEL_CODES.

Heating-systems corpus impact:

  solid fuel 6 (Dual Fuel Anthracite Wood, SAP 160):
    ΔSAP  −11.3731 → +1.9493   (now in cluster with other solid-fuel)
    Δcost +£268.44 → −£44.91
    ΔPE   unchanged (PE wasn't affected by the cost mis-routing)

No other corpus variants moved — none have `main_fuel_type` in the
ambiguous API/T32 collision range that was previously mis-classified.

Extended handover suite: 879 pass / 0 fail (+2 from new AAA tests
covering both `_is_electric_main` and `_is_electric_water` dual-fuel
non-electric classification + API code 29 → electric / API code 25 →
heat-network non-electric semantics).

Pyright net-zero on touched files (43 → 43).

No golden fixture impact — no golden cert lodges `main_fuel_type=10`
(dual fuel) on the cascade path.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-31 13:37:14 +00:00
Khalim Conn-Kowlessar
14eee259b4 Slice S0380.131: flip Table 32 heating-oil price 7.64 → 5.44 (empirical)
The published RdSAP 10 Specification 10-06-2025 PDF Table 32 (p.95)
lists heating oil at 7.64 p/kWh. Two independent operational sources
both use 5.44 p/kWh for the same fuel:

  - Elmhurst P960 worksheets across all five oil-fired variants in
    `sap worksheets/heating systems examples/` (oil 1, oil pcdb 1/2/3,
    pcdb 1) lodge 5.4400 p/kWh on (240) "Space heating - main system 1"
    and (247) "Water heating (other fuel)" for every "FuelType: Heating
    oil" worksheet.
  - The gov.uk EPC register's lodging software back-solves to ~5.48
    p/kWh from cert 0240-0200-5706-2365-8010's lodged SAP 73 (oil + PV
    detached, age J). With heating-oil at 5.44 in the cascade this cert
    closes to ΔSAP = 0 exactly against its lodged value.

The BRE technical papers (`docs/specs/sap10 technical papers/`) carry
no Table 32 errata or fuel-price update, so the change is grounded in
empirical cross-source evidence rather than a spec citation — the
worksheet PDF is the source of truth per the project convention.

Post-flip residuals:

  Heating-systems corpus (cascade − worksheet ΔSAP_c):
    oil 1        −9.7030 → +2.6578
    oil pcdb 1  −11.6343 → +0.4239  ← within 1 SAP of closure
    oil pcdb 2  −11.6343 → +0.4239
    oil pcdb 3  −10.8674 → +1.1597
    pcdb 1       −9.4083 → +6.9521  ← largest remaining oil-cohort gap

  Golden fixtures (cascade − lodged SAP):
    0240-0200-5706-2365-8010  resid −10 → +0  ← EXACT closure
    0390-2954-3640-2196-4175  resid  −6 → +7  ← oil-price bug was
                                                 masking +13 SAP of
                                                 opposite-direction
                                                 cascade gaps; now
                                                 exposed for follow-up

PE / CO2 residuals are unaffected by the unit-price flip (cost-only
change). The 41-variant corpus regression guard (S0380.129) holds; all
other golden cohorts pass unchanged. Extended handover suite: 874 pass.

Bio-FAME (code 73) shows the inverse divergence on oil 3/4 worksheets
(worksheet 7.64 vs spec 5.44 — possible row-swap typo in the spec PDF)
but flipping it has no measurable cascade effect today, so deferred
until a cert that exercises it surfaces.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-31 09:21:29 +00:00
Khalim Conn-Kowlessar
a7b08a4e8f refactor: move docs/sap-spec/ contents into domain/sap10_calculator/
Locality of reference — SAP-specific docs, specs, and runtime data
now live alongside the calculator that consumes them, mirroring the
prior packages→domain layout moves.

Move targets:

- Narrative MDs → domain/sap10_calculator/docs/
    NEXT_AGENT_PROMPT.md, HANDOVER_NEXT.md, SAP_CALCULATOR.md
- Spec PDFs → domain/sap10_calculator/docs/specs/
    RdSAP 10 Specification 10-06-2025.pdf
    PCDF_Spec_Rev-06b_12_May_2021.pdf
    sap-10-2-full-specification-2025-03-14.pdf
    sap-10-3-full-specification-2026-01-13.pdf
- PCDB runtime data → domain/sap10_calculator/tables/pcdb/data/
    pcdb10.dat (8.3MB) + 7× pcdb_table_*.jsonl (18MB total)

Path code rewrites (load-bearing):

- tables/pcdb/__init__.py: replaced parents[4]/'docs'/'sap-spec' with
  Path(__file__).resolve().parent/'data' for Table 105 JSONL loading.
- tables/pcdb/postcode_weather.py: same rebase for the pcdb10.dat path
  read by _postcode_climate_table().
- tables/pcdb/etl.py __main__: same rebase for the manual ETL invocation
  (source + output_dir both now point inside the package).
- tests/test_pcdb_etl.py: _PCDB_DAT_PATH now derives from
  parents[1]/'tables'/'pcdb'/'data' (was parents[3]/'docs'/'sap-spec').

Citation rewrites:

- 12 .py docstrings and 4 .md docs (ADRs + READMEs + narrative docs)
  had `docs/sap-spec/<file>` strings rewritten to their new locations.
- Two cases where the catch-all sed misfired (an ADR-0009 line about a
  PCDB extract; the pcdb __init__.py docstring about ETL output) were
  hand-corrected to point at tables/pcdb/data/ rather than docs/specs/.

docs/sap-spec/ is now empty (will be removed in a follow-up sweep or
left as a vestigial empty dir for future repurposing). ADRs 0009 and
0010 remain at docs/adr/ — they're part of the chronological
cross-cutting decision log, not calculator-specific narrative.

Verified:

- Calculator's 1e-4 production gate
  (test_api_001479_full_chain_sap_matches_worksheet_pdf_exactly) GREEN.
- Wider sweep (domain/sap10_calculator/ + domain/sap10_ml/): 1654
  passed / 20 failed — exact pre-move baseline. All 20 failures
  pre-existing (10 hand-built skeleton + 4 cohort chain + 6 cohort
  diff).
- Pyright net-zero on the 4 touched runtime/test files (0 errors)
  and unchanged on heat_transmission.py (13) / cert_to_inputs.py (35) /
  mapper.py (33).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-26 13:17:18 +00:00
Khalim Conn-Kowlessar
29ac35ccbe refactor: lift-and-shift packages/domain/src/domain/sap → domain/sap10_calculator
Migration of the SAP 10.2 calculator package from the uv-workspace
src-layout (`packages/domain/src/domain/sap`) to the root-level layout
(`domain/sap10_calculator`), matching the pattern already used by
`domain.addresses` / `domain.tasks` / `domain.postcode`.

Changes:

- `git mv packages/domain/src/domain/sap → domain/sap10_calculator`
  (92 files; git auto-detected all as renames so blame/history is
  preserved).
- Subpackage rename: `domain.sap` → `domain.sap10_calculator`. 48
  Python files rewritten (`from domain.sap.X` → `from domain.sap10_
  calculator.X`); zero remaining `domain.sap` refs after the sed pass.
- Path-string updates: 3 .py files (test fixtures + xlsx loader) +
  6 markdown docs (CONTEXT.md, 2 ADRs, 3 sap-spec docs, sap10_
  calculator/README.md) had hard-coded `packages/domain/src/domain/
  sap/...` paths rewritten to `domain/sap10_calculator/...`.
- `Path(__file__).parents[N]` rebasing: the old tree was 3 levels
  deeper than the new one (`packages/domain/src/`), so 4× `parents[7]`
  became `parents[4]` and 1× `parents[6]` became `parents[3]` across
  `tables/pcdb/{__init__.py, postcode_weather.py, etl.py}`,
  `worksheet/tests/_xlsx_loader.py`, and `tests/test_pcdb_etl.py`.
- PEP 420 namespace package: deleted both `domain/__init__.py`
  (root + workspace, both load-bearing only as empty/docstring) so
  Python combines `domain.sap10_calculator` (root) and `domain.ml`
  (workspace) into one namespace package. Confirmed via
  `domain.__path__ == ['/workspaces/model/domain',
  '/workspaces/model/packages/domain/src/domain']`. Without this,
  the root `domain/__init__.py` shadowed the workspace one and
  `domain.ml` was unreachable.

Verified:

- Full sweep (`backend/documents_parser/tests/test_summary_pdf_
  mapper_chain.py + domain/sap10_calculator/worksheet/tests/test_
  e2e_elmhurst_sap_score.py + domain/sap10_calculator/rdsap/tests/
  test_golden_fixtures.py`): 99 passed / 19 failed — exact same
  counts as pre-refactor. All 19 failures pre-existing (9 hand-built
  001479 + 6 cohort diff + 4 cohort chain non-spec).
- Wider sweep (all sap10_calculator + domain.ml): 1654 passed /
  20 failed (the +1 vs the focused sweep is the pre-existing
  `test_roof_insulated_assumed_with_ni_thickness_uses_50mm_per_
  section_5_11_4` which was already failing on the previous baseline).
- Pyright net-zero on the three load-bearing baselines:
  `heat_transmission.py` 13, `cert_to_inputs.py` 35, `mapper.py` 33.

Lift-and-shift only — no semantic renames (`Sap10Calculator` stays
`Sap10Calculator`), no testpaths edits in pytest.ini (sap tests
continue to be invoked by explicit pytest paths).

Note: `domain.ml` still lives at `packages/domain/src/domain/ml/`.
Migrating it would close out the dual-`domain/` layout but is
out of scope for this commit.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-26 12:22:37 +00:00
Renamed from packages/domain/src/domain/sap/tables/table_32.py (Browse further)