docs: SPEC_COVERAGE PCDB integration row + slice progress + gap-list update

Updates the Prioritised gap list item 1 narrative: Table 105 (gas/oil boilers) integration done; remaining = Table 362 heat pumps + Appendix N cascade, equation D1 monthly water heating, Tables 313/353/391/506 ancillaries, condensing-boiler Ecodesign corrections.

Adds a PCDB slice progress table: ETL parser + 8-table JSONL output (`fe04cd3a`), runtime lookup module (`23678228`), cert_to_inputs precedence cascade with widened golden tolerance (`a104dd55`).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Khalim Conn-Kowlessar 2026-05-21 09:51:10 +00:00
parent a104dd559a
commit e63516cb26


@ -2,7 +2,7 @@
Tracks which sections of the SAP 10.2 specification are implemented in `packages/domain/src/domain/sap/`. Per ADR-0009 the calculator is built from the spec, not reverse-engineered from cert data. This doc is the worksheet-driven roadmap for what remains.
Updated 2026-05-21 after §8c (slices `cf28eec4`–`f3797066`), §8f (`43cc16bc`), and §9a single-main slices (`2b5fc6a5`–`380b6781`).
Updated 2026-05-21 after §8c (slices `cf28eec4`–`f3797066`), §8f (`43cc16bc`), §9a single-main slices (`2b5fc6a5`–`380b6781`), and PCDB Table 105 integration (`fe04cd3a`–`a104dd55`).
The canonical SAP10.2 algorithm lives in [`2026-05-19-17-18 RdSap10Worksheet.xlsx`](../../2026-05-19-17-18%20RdSap10Worksheet.xlsx) at the repo root — each line ref `(1)..(486)` maps to a cell. The worksheet sub-modules under `packages/domain/src/domain/sap/worksheet/` implement those line refs directly; Elmhurst worksheets validate end-to-end via `tests/_elmhurst_worksheet_*.py`.
@ -55,7 +55,7 @@ The canonical SAP10.2 algorithm lives in [`2026-05-19-17-18 RdSap10Worksheet.xls
## Prioritised gap list (by likely MAE impact)
1. **Boiler / heat-pump efficiency Manufacturer override (PCDB integration)** — `MainHeatingDetail` lodges the PCDB pointer (`main_heating_index_number`) but no scalar efficiency. With `NoOpPcdbLookup` (ADR-0009 grill outcome #1) still in place, `cert_to_inputs` falls back to the SAP10 Table 4a category default (typically 0.80 for gas boilers, SCOP 2.30 for heat pumps) on every cert. Per [ADR-0010 §4](../adr/0010-sap10-calculator-spec-target-and-validation.md#4-pcdb-integration-is-promoted-from-session-c-to-a-prerequisite) this accounts for ~19 SAP points of MAE on heat-pump certs and most per-cert variance on the 78 % of gas-boiler certs lodging `main_heating_data_source=1` (PCDB-typical 0.88–0.94 vs 0.80 default). Directly visible on 000490 e2e: `inputs.main_heating_efficiency = 0.80` vs an unverified PDF Manufacturer-declared figure that surfaced from a previous agent's notes — the actual PDF value cannot be confirmed without PCDB. Closing requires a real PCDB CSV ingest + `PcdbLookup` Protocol impl + precedence wiring in `cert_to_inputs._main_heating_efficiency` and `_water_efficiency_with_category_inherit`. Promoted to prerequisite under ADR-0010, not a section-sweep slice. **§9a ALL_FIXTURES PDF-derived LINE_206/(211)/(215) pinning is blocked on this** — until PCDB lands, §9a conformance is at the synthetic + cert-round-trip level only (no PDF cross-check).
1. **Boiler / heat-pump efficiency Manufacturer override (PCDB integration) — Table 105 done; Table 362 heat pumps + equation D1 monthly water + Appendix N HP factor remain.** Gas/oil boilers (Table 105) now flow via PCDB: when a cert lodges `main_heating_index_number`, `domain.sap.tables.pcdb.gas_oil_boiler_record(...)` is consulted; the PCDB winter efficiency overrides `seasonal_efficiency(...)` for space heating, and the PCDB summer efficiency overrides the Table 4a scalar for water heating (per Appendix D2.1). Two of the six PCDB-listed golden-corpus certs drifted by +1 SAP / -1.5 kWh/m² PE under the spec-faithful override (tolerance widened accordingly). **Remaining**: (a) Table 362 heat pumps still resolve via `seasonal_efficiency(main_category=4)` → 2.30 SCOP fallback; the Appendix N in-use factor (0.95), MCS factor (×1.39 GSHP), and design-flow-temp adjustment are all deferred. (b) Equation D1 monthly water heating cascade (currently a scalar approximation), which costs roughly single-digit-percent precision on HW kWh for combi boilers. (c) Table 313 FGHRS, Table 353 WWHRS, Table 391 storage heaters, and Table 506 HIU records are parsed into JSONL but unconsumed; they wait for the first cert that needs them.
2. **Table 11 Secondary heating allocation** — most boiler-main certs allocate 10% of space heating to a secondary system (often a less-efficient room heater on a different fuel). We model 0%. Likely +1-2 SAP-point bias on affected certs.
3. **Wind-shelter factor on infiltration** (§2 worksheet lines 19-21) — multiplies infiltration by `1 - 0.075 × sheltered_sides`. We have no shelter input; assume 2 sheltered sides default. Net effect on infiltration ACH probably ~10%.
4. **Table 12a high-rate fraction for off-peak dwellings** — we currently bill 100% of E7 space heating at the low rate. Real spec says e.g. heat pumps on 7h tariff at 80% high-rate. Affects ~5% of certs.
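
The wind-shelter factor in item 3 is plain arithmetic, so a small worked sketch may help when sizing the gap. The function names below are hypothetical and are not taken from `packages/domain`; only the `1 - 0.075 × sheltered_sides` expression and the default of 2 sheltered sides come from the list above. The 0 to 4 range check is an assumption.

```python
def shelter_factor(sheltered_sides: int) -> float:
    """Shelter factor as stated in gap-list item 3: 1 - 0.075 x sheltered sides."""
    # Assumption: a dwelling has between 0 and 4 sheltered sides.
    if not 0 <= sheltered_sides <= 4:
        raise ValueError("sheltered_sides must be between 0 and 4")
    return 1.0 - 0.075 * sheltered_sides


def sheltered_infiltration_ach(infiltration_ach: float, sheltered_sides: int = 2) -> float:
    """Scale an infiltration rate (air changes per hour) by the shelter factor.

    The default of 2 sheltered sides mirrors the default mentioned in item 3.
    """
    return infiltration_ach * shelter_factor(sheltered_sides)


if __name__ == "__main__":
    # Illustrative input only: 0.60 ACH before the shelter adjustment.
    print(round(shelter_factor(2), 3))
    print(round(sheltered_infiltration_ach(0.60), 3))
```

With no shelter input on the cert, every dwelling would take the same factor, so the sketch only shows the size of the adjustment, not per-cert variance.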
@ -294,3 +294,20 @@ Status now: 100-cert MAE 4.49, 300-cert MAE 5.45, bias near zero (±0.2). Worksh
4. **Table 4f pumps/fans breakdown** (230a)-(230h) → (231) — replace opaque `CalculatorInputs.pumps_fans_kwh_per_yr` scalar with per-source sub-lines. Likely a separate sweep slice once aux electricity becomes load-bearing for ranking.
5. **Appendix Q items** (236)/(237) — placeholder until a Q-item cert lands.
6. **(238) total delivered energy on SapResult** — promote from `intermediate` dict when §10a or §13 requires it as a named output.
## PCDB — slice progress (BRE pcdb10.dat ingestion)
| Stage | Description | Status | Commit |
|---|---|---|---|
| ETL parser + 8 tests | `domain.sap.tables.pcdb.parser`: typed `GasOilBoilerRecord` + `RawPcdbRecord`. Ground-truth verified against ncm-pcdb.org.uk for Baxi 000098 / Potterton 000619 / Saunier 000732. Handles latin-1 encoding (degree-sign in addresses), `'obsolete'` status string, `'>70kW'` range indicator. | ✅ | `fe04cd3a` |
| ETL `run_etl` writes 8 JSONL files | One newline-delimited JSON file per table (105 typed; 122/143/313/353/362/391/506 raw). 17MB total. Runnable via `PYTHONPATH=packages/domain/src python -m domain.sap.tables.pcdb.etl`. Idempotent; commit JSONL alongside source `pcdb10.dat`. | ✅ | `fe04cd3a` |
| Runtime lookup `gas_oil_boiler_record(pcdb_id)` | `domain.sap.tables.pcdb` loads Table 105 NDJSON at import; ~50ms one-off, O(1) lookups thereafter. Returns None for unknown PCDB IDs → caller falls back to Table 4a/4b cascade. | ✅ | `23678228` |
| cert_to_inputs precedence (Table 105 only) | Appendix D2.1: PCDB winter overrides `main_heating_efficiency`; PCDB summer overrides `water_efficiency` scalar. Heat-network DLF override still wins where applicable. None of the 6 Elmhurst fixtures lodge a PCDB pointer; the golden-corpus certs that do lodge one see real efficiency changes (golden tolerance widened ±5 → ±7). | ✅ | `a104dd55` |
| Heat pump Appendix N cascade via Table 362 | Apply Appendix N in-use factor (×0.95), MCS installation factor (×1.39 for GSHP MCS-installed), design flow temperature adjustment. Replace SCOP 2.30 Table 4a fallback for `main_category=4`. | ⏸ deferred (typed Table 362 parser + Appendix N cascade) | — |
| Equation D1 monthly water heating cascade | Spec D2.1 (2): η_water_monthly = (Q_space + Q_water) / (Q_space/η_winter + Q_water/η_summer). Promotes water_eff scalar → 12-tuple. Refactor of `_hot_water_fuel_kwh_per_yr`. | ⏸ deferred (single-digit-% HW kWh precision for combi boilers) | — |
| Solid fuel boiler precedence via Table 122 | PCDB override for `main_category=3` (solid fuel) — typed parser + wiring. | ⏸ deferred | — |
| Micro-CHP precedence via Table 143 | PCDB override for micro-CHP systems (Appendix N path). | ⏸ deferred | — |
| Ancillary cascades: Table 313 (FGHRS), 353 (WWHRS), 391 (HHR storage), 506 (HIU) | Typed parsers + cert-side wiring per spec rules. JSONL files exist; consumers don't. | ⏸ deferred | — |
| Table D1/D2/D3 condensing-boiler control-class corrections | Apply Ecodesign control-class + design-flow-temp adjustments on top of PCDB winter efficiency. Requires cert lodgement of control class + flow temp. | ⏸ deferred (no fixture lodges these yet) | — |
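
The lookup and precedence rows in the table above can be sketched roughly as follows. This is a minimal illustration, not the code in `domain.sap.tables.pcdb` or `cert_to_inputs`: the record fields (`pcdb_id`, `winter_efficiency`, `summer_efficiency`), the function names, and the two sample records are all hypothetical and do not describe real PCDB entries. It assumes efficiencies are stored as fractions and that the Table 4a default applies to both space and water heating when no record is found.

```python
import io
import json
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Tuple


@dataclass(frozen=True)
class BoilerRecord:
    # Hypothetical field names; the real typed record may differ.
    pcdb_id: str
    winter_efficiency: float
    summer_efficiency: float


def load_index(lines: Iterable[str]) -> Dict[str, BoilerRecord]:
    """Parse newline-delimited JSON once into a dict for O(1) lookups."""
    index: Dict[str, BoilerRecord] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        raw = json.loads(line)
        index[raw["pcdb_id"]] = BoilerRecord(
            raw["pcdb_id"], raw["winter_efficiency"], raw["summer_efficiency"]
        )
    return index


# Made-up sample data standing in for the Table 105 JSONL file.
_SAMPLE = io.StringIO(
    '{"pcdb_id": "900001", "winter_efficiency": 0.89, "summer_efficiency": 0.81}\n'
    '{"pcdb_id": "900002", "winter_efficiency": 0.92, "summer_efficiency": 0.83}\n'
)
_INDEX = load_index(_SAMPLE)


def lookup(pcdb_id: Optional[str]) -> Optional[BoilerRecord]:
    """Return the record, or None for a missing or unknown PCDB id."""
    if pcdb_id is None:
        return None
    return _INDEX.get(pcdb_id)


def resolve_efficiencies(
    pcdb_id: Optional[str], table_4a_default: float
) -> Tuple[float, float]:
    """Return (space_heating_eff, water_heating_eff).

    A PCDB record wins when present: winter for space heating, summer for
    water heating. Otherwise both fall back to the category default.
    """
    record = lookup(pcdb_id)
    if record is None:
        return table_4a_default, table_4a_default
    return record.winter_efficiency, record.summer_efficiency


if __name__ == "__main__":
    print(resolve_efficiencies("900001", 0.80))
    print(resolve_efficiencies(None, 0.80))
```

Returning `None` from the lookup, rather than raising, keeps the fallback decision in the caller, which matches the "returns None → caller falls back" behaviour described in the table.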
**Impact:** Closes the ADR-0010 §4 prerequisite for gas/oil boilers. Heat pumps remain on SCOP 2.30 Table 4a fallback — ~19 SAP-point MAE on HP certs per ADR-0010 §4 persists until the Table 362 + Appendix N slice lands. §9a ALL_FIXTURES PDF-derived LINE_206/(211)/(215) pinning is *still blocked*: the 6 Elmhurst fixtures don't lodge `main_heating_index_number`, so adding PDF-grounded efficiency pins requires either (a) verifying each fixture's actual boiler against ncm-pcdb.org.uk + adding the PCDB ID to fixture builders, or (b) waiting for a real-corpus golden cert to validate against.
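
For the deferred Equation D1 row in the slice table, the monthly formula can be sketched as below. This is an illustration under stated assumptions, not the planned refactor of `_hot_water_fuel_kwh_per_yr`: the function names and the zero-demand guard are hypothetical, and "winter" and "summer" in the formula are read as the winter and summer seasonal efficiencies.

```python
from typing import Sequence, Tuple


def monthly_water_efficiency(
    q_space: float, q_water: float, winter_eff: float, summer_eff: float
) -> float:
    """Combined efficiency for one month, following the formula in the table:

    (Q_space + Q_water) / (Q_space / winter + Q_water / summer)
    """
    total = q_space + q_water
    if total <= 0:
        # Assumption: with no demand at all, fall back to the summer value.
        return summer_eff
    return total / (q_space / winter_eff + q_water / summer_eff)


def water_efficiency_12(
    q_space_monthly: Sequence[float],
    q_water_monthly: Sequence[float],
    winter_eff: float,
    summer_eff: float,
) -> Tuple[float, ...]:
    """Promote the scalar water efficiency to a 12-tuple, one value per month."""
    if len(q_space_monthly) != 12 or len(q_water_monthly) != 12:
        raise ValueError("expected 12 monthly values for each demand series")
    return tuple(
        monthly_water_efficiency(qs, qw, winter_eff, summer_eff)
        for qs, qw in zip(q_space_monthly, q_water_monthly)
    )


if __name__ == "__main__":
    # Illustrative demands in kWh; not taken from any cert or fixture.
    space = [900, 800, 600, 300, 100, 0, 0, 0, 100, 400, 700, 900]
    water = [200] * 12
    for month, eff in enumerate(water_efficiency_12(space, water, 0.90, 0.80), start=1):
        print(month, round(eff, 4))
```

In months with no space heating the result collapses to the summer efficiency, and as space-heating demand dominates it approaches the winter efficiency, which is the behaviour a single scalar cannot capture.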