mirror of
https://github.com/Hestia-Homes/Model.git
synced 2026-06-08 11:17:27 +00:00
docs: handover + next-agent prompt post S0380.115..124
10 spec-cited slices closed this session:

  .115 — fixture ECF pin typo
  .116 — RdSAP 10 §15 A_RR_shell rounding (cert 000565 truly exact)
  .117 — re-pin golden PE residuals for 0240 + 6035
  .118 — cohort LINE_xx pins → 1e-4 + §15-aware RR test expecteds
  .119 — §5 test EPC builder propagates sap_roof_windows
  .120 — RdSAP 10 §5.11.4 NI vs explicit-0 roof discriminator
  .121 — floor_construction code 4 → "Solid" (basement cert 0712)
  .122 — tighten test_ventilation tolerances
  .123 — pin Table U5 share-column solar fluxes at exact equality
  .124 — tighten dimensions + rating arithmetic pins

Extended handover suite at HEAD `1e69bd39`: 775 pass, 0 fail.

Handover documents:
- HANDOVER_POST_S0380_124.md — full state + cert 0240 hypothesis ranking
- NEXT_AGENT_PROMPT_POST_S0380_124.md — two-task brief (0240 cost-cascade
  diagnosis + golden-corpus audit awaiting user's same-property
  heating-variant Elmhurst fixtures).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
parent
1e69bd3979
commit
8904ec090b
2 changed files with 477 additions and 0 deletions
243
domain/sap10_calculator/docs/HANDOVER_POST_S0380_124.md
Normal file
@ -0,0 +1,243 @@
# Handover — post Slices S0380.115..124

Branch: `feature/per-cert-mapper-validation`. **HEAD `1e69bd39`**.
Predecessor: [`HANDOVER_POST_S0380_114.md`](HANDOVER_POST_S0380_114.md).

## TL;DR

10 spec-cited slices landed on top of `cc70e559`:

| Slice | Commit | Scope |
|---|---|---|
| **S0380.115** | `d0268a5b` | Fixture ECF pin transcription typo 5.3866 → 5.3868 (PDF line 593) |
| **S0380.116** | `f2e8b657` | A_RR_shell rounded to 2 d.p. per RdSAP 10 §15 (p.66) — cert 000565 truly exact |
| **S0380.117** | `854b8884` | Re-pin golden PE residuals for 0240 + 6035 |
| **S0380.118** | `55a29f5a` | Cohort LINE_xx pins 0.01/0.1 → 1e-4 + §15-rounded RR test expecteds |
| **S0380.119** | `a77f1a28` | Propagate sap_roof_windows in §5 test EPC builder (closed 000516 lighting) |
| **S0380.120** | `f0305d54` | Distinguish "NI" string from explicit int(0) roof_insulation_thickness per RdSAP 10 §5.11.4 |
| **S0380.121** | `e698fabc` | Map floor_construction code 4 → "Solid" (basement cert 0712) |
| **S0380.122** | `9f0dd645` | Tighten test_ventilation tolerances (17 hand-crafted + 10 cohort pins) |
| **S0380.123** | `49f87160` | Pin Table U5 share-column solar fluxes at exact equality |
| **S0380.124** | `1e69bd39` | Tighten dimensions + rating arithmetic pins |

**Extended handover suite at HEAD `1e69bd39`: 775 pass, 0 fail.**

Cert 000565 is now **TRULY EXACT** — every SAP-result pin ≤5e-5 vs U985 PDF display.

## Two-task handover for the next agent

### Task 1: Close cert 0240's remaining residual

Cert 0240's mapper gap was largely closed by the §5.11.4 fix (Slice 120),
but **a SAP-rating residual of −10 persists** alongside near-zero PE/CO2:

| Pin | Before Slice 120 | After Slice 120 (now) |
|---|---:|---:|
| `expected_sap_resid` | −14 | **−10** |
| `expected_pe_resid_kwh_per_m2` | +12.4933 | **+0.0542** |
| `expected_co2_resid_tonnes_per_yr` | +0.6957 | **+0.0626** |

PE and CO2 are essentially closed (sub-0.1 magnitude). The SAP residual
−10 means **cascade COST > lodged COST** while energy demand and CO2
match. The driver is in the fuel-cost / ECF path, not the heat-loss
path.

#### Cert 0240 shape

- Detached house (property_type=0, built_form=1), TFA 202 m², stone walls
- `walls`: "Sandstone, as built, insulated (assumed)" — solid stone
- `roofs`: "Pitched, 400+ mm loft insulation" — Table 16 row 400+ → U≈0.11
- `floors`: "Solid, insulated (assumed)" — §5.11.4 fired here too
- `main_heating`: "Boiler and radiators, oil" — Table 4a oil boiler
- `secondary_heating`: None
- `solar_water_heating`: N
- `photovoltaic_supply`: `none_or_no_details` (no PV)
- `mains_gas`: N (off-grid oil)
- SAP version 10.2

#### Hypothesis ranking

1. **Oil tariff routing**. SAP 10.2 Table 12 / RdSAP10 Table 32 oil
   price is 7.64 p/kWh. Cascade may be defaulting to a different
   tariff (e.g. electricity 13.19 p/kWh) for either main or secondary
   cost. The Δ in cost is a roughly 1.37× over-count (about £420 on a
   lodged cost of about £1139), which is consistent in direction with
   a mis-routed tariff.
2. **Hot water fuel routing**. Same oil boiler does HW. If HW cost
   routes via electricity tariff rather than oil, cost over-counts.
3. **Off-peak / 10-hour tariff (`meter_type=3`)**. The cert lodges
   `meter_type=3` (10-hour off-peak). For an oil-heated dwelling
   this means oil-for-heating + electricity-for-other on a 10-hour
   off-peak. The cascade may be applying electricity tariff to oil
   energy.
4. **Standing-charge mishandling**. Oil has no standing charge; if
   cascade adds gas/electricity standing charge, that's £120/yr —
   could account for some of the £420 cost residual.

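As a quick plausibility check on hypotheses 1 and 2, the cost gap can be converted into the amount of energy that would have to be billed at the wrong tariff. The sketch below uses only the figures quoted in this handover (£420 gap, 7.64 and 13.19 p/kWh); `mispriced_kwh` is a hypothetical helper, not part of the calculator.

```python
def mispriced_kwh(cost_gap_gbp: float, right_p_per_kwh: float, wrong_p_per_kwh: float) -> float:
    """Return the kWh that must be billed at the wrong tariff to create cost_gap_gbp."""
    delta_gbp_per_kwh = (wrong_p_per_kwh - right_p_per_kwh) / 100.0  # pence -> pounds
    if delta_gbp_per_kwh == 0:
        raise ValueError("tariffs are identical; mis-routing cannot create a gap")
    return cost_gap_gbp / delta_gbp_per_kwh


# Figures quoted in this handover; they are unverified claims, not spec values.
kwh = mispriced_kwh(420.0, right_p_per_kwh=7.64, wrong_p_per_kwh=13.19)
print(f"{kwh:.0f} kWh would need to be priced as electricity instead of oil")
```

If the result is close to the cascade's hot-water or space-heating delivered energy for this cert, that end use is the first place to look. If it matches neither, a partial mis-route or the standing-charge hypothesis becomes more likely.
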
#### Approach

1. Probe cascade's fuel-cost breakdown for 0240 (`result.intermediate`'s
   `main_heating_cost_gbp`, `hot_water_cost_gbp`, `pumps_fans_cost_gbp`,
   `lighting_cost_gbp`, `standing_charges_gbp`).
2. Back-solve: with cascade total cost vs lodged cost, identify which
   sub-component is over-counting.
3. Check what oil tariff lookup the cascade uses for this cert. Trace
   via `cert_to_inputs` → `_cost_per_kwh_for_fuel`.
4. Once the gap is localised, write an AAA test, fix per spec, re-pin
   `expected_sap_resid` to the new (smaller-magnitude) value.

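Steps 1 and 2 amount to a per-component diff. A minimal sketch follows, assuming both breakdowns are available as plain dicts keyed by the `result.intermediate` names listed above. `rank_cost_gaps` is a hypothetical helper, and the numbers are placeholders rather than values from cert 0240.

```python
from typing import Mapping


def rank_cost_gaps(
    cascade: Mapping[str, float], lodged: Mapping[str, float]
) -> list[tuple[str, float]]:
    """Return (key, cascade - lodged) for keys present in both, largest over-count first."""
    shared = cascade.keys() & lodged.keys()
    gaps = [(key, cascade[key] - lodged[key]) for key in shared]
    return sorted(gaps, key=lambda item: item[1], reverse=True)


# Placeholder numbers only; replace with the probed and lodged values.
cascade = {"main_heating_cost_gbp": 900.0, "hot_water_cost_gbp": 200.0, "lighting_cost_gbp": 80.0}
lodged = {"main_heating_cost_gbp": 600.0, "hot_water_cost_gbp": 190.0, "lighting_cost_gbp": 80.0}

for key, gap in rank_cost_gaps(cascade, lodged):
    print(f"{key:28s} {gap:+8.2f}")
```

Components that exist on only one side (for example a standing charge the lodged data does not break out) are skipped by the key intersection, so they need a separate check.
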
### Task 2: Audit golden corpus for fixture-coverage gaps

The user has supplied additional Elmhurst Summary + worksheet PDFs for
**the same property with multiple different heating systems**. These
will help cover shape gaps the current cohort doesn't exercise.

#### Why the residuals matter

Top remaining golden-corpus residuals (post-Slice 120):

| Cert | SAP res | PE res (kWh/m²) | CO2 res (t/yr) | Shape |
|---|---:|---:|---:|---|
| 0240-0200-5706-2365-8010 | −10 | +0.054 | +0.063 | Detached stone, oil boiler, TFA 202 — **task 1 above** |
| 0390-2954-3640-2196-4175 | −6 | **−26.4** | **−2.55** | TFA 360, oil + (?) PV cert |
| 6035-7729-2309-0879-2296 | −6 | **+46.1** | **+1.05** | TFA 128 mid-terrace age A, gas combi |
| 7536-3827-0600-0600-0276 | +1 | −7.08 | −0.19 | Gas combi |
| 2130-1033-4050-5007-8395 | +1 | −7.50 | −0.05 | Gas combi + PV |

All other cohort-2 certs sit at SAP=0, sub-1 PE/CO2.

The biggest residuals (6035 +46 PE, 0390 −26 PE) are documented mapper
gaps in the cert `notes:` field. Each is a real cascade-vs-API
divergence that needs a PDF reference (Summary + worksheet) to
diagnose.

#### Why deterministic-cohort fixtures help

The 6 cohort fixtures (000474..000516) + 000565 are the only certs
pinned at PDF-exact precision (abs=1e-4 against U985 PDF line refs).
The golden corpus is pinned at the **calc-vs-API-lodged** residual,
which means we accept whatever residual the cascade produces and pin
against it. Closing those residuals requires:

1. Source-of-truth worksheet PDF for the cert (currently we don't have
   one for 0390, 6035, etc.)
2. Identify per-section cascade drift line-by-line
3. Implement the missing spec rule
4. Re-pin the smaller residual

**The user's incoming Elmhurst worksheets (same property, multiple heating
systems) will fill specific shape gaps.** Specifically: same envelope but
different heating → isolates the heating-cascade impact on SAP / PE / CO2
per fuel type. This is exactly the controlled-variable test we need to
pin oil / heat-pump / electric / heat-network cascades against PDF
precision rather than API residual.

#### Approach

1. Wait for the user's new fixtures. Drop them into `backend/documents_parser/tests/fixtures/`
   (Summary PDFs) and `sap worksheets/` (U985 worksheet PDFs).
2. For each variant (same property × different heating), run extractor
   → mapper → calculator and pin against the worksheet PDF.
3. The first cert is the e2e baseline; subsequent certs share the
   envelope so cascade differences localise to the heating subsystem
   only.
4. Each variant becomes a new mapper-driven fixture (mirror of
   `_elmhurst_worksheet_000565.py` pattern).

## Test baseline at HEAD `1e69bd39`

```bash
PYTHONPATH=/workspaces/model python -m pytest \
  backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
  backend/documents_parser/tests/test_elmhurst_extractor.py \
  backend/documents_parser/tests/test_elmhurst_end_to_end.py \
  domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
  domain/sap10_calculator/worksheet/tests/test_heat_transmission.py \
  domain/sap10_calculator/worksheet/tests/test_internal_gains.py \
  domain/sap10_calculator/worksheet/tests/test_solar_gains.py \
  domain/sap10_calculator/worksheet/tests/test_dimensions.py \
  domain/sap10_calculator/worksheet/tests/test_rating.py \
  domain/sap10_calculator/worksheet/tests/test_ventilation.py \
  domain/sap10_calculator/worksheet/tests/test_appendix_h_solar.py \
  domain/sap10_calculator/worksheet/tests/test_mev.py \
  domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py \
  domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
  domain/sap10_calculator/tests/test_pcdb_table_322_lookup.py \
  domain/sap10_calculator/tests/test_pcdb_table_329_lookup.py \
  --no-cov -q
```

Expected: **775 pass, 0 fail**.

## Memories to load (in order)

1. `project_cert_000565_recovery_state` — full per-slice history at HEAD `1e69bd39`
2. `feedback_sap_10_2_only_never_10_3` — **CRITICAL** — never reference SAP 10.3
3. `feedback_spec_citation_in_commits` — quote spec text + page in commits
4. `feedback_verify_handover_claims` — verify numeric claims against PDFs
5. `feedback_zero_error_strict` — pyright net-zero per touched file
6. `feedback_commit_per_slice` — one slice = one commit
7. `feedback_aaa_test_convention` — literal `# Arrange / # Act / # Assert` headers
8. `feedback_e2e_validation_philosophy` — abs=1e-4 pins, no rel/xfail
9. `feedback_abs_diff_over_pytest_approx` — use `abs(x-y) <= tol` for new tests
10. `feedback_spec_floor_skepticism` — verify "precision floor" claims against PDFs
11. `feedback_verify_handover_claims` — same skepticism for handover narratives
12. `feedback_golden_residuals_near_zero` — pins should shrink toward zero
13. `feedback_worksheet_not_api_reference` — worksheet PDF is source of truth, not API EPC
14. `reference_unmapped_sap_code` — calculator strict-raise pattern
15. `reference_unmapped_api_code` — mapper strict-raise pattern
16. `project_sap10_ml_deprecation` — `domain/sap10_ml/` is retiring

## Spec source quick-reference

All under `domain/sap10_calculator/docs/specs/`:

- **SAP 10.2 full spec**: `sap-10-2-full-specification-2025-03-14.pdf`
  - §13 + Table 12 (p.191) — fuel cost / ECF / SAP rating
  - Appendix N (p.101-107) — heat pumps
- **RdSAP 10 spec**: `RdSAP 10 Specification 10-06-2025.pdf`
  - §5.11.4 (p.44) — retrofit roof insulation (closed in Slice 120)
  - §15 (p.66) — rounding rules (closed in Slice 116)
  - §19 Table 32 (p.95) — RdSAP10 fuel prices / CO2 / PE factors
- **SAP 10.3** at `sap-10-3-full-specification-2026-01-13.pdf`:
  **DO NOT reference** ([[feedback-sap-10-2-only-never-10-3]])

## Standard workflow per slice

1. Read spec page + identify rule
2. Probe cascade vs lodged values; back-solve hypothesis
3. Write failing AAA test
4. Implement helper / cascade change
5. Verify test passes
6. Run handover suite (above command)
7. Check pyright on touched files — net-zero from baseline (`git stash` + re-run pyright)
8. Commit with spec citation + verbatim quote + `Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>`
9. Update `project_cert_000565_recovery_state` (rename if pivoting away) + `MEMORY.md` index

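For step 3, the shape of an AAA test with an absolute-difference assertion might look like the sketch below. `sap_rating_linear` is a stand-in defined inline so the example is self-contained; it is not the repository's rating function, and the expected value is a hand calculation (100 − 13.95 × 2.0 = 72.1).

```python
def sap_rating_linear(ecf: float) -> float:
    """Stand-in function under test: SAP = 100 - 13.95 * ECF (linear branch only)."""
    return 100.0 - 13.95 * ecf


def test_sap_rating_linear_matches_hand_calculation() -> None:
    # Arrange
    ecf = 2.0
    expected = 72.1  # hand-calculated: 100 - 13.95 * 2.0
    tol = 1e-4

    # Act
    actual = sap_rating_linear(ecf)

    # Assert
    assert abs(actual - expected) <= tol
```

The explicit `abs(actual - expected) <= tol` keeps the tolerance visible at the assertion site, which matches the convention named in the memories list.
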
## What NOT to do

- **Don't reference SAP 10.3** — track 10.2 deliberately
- **Don't widen pin tolerances** to make pins pass — find the bug
- **Don't re-investigate any closed work** (Slices .91..124) — all settled
- **Don't add new helpers to `domain/sap10_ml/`** — on the deprecation path
- **Don't trust handover numeric claims without verifying** against source PDF
- **Don't accept "spec-precision floor" framing** without spec-citation work

## Where to put new Elmhurst fixtures

When the user supplies the new worksheets:

- Summary PDFs → `backend/documents_parser/tests/fixtures/Summary_<refno>.pdf`
- U985 worksheet PDFs → `sap worksheets/<source-folder>/U985-0001-<refno>.pdf`
- Per-cert fixture module → `domain/sap10_calculator/worksheet/tests/_elmhurst_worksheet_<refno>.py`
  (mirror `_elmhurst_worksheet_000565.py` shape — mapper-driven `build_epc()`)
- Add to `_FIXTURE_PINS` + `_FIXTURE_MODULES` in
  `domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py`
- AAA tests for any new mapper gaps go in
  `backend/documents_parser/tests/test_summary_pdf_mapper_chain.py`

The user's "same property, multiple heating systems" pattern is ideal:
the envelope stays constant across variants, so any SAP/PE/CO2 difference
is fully attributable to the heating cascade. That's the cleanest possible
test vector for heating-section diagnostics.

Good luck.

234
domain/sap10_calculator/docs/NEXT_AGENT_PROMPT_POST_S0380_124.md
Normal file
@ -0,0 +1,234 @@
# Next-agent prompt — post S0380.124

You are picking up on branch `feature/per-cert-mapper-validation` at
**HEAD `1e69bd39`**. The previous session closed cert 000565 truly
exact (Slices S0380.115..119), fixed the §5.11.4 NI-vs-explicit-0
roof bug + a basement-cert mapper gap (S0380.120-121), and tightened
several test files (S0380.122-124). Extended handover suite: **775
pass, 0 fail**.

You have two tasks from the user (in order):

1. **Close cert 0240's remaining residual.** The §5.11.4 fix closed
   most of the gap (PE +12.49 → +0.05, CO2 +0.70 → +0.06) but SAP
   residual −10 remains. Energy / CO2 match lodged at sub-0.1; cost
   is the driver.

2. **Audit large golden-corpus residuals to understand what
   fixtures we need to add.** The user has additional Elmhurst
   Summary + U985 worksheet PDFs for **the same property with
   multiple different heating systems**. Wait for them to share the
   files, then use the controlled-variable test pattern to localise
   heating-cascade gaps.

## Read these first

In order, before any tool call:

1. [`HANDOVER_POST_S0380_124.md`](HANDOVER_POST_S0380_124.md) — full
   state at HEAD `1e69bd39`, hypothesis ranking for cert 0240,
   golden-corpus residual table.
2. [`HANDOVER_POST_S0380_114.md`](HANDOVER_POST_S0380_114.md) — prior
   state at HEAD `cc70e559` (cert 000565 closure work).

## Load these memories before starting

```
project_cert_000565_recovery_state       # full per-slice history at HEAD 1e69bd39
project_sap10_ml_deprecation             # domain/sap10_ml/ is retiring
feedback_sap_10_2_only_never_10_3        # CRITICAL — never reference SAP 10.3
feedback_spec_citation_in_commits        # quote spec + page in commits
feedback_verify_handover_claims          # verify numeric claims against PDF
feedback_zero_error_strict               # pyright net-zero per touched file
feedback_commit_per_slice                # one slice = one commit
feedback_aaa_test_convention             # # Arrange / # Act / # Assert
feedback_e2e_validation_philosophy       # abs=1e-4 pins, no rel/xfail
feedback_abs_diff_over_pytest_approx     # use abs(x-y) <= tol
feedback_spec_floor_skepticism           # verify "spec-precision floor" claims
feedback_golden_residuals_near_zero      # golden pins should shrink toward 0
feedback_worksheet_not_api_reference     # worksheet PDF, not API EPC, is the target
feedback_one_e_minus_4_across_the_board  # 1e-4 is the bar for HP certs too
reference_unmapped_sap_code              # calculator strict-raise pattern
reference_unmapped_api_code              # mapper strict-raise pattern
```

## Verify baseline first

```bash
PYTHONPATH=/workspaces/model python -m pytest \
  backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
  backend/documents_parser/tests/test_elmhurst_extractor.py \
  backend/documents_parser/tests/test_elmhurst_end_to_end.py \
  domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
  domain/sap10_calculator/worksheet/tests/test_heat_transmission.py \
  domain/sap10_calculator/worksheet/tests/test_internal_gains.py \
  domain/sap10_calculator/worksheet/tests/test_solar_gains.py \
  domain/sap10_calculator/worksheet/tests/test_dimensions.py \
  domain/sap10_calculator/worksheet/tests/test_rating.py \
  domain/sap10_calculator/worksheet/tests/test_ventilation.py \
  domain/sap10_calculator/worksheet/tests/test_appendix_h_solar.py \
  domain/sap10_calculator/worksheet/tests/test_mev.py \
  domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py \
  domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
  domain/sap10_calculator/tests/test_pcdb_table_322_lookup.py \
  domain/sap10_calculator/tests/test_pcdb_table_329_lookup.py \
  --no-cov -q
```

Expected: **775 pass, 0 fail**.

## Task 1 details — cert 0240 (S0380.125 candidate)

Current pin in
`domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py`:

```python
_GoldenExpectation(
    cert_number="0240-0200-5706-2365-8010",
    actual_sap=73,
    expected_sap_resid=-10,
    expected_pe_resid_kwh_per_m2=+0.0542,
    expected_co2_resid_tonnes_per_yr=+0.0626,
    notes="...detached, TFA 118 [stale — actually 202], age J, oil boiler PCDB-listed + PV + RR on BP[0]..."
)
```

Note: the `notes:` field references "PV" but the cert's
`sap_energy_source.photovoltaic_supply` is `none_or_no_details` —
**no PV**. The note is stale. Update the note as you investigate.

**Cert 0240 shape (verified 2026-05-30):**

- property_type=0 (House), built_form=1 (Detached)
- TFA 202 m² (NOT 118 as the stale note says)
- `walls`: "Sandstone, as built, insulated (assumed)" — solid stone
- `roofs`: "Pitched, 400+ mm loft insulation" — well-insulated
- `floors`: "Solid, insulated (assumed)" — §5.11.4 fired
- `main_heating`: "Boiler and radiators, oil"
- `secondary_heating`: None
- `solar_water_heating`: N
- `mains_gas`: N (off-grid oil)
- `meter_type`: 3 (10-hour off-peak)
- SAP version 10.2

**Residual interpretation:**

- SAP −10 = lodged 73, cascade 63 = cascade fuel cost is HIGHER than
  lodged
- PE +0.05 ≈ 0 (energy demand matches)
- CO2 +0.06 ≈ 0 (emissions match)
- → Bug is in the cost cascade, not the heat-loss cascade

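**Back-solve the cost gap:**

`SAP = 100 − 13.95 × ECF` (linear branch, valid for ECF < 3.5), where
`ECF = deflator × cost / (TFA + 45)`. With TFA = 202 m² (so TFA + 45 = 247):

- Lodged SAP 73 → ECF 1.935 → cost £1138.6
- Cascade SAP 63 → ECF 2.652 → cost £1559.5
- Cascade over-counts by ~£420/yr

These £ figures correspond to an energy cost deflator of 0.42. Verify that
value against SAP 10.2 Table 12 before relying on the absolute costs.
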
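The back-solve can be reproduced in a few lines, which makes it easy to re-run with a different deflator or rating. This is a sketch under stated assumptions: the helper names are hypothetical, only the linear branch of the rating formula is inverted, and `DEFLATOR = 0.42` is the value implied by the £ figures quoted here, not one checked against the spec.

```python
def ecf_from_sap(sap: float) -> float:
    """Invert SAP = 100 - 13.95 * ECF (linear branch only)."""
    return (100.0 - sap) / 13.95


def cost_from_ecf(ecf: float, tfa_m2: float, deflator: float) -> float:
    """Invert ECF = deflator * cost / (TFA + 45)."""
    return ecf * (tfa_m2 + 45.0) / deflator


DEFLATOR = 0.42  # assumption: implied by the quoted costs, verify against Table 12
TFA_M2 = 202.0

lodged_cost = cost_from_ecf(ecf_from_sap(73), TFA_M2, DEFLATOR)
cascade_cost = cost_from_ecf(ecf_from_sap(63), TFA_M2, DEFLATOR)
gap = cascade_cost - lodged_cost

print(f"lodged £{lodged_cost:.1f}, cascade £{cascade_cost:.1f}, gap £{gap:.1f}")
```

Because lodged SAP is an integer, each back-solved cost carries the uncertainty of half a SAP point, so the result should land within a few pounds of the quoted figures rather than match them exactly.
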
**Hypothesis ranking (start at top):**

1. **Oil tariff routing**: Cascade may default to electricity 13.19
   p/kWh for the main-heating cost calc when the cert lodges
   `meter_type=3` + `main_fuel_type=4` (oil). The back-solved cost
   ratio is ~1.37× (£1559.5 / £1138.6). The electricity-to-oil price
   ratio is ~1.73× (13.19 / 7.64, using the oil price from the
   handover), so a mis-route would have to affect only part of the
   fuel use to produce this gap.
2. **HW fuel routing**: Same boiler does HW. Verify HW cost uses oil
   tariff, not electricity.
3. **Standing charge**: Oil has none in Table 32; if cascade adds gas
   or electricity standing charge, that's £120/yr extra.
4. **Off-peak split**: `meter_type=3` lodges a 10-hour off-peak meter.
   For oil heating this is just the electricity meter for lights /
   pumps. Cascade may be applying off-peak split to oil energy
   incorrectly.

**Approach:**

1. Probe `result.intermediate` for 0240:
   `main_heating_cost_gbp`, `hot_water_cost_gbp`, `pumps_fans_cost_gbp`,
   `lighting_cost_gbp`, `standing_charges_gbp`.
2. Compare each sub-cost against the API-lodged numbers (the cert
   carries `heating_cost_current`, `hot_water_cost_current`,
   `lighting_cost_current`).
3. Identify which sub-cost over-counts by ~£420.
4. Trace via `cert_to_inputs` → fuel-tariff resolution to find the
   wrong route.
5. Write AAA test → fix → re-pin.

## Task 2 details — golden corpus audit

After task 1, the user will share Elmhurst worksheet + Summary PDFs
for **the same property with multiple different heating systems**.

**Why this is valuable:** A controlled-variable test set. Same
envelope → fabric heat loss is identical across variants → any SAP /
PE / CO2 difference between variants is fully attributable to the
heating cascade. This pins the heating subsystem at PDF precision
rather than the API-residual precision the current golden corpus
provides.

**Where to put the new fixtures:**

- Summary PDF: `backend/documents_parser/tests/fixtures/Summary_<refno>.pdf`
- U985 worksheet PDF: `sap worksheets/<source-folder>/U985-0001-<refno>.pdf`
- Fixture module: `domain/sap10_calculator/worksheet/tests/_elmhurst_worksheet_<refno>.py`
  (mirror `_elmhurst_worksheet_000565.py` — mapper-driven `build_epc()`)
- Add to `_FIXTURE_PINS` + `_FIXTURE_MODULES` in
  `domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py`

**Per-cert workflow:**

1. Extract worksheet PDF text via `pdftotext -layout`.
2. Pin Block 1 (energy rating) line refs: `(255)`, `(257)`, `(258)`,
   `(272)`, `(98c)`, `(211)`, `(219)`, `(232)`, `(231)`.
3. Run `Sap10Calculator().calculate(epc)` and identify which pins fail.
4. Each failing pin → AAA test in `test_summary_pdf_mapper_chain.py`
   → cascade / mapper fix → commit with spec citation.

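Step 3 reduces to comparing two mappings of worksheet line ref to value. The sketch below is one way to do that, assuming the calculated values have already been collected into a dict keyed by line ref. `failing_pins` is a hypothetical helper, how the real result object exposes each line is not shown here, and the numbers are placeholders.

```python
def failing_pins(
    expected: dict[str, float], actual: dict[str, float], tol: float = 1e-4
) -> dict[str, float]:
    """Return {line_ref: actual - expected} for every pin whose drift exceeds tol."""
    drifts: dict[str, float] = {}
    for ref, want in expected.items():
        if ref not in actual:
            # Raise rather than skip, so a missing line is never a silent pass.
            raise KeyError(f"no calculated value for worksheet line {ref}")
        drift = actual[ref] - want
        if abs(drift) > tol:
            drifts[ref] = drift
    return drifts


# Placeholder values only, not taken from any worksheet.
expected = {"(255)": 100.0, "(258)": 70.0}
actual = {"(255)": 100.00004, "(258)": 70.5}

print(failing_pins(expected, actual))
```

Reporting the signed drift rather than a boolean shows the direction of each gap, which helps when deciding whether a failing pin is upstream or downstream of the slice being worked on.
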
**Top golden-corpus residuals to address (after task 1):**

| Cert | SAP / PE / CO2 residuals | Shape clue |
|---|---|---|
| 0390-2954-3640-2196-4175 | −6 / **−26.4** / **−2.55** | Off-grid oil (?) on a TFA 360 m² dwelling |
| 6035-7729-2309-0879-2296 | −6 / **+46.1** / **+1.05** | Mid-terrace age A gas combi, TFA 128 |
| 7536-3827-0600-0600-0276 | +1 / −7.08 / −0.19 | Gas combi (modest gap) |
| 2130-1033-4050-5007-8395 | +1 / −7.50 / −0.05 | Gas combi + PV |

The user's new fixtures may not match these certs directly, but the
"same property × heating variants" pattern they're providing will
isolate heating-cascade behaviour for any of these shapes.

## What NOT to do

- **Don't reference SAP 10.3** ([[feedback-sap-10-2-only-never-10-3]])
- **Don't widen pin tolerances** to make pins pass ([[feedback-zero-error-strict]])
- **Don't re-investigate closed work** — Slices .91..124 all settled
- **Don't add new helpers to `domain/sap10_ml/`** — on the deprecation path
- **Don't trust the cert 0240 `notes:` field at face value** — the
  "PV + TFA 118" is stale; verify against the JSON
- **Don't pin downstream-only metrics with tight thresholds** —
  S0380.103 pattern: pin the narrowest intermediate the slice changes

## Memory hygiene

After each slice:

1. Update `project_cert_000565_recovery_state` (consider renaming the
   memory if the current session pivots away from 000565). It tracks
   per-slice history.
2. Update `MEMORY.md` — keep the HEAD pointer current.

## User direction

The user's direction (from the closing session message):

> "Let's fix 0240. Then, I have some more test files (elmhurst summary
> reports + worksheet) to help improve. They're the same property
> with multiple different heating systems. I want to understand why
> we still have such large residuals in our golden fixtures from the
> API I can understand what test examples we need."

→ Task 1 first. Then prompt the user to share the worksheet files
when you're ready to start task 2.

Good luck.