docs: handover + research brief + next-agent prompt for cert 000565 Appendix H

Session-end handover docs for the cert 000565 wacky-stress-test
investigation. Three documents covering:

- **HANDOVER_POST_S0380_73_APPENDIX_H_BLOCKED.md** — full state
  of the cohort closure work (S0380.70-.73) plus the Appendix H
  Solar HW investigation findings. Cumulative ASHP cluster
  compression −3.10 → −0.06 PE kWh/m² over 4 slices. Cert
  000565 HW pin blocked at +272 kWh/yr on a 1.81× formula
  over-count.

- **BRIEF_APPENDIX_H_EN_15316_RESEARCH.md** — self-contained
  brief for a research agent or human looking up BS EN 15316-4-3
  Method 2 to identify the missing clamp / useful-gain rule /
  validity envelope behind the over-count. Includes the cert
  000565 diagnostic (per-month ratio 1.5-1.7× summer, 3-4×
  shoulder), seven specific questions ranked by hypothesis
  likelihood, and the 36-data-point empirical-fit setup.

- **NEXT_AGENT_PROMPT_POST_S0380_73.md** — directive for the
  next agent. Awaits 3 user-generated solar-HW cert worksheets
  (A baseline / B high-Y / C low-Y) to empirically test whether
  the 1.81× ratio is systematic or cert-specific. Decision
  point: ship an empirical correction (if 36-point fit closes
  all 3 certs + cert 000565) or hold for the EN standard.

Also resolves the long-standing H3=4.0 / H4=0.01 default mystery:
sub-agent located the source in RdSAP 10 Specification §10.11
Table 29 row "Solar panel" page 58. RdSAP overrides the input
set; the calculator is still SAP 10.2 Appendix H. So the
defaults aren't the source of the over-count.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Khalim Conn-Kowlessar 2026-05-29 16:18:31 +00:00 committed by Jun-te Kim
parent d9cae3684b
commit a4580b208a
3 changed files with 761 additions and 0 deletions

@@ -0,0 +1,257 @@
# Research brief — SAP 10.2 Appendix H solar HW vs BS EN 15316-4-3:2017
## Goal
Localise the bug that causes our SAP 10.2 Appendix H orchestrator
([domain/sap10_calculator/worksheet/appendix_h_solar.py](../worksheet/appendix_h_solar.py))
to compute monthly solar hot-water heat delivered **1.81× higher than
the Elmhurst U985 worksheet** for cert 000565
(`sap worksheets/extended test case/U985-0001-000565.pdf`). The
discrepancy is the dominant remaining gap in cert 000565's HW pin
(+272 kWh/yr cascade over worksheet).
## What we already know
### SAP 10.2 Appendix H spec text
Located at
[domain/sap10_calculator/docs/specs/sap-10-2-full-specification-2025-03-14.pdf](specs/sap-10-2-full-specification-2025-03-14.pdf),
pages 74-78. The relevant equations are reproduced in this brief
under "What we implemented" below.
### S10TP-04 (BRE technical note)
[domain/sap10_calculator/docs/specs/sap10 technical papers/S10TP-04 - Change to Appendix H to include solar space heating - V1_3.pdf](specs/sap10%20technical%20papers/S10TP-04%20-%20Change%20to%20Appendix%20H%20to%20include%20solar%20space%20heating%20-%20V1_3.pdf)
confirms that **SAP 10.2 Appendix H implements Method 2 of BS EN
15316-4-3:2017** (M3-8-3, M8-8-3, M11-8-3 modules). It states:
> "The method itself is not reproduced in this technical note it
> is fully described in the Standard"
So the authoritative formula lives in EN 15316-4-3:2017 Method 2,
and the SAP spec text on p.76 is a (potentially abbreviated /
typo-prone) restatement.
### What we implemented
Per SAP 10.2 spec p.76 verbatim:
```
(H7)m = Appendix U §U3.3 tilted solar flux on collector aperture [W/m²]
(H9)m = (H1) × (H2) × (H7)m × (H8) [W]
(H10) = 5 + 0.5 × (H1) OR test-certificate value [W/K]
(H11) = (H3) + 40·(H4) + (H10)/(H1) [W/m²K]
(H14) = (H12) [separate] OR (H12) + 0.3·((H13)-(H12)) [combined] [L]
(H15) = 75 × (H1) [L]
(H16) = ((H15)/(H14))^0.25 [-]
(H17)m = (62)m − (63a)m [kWh/month]
(H18)m = 1 (HW-only) | 0 (SH-only) | (H17)/(H17+(98a)) (blended) [-]
(H20)m = 55 + 3.86·Tcold,m − 1.32·(96)m [°C]
(H21)m = (H20)m − (96)m [K]
(H22)m = [(H18)·(H1)·(H11)·(H5)·(H21)·(H16)·((41)·24)] / [1000·(H17)] [-]
clamped to [0, 18]
(H23)m = [(H18)·(H6)·(H5)·(H9)·((41)·24)] / [1000·(H17)] [-]
clamped to ≥ 0
(H24)m = [Ca·Y + Cb·X + Cc·Y² + Cd·X² + Ce·Y³ + Cf·X³] × (H17)m [kWh]
clamped to [0, (H17)m]
```
Where `X = (H22)m`, `Y = (H23)m`, `(41)m` is days-in-month per spec
p.136, `(96)m` is external temp (Appendix U region 0 for SAP
rating), `Tcold,m` is mains cold-water temp from Table J1.
Coefficients per spec Table H3 (p.78):
- Ca = 1.029
- Cb = −0.065
- Cc = −0.245
- Cd = 0.0018
- Ce = 0.0215
- Cf = 0
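As a rough sketch of how the (H22) to (H24) lines above combine, the following Python mirrors the formulas as transcribed in this brief. The function names are hypothetical and are not the orchestrator's API. The coefficient signs follow the f-chart liquid-system correlation (Cb and Cc negative), and the guard for non-positive (H17) is an assumption, not something the spec text states.

```python
# Table H3 coefficients; Cb and Cc are negative, as in the f-chart correlation.
CA, CB, CC, CD, CE, CF = 1.029, -0.065, -0.245, 0.0018, 0.0215, 0.0


def heat_loss_factor_x(h18, h1, h11, h5, h21, h16, days, h17):
    """(H22)m: heat-loss factor X, clamped to [0, 18]."""
    if h17 <= 0:  # assumption: no demand means no solar contribution
        return 0.0
    x = (h18 * h1 * h11 * h5 * h21 * h16 * days * 24) / (1000 * h17)
    return min(max(x, 0.0), 18.0)


def irradiation_factor_y(h18, h6, h5, h9, days, h17):
    """(H23)m: irradiation factor Y, clamped to >= 0."""
    if h17 <= 0:  # same assumption as above
        return 0.0
    y = (h18 * h6 * h5 * h9 * days * 24) / (1000 * h17)
    return max(y, 0.0)


def heat_delivered_h24(x, y, h17):
    """(H24)m: polynomial in X and Y times demand, clamped to [0, (H17)m]."""
    fraction = CA * y + CB * x + CC * y**2 + CD * x**2 + CE * y**3 + CF * x**3
    return min(max(fraction * h17, 0.0), h17)


# Illustrative values only (not worksheet data): a summer-like and a winter-like month.
summer = heat_delivered_h24(x=7.9, y=1.34, h17=100.0)
winter = heat_delivered_h24(x=4.0, y=0.1, h17=100.0)
```

With these illustrative inputs the winter-like month clamps to zero because the negative `CB·X` term outweighs the small `CA·Y` term, which is the mechanism behind the zero rows in the per-month table.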
### Concrete diagnostic — cert 000565 (UK average climate, region 0)
Inputs (all verified against worksheet):
| Var | Value | Notes |
|---|---|---|
| H1 | 3.0 | aperture m² |
| H2 | 0.8 | zero-loss efficiency |
| H3 | 4.0 | linear heat loss coefficient |
| H4 | 0.01 | second order heat loss coefficient |
| H5 | 0.9 | loop efficiency (default; no test cert) |
| H6 | 0.94 | incidence angle modifier (flat plate) |
| H8 | 0.8 | overshading factor (Modest) |
| H10 | 6.5 | overall heat loss (test-certificate value) |
| H11 | 6.5667 | matches worksheet |
| H12 | 53 L | dedicated solar storage |
| H13 | 160 L | total cylinder volume |
| H14 | 85.1 L | matches worksheet |
| H15 | 225 L | matches worksheet |
| H16 | 1.2752 | matches worksheet |
Collector: West, 30° pitch. Climate: UK average (region 0), because
Block 1 of the worksheet is the SAP rating.
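The derived rows in the table above can be re-checked by hand from the (H10) to (H16) formulas quoted earlier. The snippet below is a minimal sketch using only the table's input values; the variable names are illustrative. Note that 5 + 0.5 × (H1) also evaluates to 6.5 for a 3.0 m² aperture, so the (H10) value coincides with the default formula.

```python
# Inputs from the table above.
h1, h3, h4 = 3.0, 4.0, 0.01   # aperture m², a1, a2
h12, h13 = 53.0, 160.0        # dedicated solar volume, total cylinder volume (L)

h10 = 5 + 0.5 * h1                # overall heat loss, W/K
h11 = h3 + 40 * h4 + h10 / h1     # collector loop heat loss, W/m²K
h14 = h12 + 0.3 * (h13 - h12)     # effective solar volume, combined cylinder (L)
h15 = 75 * h1                     # reference volume (L)
h16 = (h15 / h14) ** 0.25         # storage tank correction factor

print(f"H10={h10:.1f} H11={h11:.4f} H14={h14:.1f} H15={h15:.0f} H16={h16:.4f}")
```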
**Cascade vs worksheet per-month (H24)m kWh:**
| Month | Cascade | Worksheet | Ratio |
|---|---:|---:|---:|
| Jan | 0 | 0 | |
| Feb | 0 | 0 | |
| Mar | 32.48 | 7.27 | **4.47×** |
| Apr | 71.96 | 34.93 | 2.06× |
| May | 106.53 | 66.05 | 1.61× |
| Jun | 95.82 | 60.01 | 1.60× |
| Jul | 90.52 | 58.25 | 1.55× |
| Aug | 72.54 | 42.25 | 1.72× |
| Sep | 39.93 | 12.58 | **3.17×** |
| Oct | 0 | 0 | |
| Nov | 0 | 0 | |
| Dec | 0 | 0 | |
| **Σ** | **509.78** | **281.35** | **1.81×** |
**Worksheet (H24)m values from
`sap worksheets/extended test case/U985-0001-000565.pdf` page 4.**
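For reference, the ratio column can be reproduced directly from the two value columns. This is a small sketch over the table's own numbers; nothing here comes from the orchestrator. The rounded worksheet rows sum to 281.34, within rounding of the quoted 281.35 total.

```python
# Non-zero months from the table above (kWh/month).
cascade = {"Mar": 32.48, "Apr": 71.96, "May": 106.53, "Jun": 95.82,
           "Jul": 90.52, "Aug": 72.54, "Sep": 39.93}
worksheet = {"Mar": 7.27, "Apr": 34.93, "May": 66.05, "Jun": 60.01,
             "Jul": 58.25, "Aug": 42.25, "Sep": 12.58}

# Per-month over-count ratio and the annual ratio.
ratios = {month: cascade[month] / worksheet[month] for month in cascade}
total_ratio = sum(cascade.values()) / sum(worksheet.values())

for month, ratio in ratios.items():
    print(f"{month}: {ratio:.2f}x")
print(f"Annual: {total_ratio:.2f}x")
```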
## Pattern clues
The per-month ratio is **not constant**:
- High-irradiation months (May-Aug): 1.55-1.72× over, which looks like a
roughly uniform 1.6-1.7× scaling.
- Edge months (Mar, Sep): 3.2-4.5× over, much worse than the middle
months. April (2.06×) sits between the two groups.
A uniform multiplicative bug would give the same ratio every month.
The non-uniform pattern suggests one of:
- A missing **threshold or clamp** that zeros out small contributions.
- An additional **subtractive term** that's irradiation-dependent
(so it's significant when irradiation is low, negligible when high).
- A different **polynomial form** that has a steeper rolloff at low Y
(Y is the irradiation-driven term).
Specifically, a term of the form `k·H17/X` or `k·H17/Y²` would grow
as X or Y falls and as H17 rises. For this cert both X and Y are lower
in the shoulder months than in summer, so such a term would bite
hardest in the shoulder-season months.
## Constants we've ruled out
The handover doc
[HANDOVER_POST_S0380_69.md](HANDOVER_POST_S0380_69.md) records that
prior agents tried these tweaks, none of which closed the gap:
- Removing H8 from H9 (top-level Eqn H1 commentary uses
H1·H2·H6·η0·ηloop·Im, no H8 — inconsistent with line-ref (H23))
- Keeping H8 in H9 (current)
- Adding H5/H6 to H9 instead of having them in X/Y separately
- Dividing by H8 inside X
- Using horizontal solar flux instead of tilted
Also verified by this brief author:
- Polynomial coefficients match Table H3 verbatim.
- (H7) tilted-flux conversion via Appendix U §U3.3 is correct.
- (96)m external temps for region 0 match worksheet exactly.
- (62)m HW demand monthly matches worksheet exactly.
- All five "input" helpers (H10, H11, H14, H15, H16) match worksheet
to 4 decimal places.
- (41)m × 24 = days × 24 = hours-in-month per spec p.136.
- Im (Table U3) is the standard 24-hour-averaged W/m² (not daylight
only).
## What we need from EN 15316-4-3:2017
The standard is **108 pages**. Method 2 is the relevant slice (M3-8-3,
M8-8-3, M11-8-3 modules per S10TP-04). The portion we need probably
fits in 4-8 pages.
### Specific questions
1. **What is the exact Method 2 form of Equation H1 (Qs polynomial)?**
Does it have the same six coefficient terms as SAP Table H3, or
are there additional terms? Solar-thermal performance regressions
frequently include **mixed interaction terms** that SAP's
pure-power-of-X, pure-power-of-Y formulation omits:
- `Cg · X·Y`
- `Ch · X·Y²`
- `Ci · X²·Y`
- `Cj · Y/X` (or `X/Y`)
- A tank-loss term proportional to `(H17) × time`
- An irradiation-dependent subtractive term
The seasonal pattern of our over-count (uniform in summer,
much worse in shoulder months) is consistent with one or more
missing mixed terms — pure-X / pure-Y additions would shift the
ratio uniformly across months.
2. **What is the exact Method 2 form of factor X (heat-loss factor)
and factor Y (irradiation factor)?** Does Method 2 multiply by the
same group of inputs as SAP (H22) / (H23)? In particular, does
Method 2 include a term that SAP's restatement on p.76 omits?
3. **Are there any clamps, thresholds, validity ranges, or cutoffs
in Method 2 that the SAP spec didn't reproduce?** Specifically:
- A lower threshold on Y (or on Im) below which Qs = 0?
- A threshold on the storage tank correction H16?
- A "useful heat" filter that excludes months where solar
contribution < some % of demand?
- A "minimum collector temperature rise" filter (collector outlet
must exceed inlet by some ΔT before solar is credited)?
- A "minimum solar fraction" gate?
4. **What are the validity / applicability ranges that Method 2
states for X and Y?** Regression-based correlation methods are
fit over a specific X / Y range and are explicitly invalid
outside that envelope. If the SAP spec doesn't reproduce the
range bounds, the cascade may be applying the polynomial in
shoulder months where Method 2 specifies a different rule
(zero, capped, interpolated). For cert 000565 our cascade
computes:
- X ranges from 3.98 (Jan) to 7.95 (Jul); always within the
SAP-stated [0, 18] clamp.
- Y ranges from 0.095 (Dec) to 1.34 (Jun); always > 0.
Does EN 15316 Method 2 state a Y_min below which the polynomial
doesn't apply? Does it state an X_max < 18?
5. **Is the "hot water reference temperature" formula (SAP H20:
`55 + 3.86·Tcold − 1.32·Text`) Method 2's formula or a SAP-specific
substitute?** S10TP-04 mentions SAP uses a 41°C mixed-water
temperature for HW which differs from EN 15316. Are there other
SAP substitutions in this section that the spec didn't flag?
6. **Does Method 2 use the same irradiation Im as a 24-hour-averaged
monthly W/m², or as a different averaging period (e.g. daylight
hours only)?** S10TP-04 says SAP retains Appendix U for irradiance
("UK specific conditions"), but it's unclear whether the
downstream consumer of Im in Method 2 expects the same averaging
convention.
7. **What is the relationship between (H21) "HW reference temperature
difference" and Method 2's ΔTm?** SAP p.76 defines
(H21)m = (H20)m − (96)m. Is this the same ΔT that EN 15316
Method 2 uses, or does Method 2 use a different reference (e.g.
collector outlet temperature, ambient + storage temperature
blend)?
### Format we'd ideally get back
A markdown table or short note that lists:
| SAP 10.2 line | SAP 10.2 spec formula | EN 15316-4-3 Method 2 formula | Difference (if any) |
|---|---|---|---|
| (H22) | … | … | … |
| (H23) | … | … | … |
| (H24) polynomial | … | … | … |
| … | … | … | … |
Plus any clamps / thresholds the SAP spec elided.
If the standard exposes intermediate values for a worked example
(e.g. a reference cert), the per-month X / Y / Q numbers for that
example would let us verify our orchestrator against EN-method ground
truth directly.
## Reference: where this matters
Fixing this would close **~272 kWh/yr** on cert 000565's HW pin (3rd
largest open residual on the wacky-stress-test cert). It would also
make the Appendix H orchestrator (currently landed but **not
integrated** into `water_heating_from_cert.solar_monthly_kwh` at
[domain/sap10_calculator/worksheet/water_heating.py:943](../worksheet/water_heating.py#L943))
safe to wire in — without the fix, integrating would *worsen* the
residual (cert 000565 would go from +272 to −131 kWh/yr).

@@ -0,0 +1,285 @@
# Handover — post S0380.70..73 + Appendix H investigation blocked on external standard
Branch: `feature/per-cert-mapper-validation`. **HEAD `c63d6740`**.
Predecessor: [`HANDOVER_POST_S0380_69.md`](HANDOVER_POST_S0380_69.md).
## Slices committed this session (S0380.70..73)
The Table 12d/12e header rule ("electricity → monthly cascade
regardless of tariff") was applied consistently across every
electric end-use:
| Slice | Commit | What |
|---|---|---|
| **S0380.70** | `fc68fb21` | Secondary heating CO2/PE routed through lodged `secondary_fuel_type` (mirror of the cost-side fix). Closed cert 2102 (House coal secondary, +20.36 → +0.20 PE) + cohort-1 cert 0300-2747 (mains-gas secondary, +8.28 → +0.93 PE). |
| **S0380.71** | `3d6cf5ea` | STANDARD-tariff electric main_heating PE/CO2 monthly cascade. New `_main_heating_primary_factor` helper mirroring `_main_heating_co2_factor_kg_per_kwh` from S0380.65. Dropped STANDARD-tariff annual-flat fallback in both helpers. |
| **S0380.72** | `b0c4c6e0` | Hot water PE/CO2 monthly cascade. New `_hot_water_co2_factor_kg_per_kwh` + `_hot_water_primary_factor` helpers. Replaced 4 hardcoded `_STANDARD_ELECTRICITY_FUEL_CODE` and annual-flat factor call sites. |
| **S0380.73** | `c63d6740` | Appendix M1 §3a D_PV cooking uses **L20 electricity** (138+28N) not **L18 heat gain** (35+7N watts × hours). 2.21× over-count fixed. Cohort cluster mean PE residual: 0.36 → 0.06 kWh/m² (cumulative S0380.71-.73: 48× compression). Surfaced 12 gas-combi PV certs at +0.5-1.6 PE (separate gas-fuel PE bug — re-pinned). |
**Test baseline at HEAD `c63d6740`:** 547 pass + 9 expected
`test_sap_result_pin[000565-*]` cascade-gap fails.
Pyright net-zero on every touched file.
## Cumulative ASHP cohort cluster closure (20 STANDARD-tariff certs)
| Stage | Mean PE residual | Worst (cert 9796) |
|---|---:|---:|
| Pre-S0380.71 | 3.10 kWh/m² | 4.18 |
| Post-S0380.71 (main heating) | 0.66 | 1.36 |
| Post-S0380.72 (HW) | 0.36 | 1.08 |
| Post-S0380.73 (cooking) | **0.06** | **0.53** |
Compression: 48× on the mean, 8× on the worst cert. All 20 cluster
certs now within ±0.53 kWh/m² of lodged values. Residuals scattered
around zero (was overwhelmingly negative).
## Open thread #1 — Cert 000565 Appendix H Solar HW (BLOCKED on EN 15316-4-3:2017)
Cert 000565 has 9 expected `test_sap_result_pin[000565-*]` failing
pins. The two biggest energy residuals are blocked on external data:
| Pin | Δ | Status |
|---|---:|---|
| sap_score (int) | **0** ✓ EXACT | unchanged |
| sap_score_continuous | +0.6334 | sub-spec |
| ecf | −0.0643 | sub-spec |
| total_fuel_cost | −56.08 | sub-spec |
| co2 | −19.77 | sub-spec |
| **space_heating** | **+266.11** | **BLOCKED — RR fold-in needs RdSAP §3.10 detailed-RR geometry** |
| main_heating_fuel | +156.53 | follows space_heating |
| **hot_water** | **+271.84** | **BLOCKED — Appendix H magnitude (see below)** |
| lighting | +2.19 | sub-spec |
| pumps_fans | +2.48 | blocked — PCDB MEV record not in repo |
### Appendix H deep dive (NEW this session)
Cert 000565 has solar HW lodged. Block 1 SAP rating expects
H24=281.35 kWh/yr; our orchestrator gives **509.78 kWh/yr → 1.81×
over-count**.
**Verified by this session's investigation:**
- All inputs (H1-H8, H10-H16) match worksheet to 4 decimal places.
- All (H17)-(H23) formulas implement SAP 10.2 spec p.76 verbatim.
- Polynomial coefficients (Ca-Cf) match spec Table H3 verbatim.
- (H7) tilted-flux conversion via Appendix U §U3.3 is correct.
- (96)m external temps for region 0 match worksheet exactly.
- (62)m HW demand monthly matches worksheet exactly.
**Per-month pattern (the strong clue):**
| Month | Cascade | Worksheet | Ratio |
|---|---:|---:|---:|
| Mar | 32.48 | 7.27 | **4.47×** |
| Apr | 71.96 | 34.93 | 2.06× |
| May | 106.53 | 66.05 | 1.61× |
| Jun | 95.82 | 60.01 | 1.60× |
| Jul | 90.52 | 58.25 | 1.55× |
| Aug | 72.54 | 42.25 | 1.72× |
| Sep | 39.93 | 12.58 | **3.17×** |
Non-uniform ratio (1.5-1.7× in summer, 3-4× in shoulder months)
suggests a **missing clamp / validity envelope / useful-gain
suppression** rather than a polynomial-coefficient error.
**External research findings (ChatGPT-mediated, this session):**
- A publicly visible draft of prEN 15316-4-3 shows Table B.1
coefficients matching SAP Table H3 exactly (Ca=1.029, Cb=−0.065,
…). So the polynomial isn't the bug.
- The 6-term polynomial structure (no XY/X²Y/XY² interaction terms)
appears canonical in the f-chart method literature.
- ChatGPT's verdict: the bug is likely in a Method 2 applicability
range, useful-gain suppression rule, or load-normalisation
definition that SAP didn't reproduce.
**Research brief documenting the diagnostic:**
[`BRIEF_APPENDIX_H_EN_15316_RESEARCH.md`](BRIEF_APPENDIX_H_EN_15316_RESEARCH.md).
This is the document to hand to a research agent / human if BS EN
15316-4-3:2017 access is available.
**Current investigation path:** the user is generating 3 simple
solar-HW cert worksheets (A baseline, B high-Y, C low-Y) to
empirically test whether the 1.81× ratio is systematic across all
solar HW shapes or specific to cert 000565. Across 3 certs:
~36 month-data-points should let us empirically fit any missing
correction term. See "Continuation instructions" below.
## Open thread #2 — Elmhurst RdSAP solar HW collector defaults (RESOLVED)
Cert 000565 worksheet uses H3=4.0, H4=0.01 for "Solar collector
details known: No". These don't match SAP 10.2 Table H1 (flat plate
a1=3.5, a2=0).
**Source identified by sub-agent this session: RdSAP 10
Specification §10.11 Table 29 "Heating and hot water parameters",
row "Solar panel", page 58.** Verbatim:
> "If solar panel present, the parameters for the calculation not
> provided in the RdSAP data set are:
> - panel aperture area 3 m²
> - **flat panel, η₀ = 0.80, a₁ = 4.0, a₂ = 0.01**
> - facing South, pitch 30°, modest overshading
> - …
> - pump for solar-heated water is electric (75 kWh/year)
> - showers are both electric and non-electric"
So RdSAP overrides the **input set** (a1, a2) but SAP 10.2's
Appendix H is still the calculator. Our orchestrator uses the
right Table 29 inputs (matching the worksheet), so this is **NOT**
the source of the 1.81× over-count. The over-count is in the
Appendix H formula itself.
This resolves a long-standing default-source mystery but doesn't
help with the H24 over-count.
**Important:** changing H3/H4 to SAP Table H1 spec defaults makes
the H24 over-count *worse* on cert 000565, not better.
## Open thread #3 — 12 gas-combi PV certs at +0.5-1.6 PE
S0380.73 cooking fix surfaced 12 gas-combi PV certs at residuals
+0.5 to +1.6 PE (cohort-1 cert 2130 + 11 cohort-2). Pre-S0380.73 a
compensating bug (the cooking over-count) masked this. Now visible
but **no worksheets available** for these certs — same "unanchored
chase" situation as the 5 SAP-residual certs. Re-pinned at current
residuals; investigation deferred until worksheets land.
## Open thread #4 — 5 SAP-integer-residual certs
Total 14 |Δsap| points outstanding across 5 API-only certs (notes
in `test_golden_fixtures.py`):
| Cert | Δsap | Shape |
|---|---|---|
| 0240 | 14 | Oil boiler + PV + RR |
| 0390 | 7 | Oil boiler, 360m², age F masonry |
| 6035 | 6 | Gas combi age A (pre-1900) |
| 7536 | +1 | Multi-age extensions (D/L/F) |
| 2130 | +1 | Shifted from 0 by S0380.73 cooking fix |
All API-only (no worksheets). User has agreed not to chase these
without worksheet ground truth (per session discussion).
## How to run the baseline
```bash
PYTHONPATH=/workspaces/model python -m pytest \
backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
backend/documents_parser/tests/test_elmhurst_extractor.py \
backend/documents_parser/tests/test_elmhurst_end_to_end.py \
domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
domain/sap10_calculator/worksheet/tests/test_appendix_h_solar.py \
domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py \
domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
--no-cov -q
```
Expected: **547 pass + 9 expected `test_sap_result_pin[000565-*]`
fails**.
## Continuation instructions for next agent
The user is generating 3 solar-HW cert worksheets (A baseline, B
high-Y, C low-Y) per the spec at end of this session. They'll land
in `sap worksheets/Solar HW tests/` or similar. Each cert directory
contains:
- A Summary_NNNNNN.pdf
- A P960-0001-NNNNNN.pdf (worksheet equivalent of dr87/U985)
### When the certs land
1. **Run the orchestrator** for each cert with the worksheet's H1-H8
inputs:
```python
from domain.sap10_calculator.worksheet.appendix_h_solar import (
solar_water_heating_input_monthly_kwh,
)
from domain.sap10_calculator.worksheet.water_heating import (
TABLE_J1_TCOLD_FROM_MAINS_C,
)
from domain.sap10_calculator.worksheet.solar_gains import Orientation
from domain.sap10_calculator.climate.appendix_u import external_temperature_c
# H1-H8 from worksheet's Appendix H section (page 4 in U985 format)
# (62)m monthly HW demand from worksheet
# te from external_temperature_c(region, m) for region 0 (Block 1
# SAP rating)
result = solar_water_heating_input_monthly_kwh(...)
```
2. **Extract worksheet (H24)m monthly values** from the cert's P960
PDF page 4 (or wherever Block 1 sits) using:
```python
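from pypdf import PdfReader

# path_to_p960_pdf: path to the cert's P960 worksheet PDF
reader = PdfReader(path_to_p960_pdf)
# Page 4 is index 3; adjust if Block 1 sits on a different page
page_text = reader.pages[3].extract_text()
# Look for (63c) "Solar input" row in the §4 HW section
# Or (H24)m line directly in the Appendix H section
solar_lines = [ln for ln in page_text.splitlines() if "(63c)" in ln or "(H24)" in ln]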
```
3. **Build a 36-point dataset** (3 certs × 12 months) of (cascade
H24, worksheet H24, X_cascade, Y_cascade, H17_cascade) and check:
- Does the per-month ratio show the same shape across all 3?
(Summer ~1.7×, shoulder 3-4×) → confirms systematic bug.
- Or does the ratio vary by cert? → suggests cert-specific input
differences.
4. **Empirical fit attempt:** if the ratio pattern is systematic,
try fitting:
```
Qs_corrected = Qs_cascade × g(X, Y, H17)
```
for various g shapes (multiplicative, additive,
threshold-dependent). The fitted correction term + 3-cert
validation gives us a temporary closure even without the EN
standard.
5. **Decision point:** if empirical fit closes all 3 certs + cert
000565 to <50 kWh/yr residual, ship as a spec-citation-pending
slice (note in the commit + memory that it's empirical pending
EN 15316-4-3 verification). Otherwise wait for the standard.
6. **Integration:** if cert 000565 HW gap closes via this work,
wire the orchestrator into
[`domain/sap10_calculator/worksheet/water_heating.py:943`](../worksheet/water_heating.py#L943)
(currently hardcoded `solar_monthly_kwh=zero12`). This is the
step that lets cert 000565's HW pin go from +272 → ~0.
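The 36-point dataset in step 3 could be held in a structure along these lines. This is a sketch only: `MonthPoint` and `ratio_spread_by_month` are hypothetical names, and the X / Y / H17 fields are filled with NaN placeholders here because this doc does not record cert 000565's per-month values for them.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MonthPoint:
    cert: str
    month: int             # 1..12
    cascade_h24: float     # orchestrator (H24)m, kWh
    worksheet_h24: float   # worksheet (H24)m, kWh
    x: float               # cascade (H22)m
    y: float               # cascade (H23)m
    h17: float             # cascade (H17)m, kWh

    @property
    def ratio(self) -> Optional[float]:
        """Cascade / worksheet, or None when the worksheet value is zero."""
        if self.worksheet_h24 <= 0:
            return None
        return self.cascade_h24 / self.worksheet_h24


def ratio_spread_by_month(points):
    """Return {month: (min ratio, max ratio)} across certs, skipping zero months."""
    by_month = defaultdict(list)
    for p in points:
        if p.ratio is not None:
            by_month[p.month].append(p.ratio)
    return {m: (min(r), max(r)) for m, r in sorted(by_month.items())}


nan = float("nan")  # placeholder: X / Y / H17 not recorded in this doc
points = [
    MonthPoint("000565", 1, 0.0, 0.0, nan, nan, nan),
    MonthPoint("000565", 3, 32.48, 7.27, nan, nan, nan),
    MonthPoint("000565", 6, 95.82, 60.01, nan, nan, nan),
]
spread = ratio_spread_by_month(points)
```

A narrow (min, max) band per month across the three new certs would point to a systematic formula gap, while a wide band would point to cert-specific inputs.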
### What NOT to do
- Don't redo the work the prior agent already verified:
- Don't re-test H1-H8 input matching (verified)
- Don't re-test polynomial coefficients (matches Table H3)
- Don't re-test (H7) flux conversion (verified)
- Don't re-test SAP spec formula transcription (verbatim)
- Don't chase the 12 gas-combi PV certs or the 5 SAP-residual certs
without worksheets — user has explicitly de-prioritised those.
- Don't integrate the orchestrator into the cascade with the
current 1.81× over-estimate — that would WORSEN cert 000565's HW
residual from +272 → −131 per the handover prediction.
## Key files touched this session
| File | Touched in |
|---|---|
| `domain/sap10_calculator/rdsap/cert_to_inputs.py` | All 4 slices — new helpers + 4 rewires |
| `domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py` | 5 new tests |
| `domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py` | 33 cluster pin updates across S0380.71-.73 |
| `domain/sap10_calculator/docs/BRIEF_APPENDIX_H_EN_15316_RESEARCH.md` | NEW — research brief |
| `domain/sap10_calculator/docs/HANDOVER_POST_S0380_73_APPENDIX_H_BLOCKED.md` | NEW — this doc |
## Spec source quick-reference
- **SAP 10.2 full specification**: `domain/sap10_calculator/docs/specs/sap-10-2-full-specification-2025-03-14.pdf`
- Table 12d (monthly electric CO2 factors): p.195
- Table 12e (monthly electric PE factors): p.196
- Appendix H (solar thermal): p.74-78, Table H1 p.78, Table H3 p.78
- Appendix L (cooking electricity L20): p.91
- Appendix M1 §3a (D_PV definition): p.93-94
- **S10TP-04** (BRE Appendix H change note): `domain/sap10_calculator/docs/specs/sap10 technical papers/S10TP-04 - Change to Appendix H to include solar space heating - V1_3.pdf`
- **SAP 10.3** at `domain/sap10_calculator/docs/specs/sap-10-3-full-specification-2026-01-13.pdf`: **DO NOT reference** (project tracks 10.2 only per [[feedback-sap-10-2-only-never-10-3]]).

@@ -0,0 +1,219 @@
# Next-agent prompt — post S0380.73
Branch: `feature/per-cert-mapper-validation`.
HEAD: `c63d6740`.
Read these in order before any tool call:
1. [`HANDOVER_POST_S0380_73_APPENDIX_H_BLOCKED.md`](HANDOVER_POST_S0380_73_APPENDIX_H_BLOCKED.md) (full state)
2. [`BRIEF_APPENDIX_H_EN_15316_RESEARCH.md`](BRIEF_APPENDIX_H_EN_15316_RESEARCH.md) (the unblock-this brief)
Also load these memories before starting:
- `project_cert_000565_recovery_state` — cert 000565 slice history
- `project_golden_coverage_state` — cohort state + S0380.70-.73 closures
- `feedback_sap_10_2_only_never_10_3` **CRITICAL** — never reference SAP 10.3 spec
- `feedback_verify_handover_claims` — verify spec citations before implementing
- `feedback_spec_floor_skepticism` — "spec-precision floor" framing usually hides a real spec bug
- `feedback_zero_error_strict` — pyright net-zero per touched file
- `feedback_commit_per_slice` — one slice = one commit
- `feedback_spec_citation_in_commits` — quote spec text + page in commit messages
- `feedback_aaa_test_convention` — every new test uses `# Arrange / # Act / # Assert`
- `feedback_e2e_validation_philosophy` — component pins at <1e-3; SAP integer delta=0; no adaptive ceilings
## State summary
**Cumulative session result:** test baseline went from 317 pass + 9
expected fails → **547 pass + 9 expected fails**. The 9 expected
fails are all `test_sap_result_pin[000565-*]` and reflect cert
000565's residual gaps. Four cohort closure slices (S0380.70-.73)
applied the SAP 10.2 Table 12d/12e header rule consistently to
secondary heating, main heating, hot water, and the Appendix M1
§3a D_PV cooking electricity formula.
The ASHP cohort cluster of 20 STANDARD-tariff certs compressed
from mean PE residual −3.10 → −0.06 kWh/m² across these 4 slices.
That cluster is now closed.
The remaining outstanding work is on cert 000565 (the wacky stress
test). Two of cert 000565's biggest residuals (HW +272, space
heating +266) are blocked on external data:
- **Appendix H Solar HW magnitude** (HW +272 kWh/yr cascade over):
orchestrator at
[`domain/sap10_calculator/worksheet/appendix_h_solar.py`](../worksheet/appendix_h_solar.py)
produces 509.78 vs worksheet 281.35 = 1.81× over. All inputs
verified, all SAP 10.2 spec formulas implemented verbatim. The
bug is in a missing Method 2 clamp / validity envelope / useful-
gain suppression that SAP didn't reproduce from BS EN 15316-4-3.
- **RR fold-in** (space_heating +266 kWh/yr): blocked on RdSAP §3.10
detailed-RR geometry / area formula not in repo.
## Recommended next slice (the Appendix H empirical approach)
The user is generating 3 simple solar-HW cert worksheets to
empirically test the 1.81× over-count. These will land in
`sap worksheets/Solar HW tests/` (or similar) as:
- `A-baseline-south-modest/` (South, 30°, Modest overshading)
- `B-highY-south-none/` (South, 30°, None/very-little overshading)
- `C-lowY-north-significant/` (North or East, 60°, Significant
overshading)
Each cert directory contains a `Summary_NNNNNN.pdf` and a
`P960-0001-NNNNNN.pdf` (Elmhurst worksheet). All 3 certs share the
same base envelope (28 Distillery Wharf, semi-detached, TFA 90 m²,
age G, masonry cavity walls). Each cert has "Solar collector
details known: No" so they all use RdSAP 10 Table 29 defaults
(H3=4.0, H4=0.01 — verified in this session).
### When the certs land
1. **Run the orchestrator** for each cert. The probe pattern:
```python
from domain.sap10_calculator.worksheet.appendix_h_solar import (
solar_water_heating_input_monthly_kwh,
)
from domain.sap10_calculator.worksheet.water_heating import (
TABLE_J1_TCOLD_FROM_MAINS_C,
)
from domain.sap10_calculator.worksheet.solar_gains import Orientation
from domain.sap10_calculator.climate.appendix_u import external_temperature_c
# H1-H8 from worksheet's Appendix H section
# (62)m monthly HW demand from worksheet (look for line ref (62))
# external temps for region 0 (UK average), used for the Block 1 SAP rating
te = tuple(external_temperature_c(0, m) for m in range(1, 13))
result = solar_water_heating_input_monthly_kwh(
collector_orientation=Orientation.S, # or N/E per cert
collector_pitch_deg=30.0, # or 60 per cert
region=0,
aperture_area_m2=3.0, # RdSAP Table 29 default
zero_loss_efficiency=0.8,
linear_heat_loss_a1=4.0, # RdSAP Table 29 default
second_order_heat_loss_a2=0.01, # RdSAP Table 29 default
loop_efficiency=0.9,
incidence_angle_modifier=0.94,
overshading_factor=0.8, # or 1.0 / 0.65 per cert
overall_heat_loss_coefficient_from_test=6.5,
dedicated_solar_storage_volume_l=...,
combined_cylinder_total_volume_l=...,
hot_water_demand_monthly_kwh=..., # (62)m: 12 monthly values from worksheet
wwhrs_monthly_kwh=(0.0,) * 12,
cold_water_temperatures_monthly_c=TABLE_J1_TCOLD_FROM_MAINS_C,
external_temperatures_monthly_c=te,
solar_hot_water_only=True,
)
```
2. **Extract worksheet (H24)m monthly** from each cert's P960 PDF.
For cert 000565 the (H24)m values are on page 4 (Block 1 SAP
rating). Look for the line `(63c)` solar input row (12 monthly
values, negative sign convention); its absolute value equals
(H24)m. Alternatively, find the `Heat delivered to hot water` row,
which ends with the `(H24)` label.
3. **Build a 36-point dataset** of `(cascade_H24, worksheet_H24,
X_cascade, Y_cascade, H17_cascade)` across the 3 certs.
4. **Diagnostic analysis:**
- Does the per-month ratio show the same shape across all 3?
(Summer 1.5-1.7×, shoulder 3-4×) → confirms the bug is in the
formula, not in cert 000565's specific inputs.
- Does the ratio vary with Y? Plot ratio vs Y to look for a
threshold (sharp transition).
- Does the ratio vary with X? Plot ratio vs X to look for an
envelope.
5. **Empirical fit attempt** (if ratio pattern is systematic):
try candidate corrections in this order:
- **Threshold on Y:** if Y < Y_min, set Qs = 0. Fit Y_min from
data (shoulder months that worksheet zeroes give the
threshold).
- **Useful gain factor:** Qs_corrected = Qs × max(0, 1 − k/Y).
Fit k.
- **X-validity clamp:** if X > X_max, apply a different rule.
- **Tank loss subtraction:** Qs_corrected = Qs − k·H17. Fit k.
6. **Decision point:**
- **If empirical fit closes all 3 certs + cert 000565 to <50
kWh/yr residual:** ship as a spec-citation-pending slice with
`# TODO(EN-15316-verification)` comments + commit message
noting empirical-pending. Update
[`BRIEF_APPENDIX_H_EN_15316_RESEARCH.md`](BRIEF_APPENDIX_H_EN_15316_RESEARCH.md)
with the fitted-correction findings so a future research
trip can verify.
- **Otherwise:** hold and wait for BS EN 15316-4-3:2017 access.
Document the failed-fit attempts in the brief.
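The candidate corrections in step 5 can be sketched as plain functions plus a one-parameter grid search. Everything below is illustrative: the function names are hypothetical, the zero floor on the tank-loss form is an assumption, the X-validity clamp is omitted because its replacement rule is not yet known, and the data is synthetic (generated from a known k) purely to show that the fit recovers the parameter.

```python
def threshold_on_y(qs, y, y_min):
    """Zero the solar contribution below a minimum irradiation factor."""
    return qs if y >= y_min else 0.0


def useful_gain_factor(qs, y, k):
    """Qs × max(0, 1 − k/Y): suppresses low-Y months more strongly."""
    if y <= 0:
        return 0.0
    return qs * max(0.0, 1.0 - k / y)


def tank_loss_subtraction(qs, h17, k):
    """Qs − k·H17, floored at zero (the floor is an assumption)."""
    return max(0.0, qs - k * h17)


def fit_parameter(points, correction, grid):
    """Pick the grid value minimising squared error against the worksheet."""
    def sse(param):
        return sum((correction(p, param) - p["worksheet"]) ** 2 for p in points)
    return min(grid, key=sse)


# Synthetic data only: "worksheet" values are generated with a known k.
TRUE_K = 0.2
synthetic = [
    {"qs": qs, "y": y, "h17": 150.0, "worksheet": useful_gain_factor(qs, y, TRUE_K)}
    for qs, y in [(30.0, 0.3), (70.0, 0.7), (100.0, 1.2)]
]
grid = [i / 100 for i in range(0, 101)]
fitted_k = fit_parameter(
    synthetic,
    lambda p, k: useful_gain_factor(p["qs"], p["y"], k),
    grid,
)
```

On real data the same `fit_parameter` call would be repeated per candidate form, and the residual sums compared across forms before settling on one.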
### Cert 000565 HW integration (only after the formula bug closes)
Wire the orchestrator into
[`domain/sap10_calculator/worksheet/water_heating.py:943`](../worksheet/water_heating.py#L943)
which currently hardcodes `solar_monthly_kwh=zero12`. This is the
step that lets cert 000565's HW pin go from +272 → ~0.
**DO NOT integrate the orchestrator at the current 1.81× over-
estimate.** The handover predicts this would *worsen* cert 000565's
HW residual from +272 → −131 (overshoot in the negative direction).
## What NOT to do
- Don't re-verify the work the prior agent already verified:
- H1-H8 input matching to worksheet (verified to 4 d.p.)
- Polynomial coefficients vs Table H3 (verbatim match)
- (H7) flux conversion vs Appendix U §U3.3 (verified)
- SAP spec formula transcription (verbatim)
- The H3=4.0, H4=0.01 default source (RdSAP 10 §10.11 Table 29
p.58 — found this session)
- ChatGPT-mediated research on EN 15316-4-3 already established
the polynomial coefficients match the prEN draft Table B.1
(so the polynomial isn't the bug)
- Don't chase the 12 gas-combi PV certs or the 5 SAP-residual certs
without worksheets — user has explicitly de-prioritised those.
- Don't reference SAP 10.3 ([[feedback-sap-10-2-only-never-10-3]]).
## Standard workflow per slice
1. Read SAP 10.2 spec page for the change — quote it in commit
2. Probe current cascade output, identify exact spec-vs-cascade gap
3. Write failing test FIRST (AAA structure)
4. Implement helper / change
5. Verify test passes
6. Run full handover suite (command in handover doc)
7. Check pyright on touched files — net-zero from baseline
8. Commit with spec citation
9. Update relevant memory if state changed
## How to run the baseline
```bash
PYTHONPATH=/workspaces/model python -m pytest \
backend/documents_parser/tests/test_summary_pdf_mapper_chain.py \
backend/documents_parser/tests/test_elmhurst_extractor.py \
backend/documents_parser/tests/test_elmhurst_end_to_end.py \
domain/sap10_calculator/worksheet/tests/test_e2e_elmhurst_sap_score.py \
domain/sap10_calculator/worksheet/tests/test_appendix_h_solar.py \
domain/sap10_calculator/rdsap/tests/test_cert_to_inputs.py \
domain/sap10_calculator/rdsap/tests/test_golden_fixtures.py \
--no-cov -q
```
Expected: **547 pass + 9 expected `test_sap_result_pin[000565-*]`
fails** (the 9 cert 000565 cascade-gap pins).
## Memory hygiene
After the next slice, update:
- `project_cert_000565_recovery_state` — add Appendix H magnitude
outcome (empirical fit landed, or wait-for-EN documented).
- `project_golden_coverage_state` — HEAD update.
Good luck.