mirror of
https://github.com/Hestia-Homes/Model.git
synced 2026-06-08 11:17:27 +00:00
Refactors Elmhurst `Renewables` PV detail from four scalar fields
(pv_peak_power_kw / pv_orientation / pv_elevation_deg / pv_overshading
— single-array shape) to `pv_arrays: List[ElmhurstPvArray]`, then
walks the §19.0 PV Panel block in 4-tuples so dwellings with multiple
PV arrays surface every array.
Forced by cert 0350-2968-2650-2796-5255 (Summary_000903.pdf), the
second ASHP cohort cert through the Summary path and first to lodge
multiple PV arrays — the dr87 worksheet pins 2 arrays at 1.50 kWp
each (one SE at 45°, one NW at 45°). Before this slice, the extractor's
hardcoded "break at len(values) == 4" capped output at one array
regardless of how many the PDF lodged.
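A minimal sketch of the 4-tuple walk, not the repository's actual code. The
field names come from this commit; the field types, the value order within
each tuple, the cleaned string inputs, the overshading label and the helper
name `walk_pv_values` are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ElmhurstPvArray:
    # Field names as listed in this commit; the types are assumed.
    kw: float
    orientation: str
    elevation_deg: float
    overshading: str


# Hypothetical subset: the commit names "batteries" and "export" and
# leaves the rest of the stop-token list unstated.
STOP_TOKENS = ("batteries", "export")


def walk_pv_values(values: List[str]) -> List[ElmhurstPvArray]:
    """Group the values after the PV anchor into one array per 4 values."""
    block: List[str] = []
    for value in values:
        # A stop token closes the PV block; nothing after it is PV detail.
        if value.strip().lower().startswith(STOP_TOKENS):
            break
        block.append(value)

    arrays: List[ElmhurstPvArray] = []
    # Step through complete 4-tuples only; a trailing partial tuple is dropped.
    for i in range(0, len(block) - len(block) % 4, 4):
        kw, orientation, elevation, overshading = block[i:i + 4]
        arrays.append(
            ElmhurstPvArray(float(kw), orientation, float(elevation), overshading)
        )
    return arrays


# Two arrays shaped like cert 0350's (1.50 kWp each, SE and NW at 45 degrees).
values = ["1.50", "SE", "45", "None", "1.50", "NW", "45", "None", "Batteries", "0"]
arrays = walk_pv_values(values)
print([(a.kw, a.orientation, a.elevation_deg) for a in arrays])
```

Where the old extractor stopped once four values were collected, this loop
keeps emitting arrays until the block closes, so the second array is no longer
lost.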
Three-layer end-to-end change:
1. `datatypes/epc/surveys/elmhurst_site_notes.py` — add
`ElmhurstPvArray` dataclass (kw, orientation, elevation_deg,
overshading); replace four `Renewables.pv_*` scalars with
`pv_arrays: List[ElmhurstPvArray] = field(default_factory=list)`.
2. `backend/documents_parser/elmhurst_extractor.py` — rename
`_extract_pv_array_detail` → `_extract_pv_arrays`; walk values
after the "Photovoltaic panel details" anchor in 4-tuples until a
stop token ("batteries"/"export"/etc.) or a §-header closes the
block. §-header regex tightened to `\d{1,2}\.\d\s+\w` so kWp
values like "1.50" don't trip the close (without the `\s+\w` the
regex matched both "20.0 Wind Turbine" AND "1.50").
3. `datatypes/epc/domain/mapper.py` — `_elmhurst_pv_arrays` iterates
the list and emits one `PhotovoltaicArray` per row; collapses
empty list → None so the cascade keeps its no-PV fallback.
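The §-header regex change in step 2 can be checked in isolation. This sketch
assumes the pattern is applied with `re.match` at the start of each extracted
value, which the commit does not state; the names `LOOSE`, `TIGHT` and
`closes_block` are hypothetical.

```python
import re

# Pattern before the fix: any "N.N" or "NN.N" prefix counts as a header.
LOOSE = re.compile(r"\d{1,2}\.\d")
# Pattern from this commit: the number must be followed by whitespace and
# a word character, as in a real section title.
TIGHT = re.compile(r"\d{1,2}\.\d\s+\w")


def closes_block(value: str, pattern: "re.Pattern[str]" = TIGHT) -> bool:
    """Return True when the value looks like a section header."""
    return pattern.match(value) is not None


for value in ("20.0 Wind Turbine", "1.50", "45"):
    print(value, closes_block(value, LOOSE), closes_block(value, TIGHT))
```

With the loose pattern both "20.0 Wind Turbine" and "1.50" match, so a kWp
value would close the block early; with the tight pattern only the header
matches.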
Forcing function: cert 0350 first-attempt Summary SAP closes from
Δ -4.5829 (Slice 8 baseline) to Δ **+0.0458** — within the ±0.07
ASHP-cohort spec-precision floor. PV export credit GBP moves from
158.91 (one array surfaced) to 265.99 (both arrays surfaced) — the
extra ~107 GBP of avoided cost lifts cert 0350's SAP by ~4.6 points.
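The rounded figures above follow from the quoted values. A quick arithmetic
check, using only numbers from this commit message (the variable names are
illustrative):

```python
# Figures quoted in this commit message.
baseline_delta = -4.5829    # Slice 8 baseline, SAP points vs worksheet
closed_delta = 0.0458       # after both PV arrays are surfaced
spec_floor = 0.07           # ASHP-cohort spec-precision floor

credit_one_array = 158.91   # PV export credit, GBP, one array surfaced
credit_both_arrays = 265.99 # PV export credit, GBP, both arrays surfaced

sap_lift = closed_delta - baseline_delta
credit_gain = credit_both_arrays - credit_one_array
within_floor = abs(closed_delta) <= spec_floor

print(f"SAP lift {sap_lift:.4f}, credit gain {credit_gain:.2f} GBP, "
      f"within floor: {within_floor}")
```

The lift works out at about 4.63 SAP points and the credit gain at about
107.08 GBP, consistent with the "~4.6" and "~107" quoted.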
This validates the structural-debt-amortizes hypothesis: cert 0350
needed only TWO new slices (S0380.8 inheritance + S0380.9 multi-PV)
beyond the cert 0380 closure work, vs cert 0380's 6 slices from
scratch. Subsequent cohort certs should converge similarly fast as
fixture-specific gaps are paid down.
Added two tests:
- `test_summary_0350_surfaces_two_pv_arrays`: unit test pinning
the multi-array contract on the mapper boundary.
- `test_summary_0350_full_chain_sap_within_spec_floor_of_worksheet`:
chain test pinning |Δ| < 0.07 (matches cert 0380's chain test).
Cert 0380 (single-array, 3 kWp) continues to pass its chain test +
all 6 unit-level pins — the refactor preserves single-array behaviour.
Pyright net-zero across all four edited files:
datatypes/epc/domain/mapper.py: 32 (baseline)
datatypes/epc/surveys/elmhurst_site_notes.py: 0
backend/documents_parser/elmhurst_extractor.py: 0
backend/documents_parser/tests/test_summary_pdf_mapper_chain.py: 0
Regression suite: 677 pass + 10 fail (handover baseline of 669 pass +
10 fail, plus 8 new GREEN unit and chain tests across Slices
S0380.2..S0380.9).
Fixture added:
`backend/documents_parser/tests/fixtures/Summary_000903.pdf`
(copied from
`sap worksheets/Additional data with api/0350-2968-2650-2796-5255/`).
Spec refs:
- SAP 10.2 Appendix M (PDF p.103) — multiple PV arrays sum to total
electricity generation per Equation M-1 (each array's surface flux
computed independently per Appendix U3.3).
- SAP 10.2 Appendix U3.3 (PDF p.124) — per-array surface flux keyed
on orientation + tilt + overshading.
- Cert 0350 worksheet `dr87-0001-000903.pdf` (29a Main 19.4575 W/K
+ Ext1 1.3025 W/K = 20.7600 ≡ Summary cascade walls_w_per_k; (39)
avg HTC 173.4202 ≡ Summary cascade; (64) HW 2084.66 ÷ (216) HW eff
1.7285 = 1206.04 ≡ Summary cascade hot_water_kwh_per_yr).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>