mirror of
https://github.com/Hestia-Homes/Model.git
synced 2026-06-08 11:17:27 +00:00
The §11 Windows table in the Summary PDF doesn't lay out identically across the cohort. Three new quirks added to the layout-style parser so the remaining 5 certs can be debugged with windows actually extracted:

1. `Wood 0.70` combined frame_type+frame_factor line: previously the parser expected them on separate lines (data+1 / data+2) and rejected the window when the joined form appeared.
2. Trailing glazing-type on the data line: `1.22 1.76 2.15 Double pre 2002` is the joined-cell variant in 000516; the W/H/Area anchor now captures the trailing phrase as an optional 4th group and feeds it through as `inline_glazing_type`, bypassing the separate-line glazing-prefix scan.
3. Cross-window gap with no glazing marker: `_partition_after_manuf` now falls back to "second orientation token in gap" when no glazing-type-prefix word appears. Covers the 000516 layout where each window has prefix+suffix orient tokens (no inline orient) and the glazing-type is joined-to-data.

The 5 remaining Summary PDFs are copied into `backend/documents_parser/tests/fixtures/` ready for per-cert mapper work. Mirror pin tests deferred: each cert still has its own diff to close (handover in NEXT_AGENT_PROMPT.md documents the per-cert state, e.g. 000477 needs secondary-heating extraction, 000516 needs roof-window separation).

Current cohort SAP deltas vs the U985 worksheet PDFs (target 1e-4):

- 000474: 0.0000 ✓
- 000477: +6.3655 (secondary heating + lighting)
- 000480: +8.2695 (diagnosis pending)
- 000487: +8.1433 (extractor still drops windows)
- 000490: +5.6551 (diagnosis pending)
- 000516: +5.9812 (roof-window separation)

Wider regression stays green (754 pass). Pyright net-zero on touched files.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

| | | |
|---|---|---|
| .. | ||
| HANDOVER_NEXT.md | ||
| NEXT_AGENT_PROMPT.md | ||
| pcdb10.dat | ||
| pcdb_table_105_gas_oil_boilers.jsonl | ||
| pcdb_table_122_solid_fuel_boilers.jsonl | ||
| pcdb_table_143_micro_cogen.jsonl | ||
| pcdb_table_313_flue_gas_heat_recovery.jsonl | ||
| pcdb_table_353_waste_water_heat_recovery.jsonl | ||
| pcdb_table_362_heat_pumps.jsonl | ||
| pcdb_table_391_high_heat_retention_storage_heaters.jsonl | ||
| pcdb_table_506_heat_interface_units.jsonl | ||
| rdsap-10-specification-2025-06-10.pdf | ||
| sap-10-2-full-specification-2025-03-14.pdf | ||
| sap-10-3-full-specification-2026-01-13.pdf | ||
| SAP_CALCULATOR.md | ||
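
Quirks 1 and 2 in the commit message above are both line-shape changes, so a small regex sketch may make them easier to picture. This is not the repository's parser: the patterns and the names `DATA_LINE`, `FRAME_LINE`, `DataLine`, `parse_data_line`, and `parse_frame_line` are hypothetical, and only the sample strings and the `inline_glazing_type` field name come from the commit message.

```python
import re
from typing import NamedTuple, Optional, Tuple

# Hypothetical W/H/Area anchor: three decimals, then an optional trailing
# phrase captured as a 4th group (the joined-cell glazing type).
DATA_LINE = re.compile(
    r"^\s*(\d+\.\d+)\s+(\d+\.\d+)\s+(\d+\.\d+)(?:\s+(\S.*?))?\s*$"
)

# Hypothetical combined frame line: a frame-type word followed by its factor.
FRAME_LINE = re.compile(r"^\s*([A-Za-z][A-Za-z\- ]*?)\s+(\d\.\d+)\s*$")


class DataLine(NamedTuple):
    width: float
    height: float
    area: float
    inline_glazing_type: Optional[str]  # None when the phrase is on its own line


def parse_data_line(line: str) -> Optional[DataLine]:
    """Return the W/H/Area values plus any trailing glazing phrase."""
    m = DATA_LINE.match(line)
    if m is None:
        return None
    w, h, a, glazing = m.groups()
    return DataLine(float(w), float(h), float(a), glazing)


def parse_frame_line(line: str) -> Optional[Tuple[str, float]]:
    """Return (frame_type, frame_factor) for a joined line, else None."""
    m = FRAME_LINE.match(line)
    if m is None:
        return None
    return m.group(1), float(m.group(2))


joined = parse_data_line("1.22 1.76 2.15 Double pre 2002")
plain = parse_data_line("1.22 1.76 2.15")
frame = parse_frame_line("Wood 0.70")
```

When the 4th group is `None`, a caller would presumably fall back to the separate-line glazing scan; when `parse_frame_line` returns `None`, it would fall back to reading the frame type and factor from two lines.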