Commit graph

14 commits

Author SHA1 Message Date
Khalim Conn-Kowlessar
ccf7aa2118 Scaffold: end-to-end Summary→EpcPropertyData chain test for 000474 (xfail)
The 6 worksheet fixtures build EpcPropertyData by hand, validating the cascade in isolation from the mapper.
This commit lands the first half of the OTHER validation: Summary_000474.pdf → ElmhurstSiteNotesExtractor → from_elmhurst_site_notes → EpcPropertyData, asserting it produces the same shape as the hand-built fixture.
Test is strict-xfail on sap_building_parts count (mapper produces 1, cert lodges 3).
Includes a pdftotext-layout preprocessor that converts spatial label/value layout into the Textract-style sequence the existing extractor expects (test-only).
Full punch list of 28 mapper-output diffs captured in project memory.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-24 17:40:06 +00:00
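
The preprocessor mentioned in ccf7aa2118 is not shown in this listing. As a rough illustration only, the sketch below shows one way spatially laid-out text (as produced by `pdftotext -layout`) might be flattened into a linear label/value sequence. The function name `flatten_layout_text`, the two-space column-gap rule, and the sample text are all assumptions for illustration, not the commit's actual code or data.

```python
import re

# Hypothetical helper: the real preprocessor in the commit is not shown here.
# Assumption: cells on one line are separated by runs of two or more spaces.
_COLUMN_GAP = re.compile(r"\s{2,}")


def flatten_layout_text(layout_text: str) -> list[str]:
    """Flatten layout-preserved text into one token per label or value.

    Reading order is assumed to be left to right, top to bottom.
    """
    tokens: list[str] = []
    for line in layout_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip blank spacer lines between rows
        # Split the row into cells at each wide gap and keep non-empty cells.
        tokens.extend(cell for cell in _COLUMN_GAP.split(stripped) if cell)
    return tokens


# Made-up sample text, not taken from Summary_000474.pdf.
sample = (
    "Property type:        House\n"
    "\n"
    "Built form:           Mid-terrace      Storeys:   2\n"
)
tokens = flatten_layout_text(sample)
print(tokens)
```

A single space inside a label such as `Property type:` is preserved because only gaps of two or more spaces count as column boundaries. Real layouts with multi-column pages or wrapped values would likely need more careful handling than this.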
Daniel Roth
6c70c5a535 Extract address when Property photo element is missing from PDF 🟩 2026-04-30 16:25:41 +00:00
Daniel Roth
8f94bb5435 extract window frame details from elmhurst site notes 🟥 2026-04-27 15:50:25 +00:00
Daniel Roth
a8579db4d9 elmhurst site notes fixture 2026-04-24 13:09:30 +00:00
Daniel Roth
e15646c341 rename example site notes to PasHub_ and add Elmhurst example 2026-04-24 13:01:51 +00:00
Daniel Roth
146b5999dc additional fields mapped from pdf 6 🟩 2026-04-23 11:23:29 +00:00
Daniel Roth
2bf8bc6d5e additional fields mapped from pdf 5 🟩 2026-04-23 11:12:25 +00:00
Daniel Roth
c5c3f3fc83 additional fields mapped from pdf 4 🟥 2026-04-23 10:36:15 +00:00
Daniel Roth
7700aac5bb extract cylinder thermostat 🟥 2026-04-21 15:03:12 +00:00
Daniel Roth
da26e4a4cb extract no extensions 🟥 2026-04-21 10:53:14 +00:00
Daniel Roth
ac854f161a extract water heating cylinder thickness 🟥 2026-04-21 10:40:07 +00:00
Daniel Roth
0a8b9e0767 Include inspection metadata in output 2026-04-20 09:04:54 +00:00
Daniel Roth
bc527a039f site notes pdf to json 🟥 2026-04-16 14:45:28 +00:00
Daniel Roth
4f3c7894ae Map to RdSapSiteNotes from site notes JSON 🟥 2026-04-16 13:54:03 +00:00