Model/backend/documents_parser/tests/fixtures/Summary_000474.pdf
Khalim Conn-Kowlessar ccf7aa2118 Scaffold: end-to-end Summary→EpcPropertyData chain test for 000474 (xfail)
The 6 worksheet fixtures build EpcPropertyData by hand, which validates the cascade in isolation from the mapper. This commit lands the first half of the other validation: Summary_000474.pdf → ElmhurstSiteNotesExtractor → from_elmhurst_site_notes → EpcPropertyData, asserting that the chain produces the same shape as the hand-built fixture.

The test is strict-xfail on the sap_building_parts count (the mapper produces 1, the cert lodges 3). It includes a test-only pdftotext-layout preprocessor that converts the spatial label/value layout into the Textract-style sequence the existing extractor expects. The full punch list of 28 mapper-output diffs is captured in project memory.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
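
The preprocessor step described in the commit message can be pictured with a minimal sketch. This is not the code from the commit: the function name `layout_to_sequence`, the two-or-more-spaces column heuristic, and the sample text are all assumptions made for illustration. It only shows the general idea of turning `pdftotext -layout` output, where a label and its value share a line separated by padding, into one text block per entry in reading order.

```python
import re

# Assumption: columns in `pdftotext -layout` output are separated by runs of
# two or more whitespace characters, while single spaces occur inside a cell.
_COLUMN_GAP = re.compile(r"\s{2,}")


def layout_to_sequence(layout_text: str) -> list[str]:
    """Flatten spatially laid-out text into a label/value sequence.

    Hypothetical helper: each non-empty line is split on column gaps, and the
    resulting cells are emitted one after another, left to right, top to bottom.
    """
    sequence: list[str] = []
    for line in layout_text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # skip the blank lines used for vertical spacing
        sequence.extend(cell for cell in _COLUMN_GAP.split(stripped) if cell)
    return sequence


# Illustrative input only; not taken from Summary_000474.pdf.
sample = "Property type      House\n\nBuilt form         Detached   Storeys   2\n"
print(layout_to_sequence(sample))
# ['Property type', 'House', 'Built form', 'Detached', 'Storeys', '2']
```

A fixed whitespace threshold is the simplest possible choice here; a real preprocessor may need column-position awareness for values that wrap or for labels that contain wide gaps.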
2026-05-24 17:40:06 +00:00

79 KiB