Commit graph

1818 commits

Author SHA1 Message Date
Khalim Conn-Kowlessar
066dce19e3 Slice 46b: Elmhurst extractor parses windows from layout-style Summary PDFs
The legacy `_extract_windows` regex anchors on "Permanent Shutters\n" which is broken across lines by the pdftotext-layout preprocessor. New fallback `_extract_windows_from_layout` anchors on the two stable per-window markers — a "W H Area" data line and the "Manufacturer <U_value>" line a few lines further down — and tolerates the variable-order optional fields (glazing_gap, inline building_part, inline orientation) between them. Prefix/suffix tokens around the data block are re-joined into glazing_type / building_part / orientation strings.

Cert U985-0001-000474's 7 windows across Main + 2 extensions now flow through the mapper to EpcPropertyData.sap_windows (was 0). Textract-style extraction (existing fixture) is unchanged — the legacy path runs first and only falls through when its regex misses.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-24 18:03:29 +00:00
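
The two-anchor fallback described in Slice 46b can be sketched roughly as follows. This is an illustration only: the regexes, the `max_gap` window, the dictionary keys and the function names are assumptions, since the commit does not show the real Summary PDF layout or the actual `_extract_windows_from_layout` code.

```python
import re
from typing import Callable, Dict, List

# Hypothetical anchors; the real patterns in the extractor may differ.
DATA_LINE = re.compile(r"^\s*(\d+(?:\.\d+)?)\s+(\d+(?:\.\d+)?)\s+(\d+(?:\.\d+)?)\s*$")
U_VALUE_LINE = re.compile(r"Manufacturer\s+(\d+(?:\.\d+)?)")


def extract_windows_from_layout(text: str, max_gap: int = 8) -> List[Dict[str, float]]:
    """Pair each "W H Area" line with the next "Manufacturer <U_value>" line."""
    lines = text.splitlines()
    windows: List[Dict[str, float]] = []
    i = 0
    while i < len(lines):
        data = DATA_LINE.match(lines[i])
        if data:
            # Optional fields may sit between the anchors in any order,
            # so scan ahead a bounded number of lines instead of matching them.
            for j in range(i + 1, min(i + 1 + max_gap, len(lines))):
                u_value = U_VALUE_LINE.search(lines[j])
                if u_value:
                    windows.append({
                        "width": float(data.group(1)),
                        "height": float(data.group(2)),
                        "area": float(data.group(3)),
                        "u_value": float(u_value.group(1)),
                    })
                    i = j  # resume scanning after this window's block
                    break
        i += 1
    return windows


def extract_windows(text: str, legacy: Callable[[str], list]) -> list:
    """Run the legacy extractor first; fall through only when it finds nothing."""
    found = legacy(text)
    return found if found else extract_windows_from_layout(text)
```

Keeping the legacy path first is what leaves Textract-style input untouched: the fallback only runs when the original regex returns no windows.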
Khalim Conn-Kowlessar
36f2c7bbdf Slice 46a: Elmhurst mapper handles multi-bp Summary PDFs — Summary_000474 chain test flips green
ElmhurstSiteNotes had no representation for extensions: singular dimensions / walls / roof / floor fields could only describe the main bp. Summary PDFs lodge "1st Extension" / "2nd Extension" subsections in §4, §7, §8, §9 with optional "As Main: Yes" inheritance. This slice:

- Adds `ExtensionPart` dataclass and `ElmhurstSiteNotes.extensions: List[ExtensionPart]`.
- Adds `_split_section_by_bp` helper + per-bp parsing of dimensions / walls / roof / floor in the extractor; "As Main" inherits from the main bp.
- Refactors `_map_elmhurst_building_part` into a parameterised builder; adds `_map_elmhurst_building_parts` that yields Main + one SapBuildingPart per extension (capped at 4 per RdSAP10 §1.2).
- Scaffold test `test_summary_000474_mapper_produces_three_building_parts` flips from strict-xfail to passing.

Single-bp behaviour is unchanged (empty extensions list defaults). 752 existing tests stay green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-24 17:55:13 +00:00
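
A minimal sketch of the per-building-part split and "As Main" inheritance from Slice 46a, under stated assumptions: the heading pattern, the `key: value` line format and all function names here are hypothetical, and the cap is read as four extensions, which the commit message does not spell out.

```python
import re
from typing import Dict, List

# Hypothetical heading pattern for "Main", "1st Extension", "2nd Extension", ...
BP_HEADING = re.compile(r"^(Main|\d+(?:st|nd|rd|th) Extension)$")
MAX_EXTENSIONS = 4  # assumed reading of the RdSAP10 §1.2 cap quoted in the commit


def split_section_by_bp(section: str) -> Dict[str, List[str]]:
    """Group a section's non-empty lines under the building part heading above them."""
    parts: Dict[str, List[str]] = {}
    current = "Main"  # lines before any heading are attributed to the main part
    for line in section.splitlines():
        stripped = line.strip()
        if BP_HEADING.match(stripped):
            current = stripped
            parts.setdefault(current, [])
        elif stripped:
            parts.setdefault(current, []).append(stripped)
    return parts


def parse_fields(lines: List[str]) -> Dict[str, str]:
    """Parse "key: value" lines into a dict, ignoring lines without a colon."""
    fields: Dict[str, str] = {}
    for line in lines:
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def resolve_parts(section: str) -> Dict[str, Dict[str, str]]:
    """Return fields per building part, copying Main's values when "As Main: Yes"."""
    raw = split_section_by_bp(section)
    main = parse_fields(raw.get("Main", []))
    resolved = {"Main": main}
    extensions = [name for name in raw if name != "Main"][:MAX_EXTENSIONS]
    for name in extensions:
        fields = parse_fields(raw[name])
        if fields.get("As Main") == "Yes":
            fields = {**main, **fields}  # extension-specific values win over inherited ones
        resolved[name] = fields
    return resolved
```

With no extension headings present the result contains only `"Main"`, which mirrors the unchanged single-bp behaviour the commit describes.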
Khalim Conn-Kowlessar
ccf7aa2118 Scaffold: end-to-end Summary→EpcPropertyData chain test for 000474 (xfail)
The 6 worksheet fixtures build EpcPropertyData by hand, validating the cascade in isolation from the mapper. This commit lands the first half of the OTHER validation: Summary_000474.pdf → ElmhurstSiteNotesExtractor → from_elmhurst_site_notes → EpcPropertyData, asserting it produces the same shape as the hand-built fixture. Test is strict-xfail on sap_building_parts count (mapper produces 1, cert lodges 3). Includes a pdftotext-layout preprocessor that converts spatial label/value layout into the Textract-style sequence the existing extractor expects (test-only). Full punch list of 28 mapper-output diffs captured in project memory.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-24 17:40:06 +00:00
Khalim Conn-Kowlessar
883028c89e P6.1 follow-on: unbox BuildingPartIdentifier at backend boundaries
Threads the strict BuildingPartIdentifier type (introduced in a8b443f6)
through the two remaining backend touchpoints:

- EpcBuildingPartModel.from_*: SQLModel column expects a string, so
  unbox the enum with .identifier.value before binding to the DB.
- documents_parser end-to-end tests: swap bare-string equality
  ("main" / "extension_1") for identity checks against the enum
  members (BuildingPartIdentifier.MAIN / EXTENSION_1).

Documents_parser test pack passes (105/105). No dedicated SQLModel test
covers EpcBuildingPartModel.from_*; the .value line is exercised
transitively via db_writer.py / local_runner.py in production.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-20 09:58:23 +00:00
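
The unboxing step in the P6.1 follow-on can be illustrated with plain dataclasses standing in for the domain object and the SQLModel row. Only the enum member names and the `"main"` / `"extension_1"` strings come from the commit message; `BuildingPart`, `BuildingPartRow` and `from_domain` are hypothetical stand-ins.

```python
from dataclasses import dataclass
from enum import Enum


class BuildingPartIdentifier(Enum):
    MAIN = "main"
    EXTENSION_1 = "extension_1"


@dataclass
class BuildingPart:
    """Stand-in for the domain object carrying the strict enum type."""
    identifier: BuildingPartIdentifier


@dataclass
class BuildingPartRow:
    """Stand-in for a DB model whose column stores a plain string."""
    identifier: str

    @classmethod
    def from_domain(cls, part: BuildingPart) -> "BuildingPartRow":
        # Unbox at the boundary: the column gets the enum's value, not the enum.
        return cls(identifier=part.identifier.value)


row = BuildingPartRow.from_domain(BuildingPart(BuildingPartIdentifier.MAIN))
```

On the test side, the matching change is to compare with `is` against the enum member rather than with `==` against a bare string.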
Khalim Conn-Kowlessar
0ffda529ec slice 15a: add wall/floor/roof + demand scalar features for retrofit simulation
15 new features wired through schema -> domain -> mapper -> transform:

Main Dwelling fabric (11):
  - wall_insulation_type, wall_insulation_thickness_mm, wall_dry_lined,
    wall_thickness_mm, party_wall_construction
  - roof_insulation_location, roof_insulation_thickness_mm
  - floor_construction, floor_insulation, floor_insulation_thickness_mm,
    floor_heat_loss

Dwelling-level scalars (4):
  - multiple_glazed_proportion, number_baths, number_baths_wwhrs,
    extract_fans_count

Thickness strings like '50mm'/'NI'/'ND' parsed via _parse_thickness_mm; NI
(no insulation) lands as 0mm so the model sees the physical zero rather than
a missing value. Categorical sentinels ('NA'/'NI'/'ND') become None.

Also fixed long-standing typo `multiple_glazed_propertion` -> `_proportion`
in domain dataclass + its lone DB-model usage.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 22:08:27 +00:00
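
The sentinel handling described in slice 15a might look something like the sketch below. It is not the repository's `_parse_thickness_mm`: the regex, the float return type and the choice to map `'ND'` and any unparseable thickness to `None` are assumptions, and `parse_categorical` is a hypothetical helper name.

```python
import re
from typing import Optional

_MM = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*mm\s*$", re.IGNORECASE)
_CATEGORICAL_SENTINELS = {"NA", "NI", "ND"}


def parse_thickness_mm(raw: Optional[str]) -> Optional[float]:
    """Parse strings like '50mm'; 'NI' (no insulation) becomes a physical 0."""
    if raw is None:
        return None
    if raw.strip().upper() == "NI":
        return 0.0
    match = _MM.match(raw)
    # Assumption: 'ND' and anything unparseable are treated as missing.
    return float(match.group(1)) if match else None


def parse_categorical(raw: Optional[str]) -> Optional[str]:
    """Map the categorical sentinels 'NA' / 'NI' / 'ND' to None."""
    if raw is None or raw.strip().upper() in _CATEGORICAL_SENTINELS:
        return None
    return raw.strip()
```

The asymmetry is deliberate in the commit's description: for a thickness, "no insulation" is a real measurement of zero, whereas for a categorical field the same token carries no usable category.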
Jun-te Kim
27b5602608 remove pandas 2026-05-13 15:08:06 +00:00
Jun-te Kim
51460d1cd3 route at the beginning 2026-05-13 14:47:24 +00:00
Jun-te Kim
54864bf102 resolve merge conflict 2026-05-13 14:22:04 +00:00
Jun-te Kim
2fb6a99956 throttle added 2026-05-13 14:02:36 +00:00
Jun-te Kim
ff4ad07a2b retry 2026-05-13 11:41:21 +00:00
Jun-te Kim
c347865b9e retry 2026-05-13 09:34:51 +00:00
Khalim Conn-Kowlessar
566c70077a removing redundant code 2026-05-13 08:40:51 +00:00
Khalim Conn-Kowlessar
3fd7321337 remove comment 2026-05-13 08:40:51 +00:00
Jun-te Kim
3bcb94f9e5 Merge branch 'main' into feature/integrate_new_epc_with_historical_epc 2026-05-13 08:38:50 +00:00
Jun-te Kim
09dbfe2106 fix dependency issue 2026-05-12 17:03:16 +00:00
Jun-te Kim
e458f0a2b7 task and sub tasks improved 2026-05-12 16:24:11 +00:00
Jun-te Kim
dfc100f78b rank address similarity 2026-05-12 16:02:01 +00:00
Daniel Roth
5f77fbf4e4 Fetch all pages in get_plans pagination loop 🟪 2026-05-12 14:54:14 +00:00
Daniel Roth
0d324f99b2 Fetch all pages in get_plans pagination loop 🟩 2026-05-12 14:52:46 +00:00
Daniel Roth
6dfca082f8 Fetch all pages in get_plans pagination loop 🟥 2026-05-12 14:52:31 +00:00
Daniel Roth
f83ddd05a8 Paginate get_plans to return flat list[PlanSummary] 🟩 2026-05-12 14:46:00 +00:00
Daniel Roth
62acc3ce98 Paginate get_plans to return flat list[PlanSummary] 🟥 2026-05-12 14:45:09 +00:00
Daniel Roth
8727a78f8b correct magic plan url paths 🟩 2026-05-12 14:33:58 +00:00
Daniel Roth
beaf21fdcc correct magic plan url paths 2026-05-12 14:32:37 +00:00
Daniel Roth
75d0313934 fix broken magicplan handler tests 2026-05-12 14:14:37 +00:00
Daniel Roth
04df924146 fix local invoker 2026-05-12 14:13:12 +00:00
Jun-te Kim
8b27a5173b fix typo for rate limit error 2026-05-12 14:05:30 +00:00
Daniel Roth
3df726937e Remove unused _api_key instance variable now auth is fully header-based 🟪 2026-05-12 14:03:07 +00:00
Daniel Roth
eb381a778c _fetch_plan() sends no API key query parameter 🟩 2026-05-12 14:02:17 +00:00
Daniel Roth
7752039dbd _fetch_plan() sends no API key query parameter 🟥 2026-05-12 14:01:40 +00:00
Daniel Roth
20b32bcda0 get_plans() sends no API key query parameter 🟩 2026-05-12 14:01:35 +00:00
Daniel Roth
ffcff33dd4 get_plans() sends no API key query parameter 🟥 2026-05-12 14:00:07 +00:00
Daniel Roth
d59bf2d7cb Set API key as session header on MagicPlanClient construction 🟩 2026-05-12 13:59:33 +00:00
Daniel Roth
da4f5f44c0 Set API key as session header on MagicPlanClient construction 🟥 2026-05-12 13:58:16 +00:00
Daniel Roth
a672c0dea0 add localhandler for testing and update requirements 2026-05-12 13:51:46 +00:00
Jun-te Kim
27f2ef5e83 get rid of duplicate function and use a more sensible variable name 2026-05-12 13:46:02 +00:00
Jun-te Kim
b0e935d497 use a sensible name for the address column in df 2026-05-12 13:43:12 +00:00
Jun-te Kim
46ec68e5db save match building number 2026-05-12 13:41:59 +00:00
Jun-te Kim
5cd21d8522 get rid of khalim's json 2026-05-12 12:55:50 +00:00
Jun-te Kim
35d191c70e merged from main and resolved pytest.ini conflict 2026-05-12 12:54:28 +00:00
Jun-te Kim
18ea95b67d added env variables for boto 2026-05-12 12:34:17 +00:00
Jun-te Kim
35fea20fc7 changed function name 2026-05-12 10:54:45 +00:00
Jun-te Kim
bec5c4f3c3 one place to have df_has_single_uprn 2026-05-12 10:51:27 +00:00
Jun-te Kim
b364df89ad forgot to add tuple typing 2026-05-12 10:31:54 +00:00
Jun-te Kim
f52fe001cc renamed file 2026-05-12 10:14:16 +00:00
Jun-te Kim
8635e2a1aa change file name of epc client service 2026-05-12 10:08:00 +00:00
Jun-te Kim
2c5c8337cc added more type hints 2026-05-12 10:01:25 +00:00
Jun-te Kim
e06ead55d0 add more type hint 2026-05-12 09:48:21 +00:00
Jun-te Kim
b72d5fbf42 fix nitpick 2026-05-12 09:43:40 +00:00
Jun-te Kim
c22528299c added type hinting to uprn 2026-05-12 09:40:12 +00:00