Model/packages
Khalim Conn-Kowlessar 035d916dd6 Slice 70: cohort 000474 mapper-vs-hand-built diff is GREEN
Closes the final 49 → 0 diffs in two moves:

1. **Filter non-load-bearing SapWindow sub-fields from the diff.** The
   Elmhurst mapper surfaces Summary §11 strings (window_type='Window',
   glazing_type='Double between 2002 and 2021', glazing_gap='12 mm',
   data_source='Manufacturer', permanent_shutters_present='None')
   while the cohort `make_window` helper produces API-style int codes
   for the same fields. None of these affect the SAP cascade, which
   reads only window_width, window_height, orientation,
   window_location, frame_factor, and
   window_transmission_details.{u_value, solar_transmittance}.
   Adding `_NON_LOAD_BEARING_WINDOW_SUBFIELDS` and
   `_is_excluded_path` to the diff helper drops them from the
   comparison without changing the load-bearing scope. This follows
   the user's earlier "load-bearing only" decision: encoding noise
   that doesn't change the cascade output is excluded.

2. **`make_window` helper now defaults `frame_factor=0.7`.** This is the
   SAP10.2 Table 6c PVC default (and the modal value the Elmhurst
   mapper surfaces from Summary §11). Previously the helper left it
   `None`, which the cascade resolves to 0.7 internally; setting it
   explicitly is cascade-equivalent and closes the last 7 diffs.

Diff count for cohort 000474:
  Slice 63 baseline:    50
  Slice 64 (Cat A):     14
  Slice 65 (HW):        12
  Slice 66+67 (mapper):  5
  Slice 68 (party-wall): 1
  Slice 69 (windows):   49 (encoding-noise surface)
  Slice 70 (filter):     **0** — diff test now GREEN

`test_from_elmhurst_site_notes_matches_hand_built_000474` PASSES.
First cohort cert fully validated at the EpcPropertyData
load-bearing-field level. All 66 cohort cascade pins remain GREEN at
1e-4. Pyright net-zero (0 errors on touched files).

Next slices: parametrize the diff test over the 5 other cohort
certs (000477, 000480, 000487, 000490, 000516) — each may have
its own bulk-update + mapper-tweak pattern, but the toolchain
(diff helper, exclusion list, _LOAD_BEARING_FIELDS, helper
defaults) is in place. Then 001479 (after Slice 62 hand-built
hits 1e-4). Then the API mapper diff test (currently the API
mapper has its own gaps — Slice 58/59/60 cascade fixes closed
golden cert residuals but field-level cross-mapper parity isn't
asserted yet).
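
One possible shape for the parametrized diff test, sketched with `pytest.mark.parametrize`. The cert IDs are the ones listed above. `load_mapped`, `load_hand_built`, and the flat dict comparison are hypothetical placeholders for the real mapper call, cohort fixtures, and diff helper.

```python
import pytest

# The five remaining cohort certs named in this commit (000474 is already GREEN).
REMAINING_COHORT_CERTS = ["000477", "000480", "000487", "000490", "000516"]


def load_mapped(cert_id: str) -> dict:
    """Hypothetical stand-in for running the Elmhurst mapper on a cert's site notes."""
    return {"cert_id": cert_id, "windows": []}


def load_hand_built(cert_id: str) -> dict:
    """Hypothetical stand-in for the cohort's hand-built fixture."""
    return {"cert_id": cert_id, "windows": []}


@pytest.mark.parametrize("cert_id", REMAINING_COHORT_CERTS)
def test_from_elmhurst_site_notes_matches_hand_built(cert_id: str) -> None:
    mapped = load_mapped(cert_id)
    hand_built = load_hand_built(cert_id)
    # Placeholder comparison: the real test would call the diff helper with
    # its exclusion list and load-bearing field set instead.
    diffs = {
        key: (mapped.get(key), hand_built.get(key))
        for key in set(mapped) | set(hand_built)
        if mapped.get(key) != hand_built.get(key)
    }
    assert not diffs, f"{cert_id}: {len(diffs)} load-bearing diffs: {diffs}"
```

Keeping the cert ID as the only parameter means each cert reports as its own test case, so a cert that still needs its own bulk-update or mapper tweak fails in isolation.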

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-25 17:09:39 +00:00
domain Slice 70: cohort 000474 mapper-vs-hand-built diff is GREEN 2026-05-25 17:09:39 +00:00
fetchers added potential file scaffolding: 2026-05-15 10:56:53 +00:00
repos added potential file scaffolding: 2026-05-15 10:56:53 +00:00
utils added potential file scaffolding: 2026-05-15 10:56:53 +00:00
README.md added potential file scaffolding: 2026-05-15 10:56:53 +00:00

Shared packages

Workspace packages consumed by services/*. Each package is its own Python distribution with its own pyproject.toml; services import via the workspace dependency mechanism ({ workspace = true }).

| Package | Purpose |
| --- | --- |
| domain/ | Shared domain types: Property, BaselinePerformance, Plan, Scenario, EpcPropertyData, etc. No persistence, no IO, no business logic. |
| repos/ | Persistence layer: one repo per aggregate. Owns the SQL. Depends on domain. |
| fetchers/ | External API clients (gov EPC, Ofgem, Google Solar, etc.). Depend on domain for response shapes. |
| utils/ | Cross-cutting infra: logging, S3, CloudWatch URL builders, SQS task helpers. |

Adding a new shared package

Only when a real second consumer materialises. Don't pre-shatter (repos-epc, repos-property, ...) — split when a deployment needs to drop a dep, not before.

See ../ara_backend_design.md §11 for the broader monorepo layout and ../CONTEXT.md for the domain glossary that names the types living in domain/.