Model/backend/condition
domain start making changes to deploy handler 2026-02-04 12:16:47 +00:00
handler including environment in vars not environment_vars 2026-02-06 16:04:57 +00:00
lookups plug everything into in handler 2026-02-04 15:14:26 +00:00
parsing plug everything into in handler 2026-02-04 15:14:26 +00:00
persistence Local postgres within devcontainer, plus fixes so data loads into db 2026-01-29 16:11:16 +00:00
tests fix broken tests 2026-02-04 14:59:16 +00:00
utils handle dates as strings 2026-01-19 16:41:13 +00:00
__init__.py Define simple local runner 2026-01-16 17:28:28 +00:00
condition_trigger_request.py plug everything into in handler 2026-02-04 15:14:26 +00:00
local_runner.py get file type from request body 2026-02-04 14:29:43 +00:00
processor.py get file type from request body 2026-02-04 14:29:43 +00:00
README.md update readme 2026-01-30 09:50:33 +00:00

Condition Data Processor

The Condition Data Processor performs the following steps:

  • Extract

    • Ingest client Condition Survey data files (currently from local files; future support planned for S3 and internal survey sources)
    • Parse input files into Data Transfer Objects (DTOs)
  • Transform

    • Map source data into the internal domain data model
  • Load

    • Persist transformed data into the ARA database (not yet implemented)

The processor currently supports file formats provided by Peabody and LBWF.
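
The Extract / Transform / Load steps above can be sketched roughly as follows. This is an illustrative outline only, not the processor's actual code: the class names (SurveyRowDTO, DomainAspect), the column names, and the in-memory "store" standing in for the database are all assumptions made for the example.

```python
import csv
import io
from dataclasses import dataclass


@dataclass(frozen=True)
class SurveyRowDTO:
    """Hypothetical DTO: one raw row from a survey file."""
    uprn: str
    element: str
    sub_element: str


@dataclass(frozen=True)
class DomainAspect:
    """Hypothetical domain record produced by the transform step."""
    uprn: str
    element_type: str
    aspect_type: str


def extract(text: str) -> list[SurveyRowDTO]:
    # Parse CSV text into DTOs (column names are assumed for illustration).
    reader = csv.DictReader(io.StringIO(text))
    return [SurveyRowDTO(r["uprn"], r["element"], r["sub_element"]) for r in reader]


def transform(rows: list[SurveyRowDTO]) -> list[DomainAspect]:
    # Map each DTO onto the (hypothetical) domain model.
    return [DomainAspect(r.uprn, r.element.title(), r.sub_element.title()) for r in rows]


def load(aspects: list[DomainAspect], store: list[DomainAspect]) -> int:
    # Stand-in for persistence: append to an in-memory list and report the count.
    store.extend(aspects)
    return len(aspects)


sample = "uprn,element,sub_element\n100,WALLS,wall finish\n"
store: list[DomainAspect] = []
loaded = load(transform(extract(sample)), store)
```

Keeping the three stages as separate functions means each can be tested in isolation, and the load step can be swapped for a real database writer later without touching parsing or mapping.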


Running Locally

The local_runner.py script runs the processor in a local environment.

  1. Copy sample input file(s) into the sample_data/ directory. If working with Peabody data, you'll need the Landlord Reference / UPRN lookup file as well.
  2. Update local_runner.py as required, specifically the definitions of:
    • lbwf_path
    • peabody_path
    • file_paths
  3. Run local_runner.py.
    Breakpoints may be added and the script run in debug mode if required.
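
As a rough guide to step 2, the three definitions might look something like the sketch below. The file names are placeholders, and the assumption that file_paths is a list of the inputs to process is a guess for illustration; check local_runner.py for the actual shapes. The missing_inputs helper is hypothetical and simply flags files that have not yet been copied into sample_data/.

```python
from pathlib import Path

# Assumed layout: sample files live in sample_data/ relative to the working directory.
SAMPLE_DIR = Path("sample_data")

# Placeholder file names, not real files from the repository.
lbwf_path = SAMPLE_DIR / "lbwf_sample.csv"
peabody_path = SAMPLE_DIR / "peabody_sample.csv"

# Assumption: file_paths lists the inputs the runner should process.
file_paths = [lbwf_path, peabody_path]


def missing_inputs(paths: list[Path]) -> list[Path]:
    # Hypothetical helper: return the paths that do not exist on disk.
    return [p for p in paths if not p.exists()]
```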

Known Data Issues

Some inconsistencies exist in the source datasets, primarily involving multiple representations of the same logical element within a single file. In these cases, assumptions have been made in order to normalise the data into the internal domain model.

Peabody Data Wall Finish Mapping

In the original Peabody sample dataset, multiple Element/Sub-Element combinations correspond to wall finishes:

Element_Code  Element   Sub_Element_Code  Sub_Element
53            External  23                Primary Wall Finish
53            External  30                Secondary Wall Finish
120           WALLS     2                 Wall Finish

A single property may contain records for all three combinations, and each combination may appear multiple times.

For example, the property at 55 Burnaby Street, London contains entries for all three of the above combinations. However, it contains only a single entry for “WALLS: Wall structure”, indicating that the property has one wall structure rather than several.

This pattern is also observed in other sampled properties. Based on this, the following assumption is applied:

  • “Secondary” refers to a secondary finish, not a secondary wall.

As a result:

  • The property is mapped to a single Wall element.
  • That Wall element is assigned three Finish aspects:
    • Two with aspect_instance = 1
    • One with aspect_instance = 2

Because two of the Finish aspects share aspect_instance = 1, the combination of
UPRN / ElementType / ElementInstance / AspectType / AspectInstance
is not guaranteed to be unique.
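
The non-uniqueness can be illustrated with a small sketch. The FinishAspect class, the ASPECT_INSTANCE table, and the UPRN value are hypothetical. In particular, the source does not state which two combinations receive aspect_instance = 1; this example assumes they are "Primary Wall Finish" and the generic "Wall Finish".

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class FinishAspect:
    """Hypothetical record; field names mirror the key described above."""
    uprn: str
    element_type: str
    element_instance: int
    aspect_type: str
    aspect_instance: int
    source_sub_element: str  # kept for traceability, not part of the key


# (Element_Code, Sub_Element_Code) -> aspect_instance.
# Assumption: Primary and generic Wall Finish map to 1, Secondary to 2.
ASPECT_INSTANCE = {(53, 23): 1, (53, 30): 2, (120, 2): 1}

rows = [
    (53, 23, "Primary Wall Finish"),
    (53, 30, "Secondary Wall Finish"),
    (120, 2, "Wall Finish"),
]

# All three rows attach to the same single Wall element (element_instance = 1).
aspects = [
    FinishAspect("UPRN-1", "Wall", 1, "Finish", ASPECT_INSTANCE[(e, s)], name)
    for e, s, name in rows
]


def key(a: FinishAspect) -> tuple:
    # The five-part combination that might be expected to be unique.
    return (a.uprn, a.element_type, a.element_instance, a.aspect_type, a.aspect_instance)


counts = Counter(key(a) for a in aspects)
duplicates = {k: n for k, n in counts.items() if n > 1}
```

Here duplicates contains one key that occurs twice, so any downstream code that treats the five-part combination as a primary key would need an extra discriminator (for example, the source sub-element).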

LBWF Data Wall Finish Mapping

In the LBWF dataset, the following element codes map to wall finishes:

  • EXTWALLFN1
  • EXTWALLFN2

As with the Peabody data, these are mapped to multiple instances of the Finish aspect on a single Wall element.
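
One possible way to derive the aspect instance from an LBWF element code is sketched below. It assumes the numeric suffix of the code (1 or 2) is used directly as aspect_instance, which the source does not confirm; the function name lbwf_finish_instance is hypothetical.

```python
import re
from typing import Optional

# Matches the wall-finish codes listed above, capturing the numeric suffix.
_EXT_WALL_FINISH = re.compile(r"EXTWALLFN(\d+)")


def lbwf_finish_instance(element_code: str) -> Optional[int]:
    # Assumption: the suffix doubles as the Finish aspect_instance.
    # Returns None for codes that are not wall finishes.
    match = _EXT_WALL_FINISH.fullmatch(element_code)
    return int(match.group(1)) if match else None
```

For example, lbwf_finish_instance("EXTWALLFN2") yields 2, while an unrelated code yields None and can be routed to other mappings.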