# Condition Data Processor

The Condition Data Processor performs the following steps:

- **Extract**
  - Ingest client Condition Survey data files (currently from local files; future support planned for S3 and internal survey sources)
  - Parse input files into Data Transfer Objects (DTOs)
- **Transform**
  - Map source data into the internal domain data model
- **Load**
  - Persist transformed data into the ARA database (not yet implemented)

The processor currently supports file formats provided by **Peabody** and **LBWF**.

---

## Running Locally

The `local_runner` script allows the processor to be executed in a local environment.

1. Copy sample input file(s) into the `sample_data/` directory. If working with Peabody data, you'll need the Landlord Reference / UPRN lookup file as well.
2. Update `local_runner.py` as required, specifically the definitions of:
   - `lbwf_path`
   - `peabody_path`
   - `file_paths`
3. Run `local_runner.py`. Breakpoints may be added and the script run in debug mode if required.

---

## Known Data Issues

Some inconsistencies exist in the source datasets, primarily involving multiple representations of the same logical element within a single file. In these cases, assumptions have been made in order to normalise the data into the internal domain model.

### Peabody Data – Wall Finish Mapping

In the original Peabody sample dataset, multiple Element/Sub-Element combinations correspond to wall finishes:

| Element_Code | Element  | Sub_Element_Code | Sub_Element           |
|--------------|----------|------------------|-----------------------|
| 53           | External | 23               | Primary Wall Finish   |
| 53           | External | 30               | Secondary Wall Finish |
| 120          | WALLS    | 2                | Wall Finish           |

A single property may contain records for all three combinations, and each combination may appear multiple times.

For example, the property at **55 Burnaby Street, London** contains entries for all three of the above combinations. However, it contains only a single entry for *“WALLS: Wall structure”*, indicating that the property has only one structure rather than multiple. This pattern is also observed in other sampled properties.

Based on this, the following assumption is applied:

- “Secondary” refers to a secondary **finish**, not a secondary **wall**.

As a result:

- The property is mapped to a single Wall element.
- That Wall element is assigned three Finish aspects:
  - Two with `aspect_instance = 1`
  - One with `aspect_instance = 2`

This means that the combination of `UPRN / ElementType / ElementInstance / AspectType / AspectInstance` is **not guaranteed to be unique**.

### LBWF Data – Wall Finish Mapping

In the LBWF dataset, the following element codes map to wall finishes:

- `EXTWALLFN1`
- `EXTWALLFN2`

These are similarly mapped as multiple instances of the **Finish** aspect for a single Wall element.

---
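The wall finish mapping described under Known Data Issues can be illustrated with a minimal Python sketch. It is not the processor's actual code: the `Element` and `Aspect` classes, the lookup tables, and the function names are all hypothetical. Three choices are assumptions rather than facts from the source data: the generic "Wall Finish" shares instance 1 with "Primary Wall Finish", the LBWF instance follows the numeric suffix of the code, and element codes are integers.

```python
from dataclasses import dataclass, field

# Hypothetical lookup: (Element_Code, Sub_Element_Code) -> aspect_instance.
# Assumption: primary and generic finishes map to 1, secondary maps to 2.
PEABODY_WALL_FINISH_INSTANCE = {
    (53, 23): 1,   # External / Primary Wall Finish
    (53, 30): 2,   # External / Secondary Wall Finish
    (120, 2): 1,   # WALLS / Wall Finish
}

# Hypothetical lookup for LBWF codes; instance assumed from the suffix.
LBWF_WALL_FINISH_INSTANCE = {
    "EXTWALLFN1": 1,
    "EXTWALLFN2": 2,
}


@dataclass(frozen=True)
class Aspect:
    aspect_type: str
    aspect_instance: int
    source_code: str  # which source record produced this aspect


@dataclass
class Element:
    uprn: str
    element_type: str
    element_instance: int
    aspects: list = field(default_factory=list)


def map_wall_finishes(uprn, source_codes, lookup):
    """Attach every recognised finish record to one Wall element."""
    wall = Element(uprn=uprn, element_type="Wall", element_instance=1)
    for code in source_codes:
        instance = lookup.get(code)
        if instance is None:
            continue  # not a wall finish record; ignored in this sketch
        wall.aspects.append(Aspect("Finish", instance, str(code)))
    return wall


def map_peabody_wall_finishes(uprn, records):
    """records: iterable of (Element_Code, Sub_Element_Code) pairs."""
    return map_wall_finishes(uprn, records, PEABODY_WALL_FINISH_INSTANCE)


def map_lbwf_wall_finishes(uprn, element_codes):
    """element_codes: iterable of LBWF element code strings."""
    return map_wall_finishes(uprn, element_codes, LBWF_WALL_FINISH_INSTANCE)


def composite_keys(element):
    """UPRN / ElementType / ElementInstance / AspectType / AspectInstance."""
    return [
        (element.uprn, element.element_type, element.element_instance,
         a.aspect_type, a.aspect_instance)
        for a in element.aspects
    ]


if __name__ == "__main__":
    # Placeholder UPRN; one record for each of the three combinations.
    wall = map_peabody_wall_finishes(
        "UPRN-PLACEHOLDER", [(53, 23), (53, 30), (120, 2)]
    )
    keys = composite_keys(wall)
    print(len(keys), len(set(keys)))  # prints: 3 2
```

Three aspects produce only two distinct composite keys, which is why downstream code should not treat that combination as a unique identifier.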