Model/backend/address2UPRN
2026-05-12 09:37:23 +00:00
handler only in docker build 2026-02-12 18:52:11 +00:00
tests refactored test to deal with flats better 2026-05-11 16:23:03 +00:00
__init__.py need to download grade pydantic 2026-01-20 20:12:37 +00:00
main.py added imports at the top of the file instead of function 2026-05-12 09:37:23 +00:00
README.md added coordination comments 2026-04-08 14:47:31 +00:00
scoring.py demo generated for use in address2uprn 2026-05-08 14:48:15 +00:00
script.py added commnets on script 2026-02-16 14:12:09 +00:00

So you want to fetch UPRN for an address list?

Before you run:

Step 1) Get the list and ensure the following columns exist.

I believe column names are case-sensitive, so match the capitalization below exactly:

  • Address 1
  • Address 2
  • Address 3
  • postcode

Save the list as a .csv file.
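A quick check of the header row before uploading can save a failed run. The sketch below is only an illustration: it assumes the four column names listed above are the complete required set and that they must match exactly, which this README states only as a belief. The function name `missing_columns` and the sample row are made up for the example.

```python
import csv
import io

# Column names copied from the list above; capitalization is kept as-is
# because the pipeline is believed (not confirmed) to be case-sensitive.
REQUIRED_COLUMNS = ["Address 1", "Address 2", "Address 3", "postcode"]


def missing_columns(csv_file):
    """Return the required columns that are absent from the CSV header row."""
    reader = csv.reader(csv_file)
    header = next(reader, [])  # empty file -> empty header
    present = {name.strip() for name in header}
    return [col for col in REQUIRED_COLUMNS if col not in present]


# Hypothetical sample with a wrongly capitalized "Postcode" column.
sample = io.StringIO("Address 1,Address 2,Address 3,Postcode\n1 High St,,,AB1 2CD\n")
print(missing_columns(sample))  # ['postcode']
```

For a real file, pass an open handle instead of the sample, for example `open(path, newline="", encoding="utf-8-sig")`, so that a byte-order mark from Excel does not end up inside the first column name.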

Step 2)

Before we run this, we need to upload the file to S3 and create a task and a sub-task.

  • S3 upload: I recommend somewhere in retrofit-data-dev. Note the resulting s3_uri.

For this example I'll be using "s3://retrofit-data-dev/ara_raw_inputs/calico/Calico Homes Full list EPC Properties(Sheet2) (1) (1).csv"
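If you would rather script the upload than use the console, something like the following should work. It is a sketch, assuming boto3 is installed and your AWS credentials can write to the bucket. The helper names and the file name `my_list.csv` are placeholders, not part of this repo.

```python
def build_s3_uri(bucket, key):
    """Join a bucket name and an object key into an s3:// URI."""
    return f"s3://{bucket}/{key.lstrip('/')}"


def upload_csv(local_path, bucket, key):
    """Upload a local file to S3 and return its s3_uri."""
    import boto3  # imported lazily so build_s3_uri works without boto3 installed

    boto3.client("s3").upload_file(local_path, bucket, key)
    return build_s3_uri(bucket, key)


# The key prefix mirrors the example above; "my_list.csv" is a placeholder name.
print(build_s3_uri("retrofit-data-dev", "ara_raw_inputs/calico/my_list.csv"))
```

Calling `upload_csv("my_list.csv", "retrofit-data-dev", "ara_raw_inputs/calico/my_list.csv")` would perform the actual upload and return the s3_uri to use in the later steps.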

Go to Ara DB and create a new task_id, using a randomly generated UUID as the primary key.

task_id = ea615ac3-ac28-46c4-8bff-2431c5b9c13d
sub_task_id = 85a23b67-8f18-4299-9bf0-69bfb87adbc7
s3 => s3://retrofit-data-dev/ara_raw_inputs/eon/North Tyneside Council.csv
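The IDs themselves can be generated with Python's standard library. This is a minimal sketch that only produces the two random UUIDs. How the rows are inserted into Ara DB is not shown, because the table layout is not described here. `new_ids` is a made-up helper name.

```python
import uuid


def new_ids():
    """Generate a random (version 4) task_id and sub_task_id pair."""
    return {"task_id": str(uuid.uuid4()), "sub_task_id": str(uuid.uuid4())}


ids = new_ids()
print(f"task_id = {ids['task_id']}")
print(f"sub_task_id = {ids['sub_task_id']}")
```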

Step 3) Now let's build the input message for the postcode-splitter SQS queue to get the ball rolling.

postcode-splitter-sqs => https://eu-west-2.console.aws.amazon.com/sqs/v3/home?region=eu-west-2#/queues/https%3A%2F%2Fsqs.eu-west-2.amazonaws.com%2F337213553626%2Fpostcode-splitter-queue-dev

{ "sub_task_id": "c5afbd49-f0cd-4930-82bf-bafc5243a34a", "task_id": "67a4b3f0-cc7a-4e8a-b314-deb783e0eedb", "s3_uri": "s3://retrofit-data-dev/ara_raw_inputs/eon/pickering ferens/Pickering Ferens - SHDF W3 Post Bid Stage MDS - Template - Vr4(in).csv" }
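The same message can be sent from Python instead of pasting it into the console. The sketch below assumes boto3 is installed, that your credentials may send to the queue, and that the three fields shown above are all the message needs. The queue URL is the one percent-encoded inside the console link. The function names are illustrative.

```python
import json

# Queue URL decoded from the console link above.
QUEUE_URL = "https://sqs.eu-west-2.amazonaws.com/337213553626/postcode-splitter-queue-dev"


def build_message(task_id, sub_task_id, s3_uri):
    """Serialize the three fields carried by the example message."""
    return json.dumps({"sub_task_id": sub_task_id, "task_id": task_id, "s3_uri": s3_uri})


def send_message(body, queue_url=QUEUE_URL):
    """Send the JSON body to SQS; needs boto3 and valid AWS credentials."""
    import boto3  # imported lazily so build_message works without boto3 installed

    sqs = boto3.client("sqs", region_name="eu-west-2")
    return sqs.send_message(QueueUrl=queue_url, MessageBody=body)


# Placeholder values; substitute the IDs and s3_uri from Step 2.
body = build_message("<task_id>", "<sub_task_id>", "s3://retrofit-data-dev/<key>.csv")
print(body)
```

Only `send_message(body)` touches AWS, so the body can be printed and checked first.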

Each CSV batch should be saved in retrofit-data-dev/ara_postcode_splitter_batches///timestamp:uuid4.csv

Outputs of address2uprn (which is triggered automatically by postcode-splitter) will be saved in retrofit-data-dev/ara_raw_outputs///timestamp:uuid4.csv

Run the script in backend/scripts/combine_address2uprn_outputs.py with . This combines all the address2uprn output files into one big file.

Find out which rows have a missing UPRN, save those as a separate sheet, and upload it somewhere in s3://retrofit-data-dev

I uploaded the missing uprn here: s3://retrofit-data-dev/ara_raw_inputs/calico/missinguprn.csv
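Pulling out the rows without a UPRN could look roughly like this. It is a sketch under two assumptions that are not confirmed here: that the combined file has a column named `uprn`, and that a missing value shows up as a blank cell. Adjust `uprn_column` to whatever the combined file really uses. The sample rows are invented.

```python
import csv
import io


def split_missing_uprn(rows, uprn_column="uprn"):
    """Split rows into (matched, missing) by whether the UPRN cell is blank."""
    matched, missing = [], []
    for row in rows:
        value = (row.get(uprn_column) or "").strip()
        (matched if value else missing).append(row)
    return matched, missing


# Invented sample: the second address has no UPRN.
combined = io.StringIO(
    "Address 1,postcode,uprn\n"
    "1 High St,AB1 2CD,100012345678\n"
    "2 High St,AB1 2CD,\n"
)
matched, missing = split_missing_uprn(csv.DictReader(combined))
print(len(matched), len(missing))  # 1 1
```

The `missing` rows can then be written out with `csv.DictWriter` and uploaded as the separate sheet.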

ordnance_survey sqs is => https://eu-west-2.console.aws.amazon.com/sqs/v3/home?region=eu-west-2#/queues/https%3A%2F%2Fsqs.eu-west-2.amazonaws.com%2F337213553626%2FordnanceSurvey-queue-dev

{ "s3_uri": "s3://retrofit-data-dev/ara_raw_inputs/eon/beyond_housing/Book(Sheet1).csv", "task_id": "ccdec0d1-ebf3-484f-b2ae-397200dd25da", "sub_task_id": "569d41f6-45cd-4e64-a586-eb8c2097375d" }
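If you send this message from code rather than the console, you need the plain queue URL, which is percent-encoded inside the console link above. A small sketch of recovering it with the standard library (the function name is made up for the example):

```python
from urllib.parse import unquote

CONSOLE_URL = (
    "https://eu-west-2.console.aws.amazon.com/sqs/v3/home?region=eu-west-2"
    "#/queues/https%3A%2F%2Fsqs.eu-west-2.amazonaws.com%2F337213553626%2FordnanceSurvey-queue-dev"
)


def queue_url_from_console(console_url):
    """Extract and percent-decode the queue URL that follows '#/queues/'."""
    _, _, encoded = console_url.partition("#/queues/")
    return unquote(encoded)


print(queue_url_from_console(CONSOLE_URL))
# https://sqs.eu-west-2.amazonaws.com/337213553626/ordnanceSurvey-queue-dev
```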

Outputs are at s3://retrofit-data-dev/ara_ordnance_survey_outputs//