mirror of
https://github.com/Hestia-Homes/Model.git
synced 2026-06-08 11:17:27 +00:00
61 lines
No EOL
2.5 KiB
Markdown
So you want to fetch UPRNs for an address list?

Before you run:

Step 1) Get the list and ensure the following columns exist.

I believe the column names are case-sensitive, so match them exactly:
* Address 1
* Address 2
* Address 3
* postcode

Then save it as a .csv file.

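A quick header check before uploading can catch a mis-cased column early. The following is a minimal sketch, not part of the pipeline: the function name `missing_columns` is hypothetical, and it assumes the four column names from Step 1 must match exactly, including case.

```python
import csv
import io

# Column names as listed in Step 1; assumed to be case-sensitive.
REQUIRED_COLUMNS = ["Address 1", "Address 2", "Address 3", "postcode"]


def missing_columns(csv_file, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header row."""
    reader = csv.reader(csv_file)
    header = next(reader, [])
    # Drop a leading byte-order mark, which some spreadsheet exports prepend.
    present = {name.lstrip("\ufeff") for name in header}
    return [name for name in required if name not in present]


# Made-up sample: "Postcode" is capitalised, so it does not match "postcode".
sample = io.StringIO("Address 1,Address 2,Address 3,Postcode\n1 High St,,,AB1 2CD\n")
print(missing_columns(sample))  # ['postcode']
```

For a real file, pass an open handle instead of the sample, for example `open(path, newline="", encoding="utf-8-sig")`.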
Step 2)

Before we run this, we need to upload the file to S3 and create a task and a sub-task.

* S3 upload: I recommend putting it somewhere in retrofit-data-dev. Make a note of the resulting s3_uri.

For this example I'll be using "s3://retrofit-data-dev/ara_raw_inputs/calico/Calico Homes Full list EPC Properties(Sheet2) (1) (1).csv"

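If you would rather script the upload than use the console, something like the sketch below should work. It assumes boto3 is installed and AWS credentials are configured; the helper names `s3_uri` and `upload_csv` and the example key are hypothetical.

```python
def s3_uri(bucket, key):
    """Build the s3:// URI that the later steps expect for an uploaded object."""
    return f"s3://{bucket}/{key.lstrip('/')}"


def upload_csv(local_path, bucket, key):
    """Upload a local CSV to S3 and return its s3_uri."""
    import boto3  # imported lazily so s3_uri() works without boto3 installed

    boto3.client("s3").upload_file(local_path, bucket, key)
    return s3_uri(bucket, key)


# The key below is a made-up example, not a required naming scheme.
print(s3_uri("retrofit-data-dev", "ara_raw_inputs/example_client/addresses.csv"))
```

Calling `upload_csv("addresses.csv", "retrofit-data-dev", "ara_raw_inputs/example_client/addresses.csv")` would perform the upload and hand back the URI to use in the queue message.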
Go to the Ara DB and create a new task_id, using a randomly generated UUID as the primary key.

task_id = ea615ac3-ac28-46c4-8bff-2431c5b9c13d
sub_task_id = 85a23b67-8f18-4299-9bf0-69bfb87adbc7
s3 => s3://retrofit-data-dev/ara_raw_inputs/eon/North Tyneside Council.csv

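The IDs above are version 4 (random) UUIDs. One way to generate a fresh pair is Python's standard `uuid` module. This is only a sketch for producing the values; inserting them into the Ara DB is still a separate step.

```python
import uuid

# uuid4() returns a random (version 4) UUID, the same format as the IDs above.
task_id = str(uuid.uuid4())
sub_task_id = str(uuid.uuid4())

print(f"task_id = {task_id}")
print(f"sub_task_id = {sub_task_id}")
```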
Step 3) Alright, now let's build the input message for the postcode-splitter SQS queue to get the ball rolling.
postcode-splitter-sqs => https://eu-west-2.console.aws.amazon.com/sqs/v3/home?region=eu-west-2#/queues/https%3A%2F%2Fsqs.eu-west-2.amazonaws.com%2F337213553626%2Fpostcode-splitter-queue-dev

{
  "sub_task_id": "c5afbd49-f0cd-4930-82bf-bafc5243a34a",
  "task_id": "67a4b3f0-cc7a-4e8a-b314-deb783e0eedb",
  "s3_uri": "s3://retrofit-data-dev/ara_raw_inputs/eon/pickering ferens/Pickering Ferens - SHDF W3 Post Bid Stage MDS - Template - Vr4(in).csv"
}

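The message can also be built and sent from a script instead of pasting it into the console. The sketch below assumes boto3 and AWS credentials are available. The queue URL is the one URL-encoded inside the console link above; the helper names and the example `s3_uri` are hypothetical.

```python
import json

# Queue URL decoded from the console link above (dev environment).
QUEUE_URL = "https://sqs.eu-west-2.amazonaws.com/337213553626/postcode-splitter-queue-dev"


def build_message(task_id, sub_task_id, s3_uri):
    """Serialise the three fields of the queue message into a JSON string."""
    return json.dumps({"sub_task_id": sub_task_id, "task_id": task_id, "s3_uri": s3_uri})


def send_message(body, queue_url=QUEUE_URL):
    """Send the JSON body to the queue; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so build_message() works without boto3

    sqs = boto3.client("sqs", region_name="eu-west-2")
    return sqs.send_message(QueueUrl=queue_url, MessageBody=body)


body = build_message(
    task_id="67a4b3f0-cc7a-4e8a-b314-deb783e0eedb",
    sub_task_id="c5afbd49-f0cd-4930-82bf-bafc5243a34a",
    s3_uri="s3://retrofit-data-dev/ara_raw_inputs/example_client/addresses.csv",  # made-up path
)
print(body)
```

Building the body with `json.dumps` avoids hand-escaping file names that contain spaces or brackets, which the example paths in this guide often do.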
Each batch CSV should be saved to retrofit-data-dev/ara_postcode_splitter_batches/<task-id>/<sub-task-id>/<timestamp:uuid4>.csv

The outputs of address2uprn (which is triggered automatically by the postcode splitter) will be saved to retrofit-data-dev/ara_raw_outputs/<task-id>/<sub-task-id>/<timestamp:uuid4>.csv

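To check that both stages actually wrote files, you can list the objects under each prefix. This is a rough sketch assuming boto3 and AWS credentials; `stage_prefix` and `list_csv_keys` are hypothetical helpers, and the folder names are taken from the paths above.

```python
BUCKET = "retrofit-data-dev"


def stage_prefix(stage_folder, task_id, sub_task_id):
    """Build the S3 key prefix under which one stage writes its CSV files."""
    return f"{stage_folder}/{task_id}/{sub_task_id}/"


def list_csv_keys(prefix, bucket=BUCKET):
    """Return every .csv key under the prefix; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so stage_prefix() works without boto3

    paginator = boto3.client("s3").get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            if obj["Key"].endswith(".csv"):
                keys.append(obj["Key"])
    return keys


# Substitute the real IDs for the placeholders before listing.
print(stage_prefix("ara_postcode_splitter_batches", "<task-id>", "<sub-task-id>"))
print(stage_prefix("ara_raw_outputs", "<task-id>", "<sub-task-id>"))
```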
Run the script backend/scripts/combine_address2uprn_outputs.py with <task-id>.
This will combine the output files from each address2uprn run into one big file.

Find out which rows are missing a UPRN, save them as a separate sheet, and upload it somewhere in s3://retrofit-data-dev

I uploaded the missing UPRNs here: s3://retrofit-data-dev/ara_raw_inputs/calico/missinguprn.csv

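Filtering out the rows with no UPRN can be done with the standard `csv` module. The sketch below is illustrative only: it assumes the combined file has a column named `uprn` (check the real header first), and the sample rows are made up.

```python
import csv
import io


def rows_missing_uprn(csv_file, uprn_column="uprn"):
    """Return (fieldnames, rows) for the rows whose UPRN cell is absent or blank."""
    reader = csv.DictReader(csv_file)
    missing = [row for row in reader if not (row.get(uprn_column) or "").strip()]
    return reader.fieldnames, missing


def write_rows(fieldnames, rows, out_file):
    """Write the given rows, with a header, as CSV to out_file."""
    writer = csv.DictWriter(out_file, fieldnames=fieldnames, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)


# Made-up sample data: the second row has an empty uprn cell.
combined = io.StringIO(
    "Address 1,postcode,uprn\n"
    "1 High St,AB1 2CD,100000000001\n"
    "2 High St,AB1 2CD,\n"
)
fieldnames, missing = rows_missing_uprn(combined)
out = io.StringIO()
write_rows(fieldnames, missing, out)
print(out.getvalue())
```

With real files, replace the two `io.StringIO` objects with handles opened using `newline=""`, then upload the resulting sheet to S3.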
ordnance_survey sqs => https://eu-west-2.console.aws.amazon.com/sqs/v3/home?region=eu-west-2#/queues/https%3A%2F%2Fsqs.eu-west-2.amazonaws.com%2F337213553626%2FordnanceSurvey-queue-dev

{
  "s3_uri": "s3://retrofit-data-dev/ara_raw_inputs/eon/beyond_housing/Book(Sheet1).csv",
  "task_id": "ccdec0d1-ebf3-484f-b2ae-397200dd25da",
  "sub_task_id": "569d41f6-45cd-4e64-a586-eb8c2097375d"
}

Outputs are at s3://retrofit-data-dev/ara_ordnance_survey_outputs/<task-id>/<sub-task-id>