survey-extraction/etl

ETL

Extract, transform and load data.

We get data from multiple places and merge them into one place.

Definition of multiple places:

  • Retro Team SharePoint
  • Future Osmosis SharePoint

Definition of one place:

  • into a CSV...today (03/03/2025)

  • Added the sharepointclient that Khalim made; need to prove it works
  • Read a file from what Khalim has shared

Add a local file:

  • Mount a local folder containing the SharePoint directory Khalim has shared
  • Read files and file paths
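The two steps above can be sketched with the standard library alone. This is a minimal sketch, assuming the shared SharePoint directory is already mounted as an ordinary local folder; the function name `list_files` and the example mount path are hypothetical and do not come from this repo.

```python
from pathlib import Path


def list_files(root, suffixes=None):
    """Return every file under root (recursively), sorted by path.

    suffixes is an optional collection of lowercase extensions such as
    {".pdf", ".xlsx"}; when given, only files with those extensions are kept.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        raise NotADirectoryError(f"{root_path} is not a folder")
    # rglob("*") walks the whole tree; keep files only, skip directories.
    files = [p for p in root_path.rglob("*") if p.is_file()]
    if suffixes is not None:
        # Compare extensions case-insensitively so "REPORT.PDF" still matches.
        files = [p for p in files if p.suffix.lower() in suffixes]
    return sorted(files)
```

Calling `list_files("/mnt/retro_sharepoint", {".pdf"})` (hypothetical mount point) would return the `Path` of each PDF, which gives both the file and its full path in one object.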

Once I have the SharePoint API working:

  • [] Make a validator for the Retro Team
  • [] Once validated, produce a CSV file
  • [] Show some cool productivity metric
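A validator that feeds a CSV could look roughly like the sketch below. It assumes each survey arrives as a dict and checks only the two things these notes mention (names and dates). The field names `surveyor_name` and `survey_date`, the `DD/MM/YYYY` date format, and both function names are assumptions made for illustration, not the real schema.

```python
import csv
from datetime import datetime

# Assumed date format; it matches the 03/03/2025 style used in these notes.
DATE_FORMAT = "%d/%m/%Y"
# Hypothetical column names; swap in the real ones once the schema is known.
FIELDNAMES = ["surveyor_name", "survey_date"]


def validate_row(row):
    """Return a list of problems found in one survey row (empty list = valid)."""
    problems = []
    if not (row.get("surveyor_name") or "").strip():
        problems.append("missing surveyor_name")
    try:
        datetime.strptime((row.get("survey_date") or "").strip(), DATE_FORMAT)
    except ValueError:
        problems.append("missing or malformed survey_date")
    return problems


def write_valid_rows(rows, csv_path):
    """Write valid rows to csv_path; return rejected rows with their problems."""
    rejected = []
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        # extrasaction="ignore" drops any columns not listed in FIELDNAMES.
        writer = csv.DictWriter(handle, fieldnames=FIELDNAMES, extrasaction="ignore")
        writer.writeheader()
        for row in rows:
            problems = validate_row(row)
            if problems:
                rejected.append((row, problems))
            else:
                writer.writerow(row)
    return rejected
```

Returning the rejected rows with reasons, rather than raising on the first bad row, keeps one malformed survey from blocking the whole export.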

Currently working on:

  • [On hold until I get SharePoint working] Validator

    • Check names
    • [In progress, blocked until SharePoint works; easy to add] Check it has dates
  • Useful file reader:

    • Khalim showed me a useful PDF that I should try to extract some information from
  • [] SharePoint connection: figure out how to use the SharePoint connector

  • With Khalim:

    • Check if I have access to SharePoint
    • [] Try to get his client API working and see if I can read files

MVP: a script we can run that will:

  • Go to SharePoint
  • Fetch all the data (in progress)
  • Provide some form of output that shows the number of surveys done (get this information!)
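The output step of the MVP might be as simple as the sketch below. It assumes the fetched surveys are already available as a list of dicts; how they are fetched from SharePoint is out of scope here. The `surveyor_name` key and both function names are hypothetical.

```python
from collections import Counter


def count_surveys(rows, group_by="surveyor_name"):
    """Return (total, per-group Counter) for a collection of survey rows."""
    rows = list(rows)
    # Rows with a missing or empty group value are counted under "unknown".
    per_group = Counter((row.get(group_by) or "unknown") for row in rows)
    return len(rows), per_group


def format_report(total, per_group):
    """Build a plain-text summary: the total first, then groups by count."""
    lines = [f"Surveys done: {total}"]
    # Sort by descending count, then by name so the order is stable.
    for name, count in sorted(per_group.items(), key=lambda kv: (-kv[1], kv[0])):
        lines.append(f"  {name}: {count}")
    return "\n".join(lines)
```

Grouping by a different key, for example a survey date column, would only need a different `group_by` argument.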

Flat table

Billing: Billing table, left join
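One way to read the billing note is a left join from the flat survey table onto a billing table, so that every survey is kept even when it has no billing row yet. The sketch below uses an in-memory SQLite database to show that behaviour. The table names, the columns (`survey_id`, `address`, `amount`), and the choice of the flat table as the left side are all assumptions, not the actual schema.

```python
import sqlite3


def flat_table_with_billing(surveys, billing):
    """Left join surveys onto billing; surveys without billing get amount=None.

    surveys: iterable of (survey_id, address) tuples  (hypothetical columns)
    billing: iterable of (survey_id, amount) tuples   (hypothetical columns)
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE surveys (survey_id TEXT PRIMARY KEY, address TEXT)")
        conn.execute("CREATE TABLE billing (survey_id TEXT, amount REAL)")
        conn.executemany("INSERT INTO surveys VALUES (?, ?)", surveys)
        conn.executemany("INSERT INTO billing VALUES (?, ?)", billing)
        # LEFT JOIN keeps every survey row, matched or not.
        return conn.execute(
            "SELECT s.survey_id, s.address, b.amount "
            "FROM surveys AS s "
            "LEFT JOIN billing AS b ON b.survey_id = s.survey_id "
            "ORDER BY s.survey_id"
        ).fetchall()
    finally:
        conn.close()
```

With an inner join, unbilled surveys would silently disappear from the flat table, which is presumably why the note calls for a left join.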