survey-extraction/etl
2025-03-04 12:21:44 +00:00
src/etl added pdf reader 2025-03-04 12:21:44 +00:00
tests added basic poetry set up 2025-03-03 12:00:25 +00:00
poetry.lock added pdf reader 2025-03-04 12:21:44 +00:00
pyproject.toml added pdf reader 2025-03-04 12:21:44 +00:00
README.md save work 2025-03-03 17:22:14 +00:00
run_etl.sh got a very basic hello world working 2025-03-03 12:25:52 +00:00

ETL

Extract, transform and load data

We get data from multiple places and merge it into one place.

Definition of multiple places:

  • Retro Team SharePoint
  • Future Osmosis SharePoint

Definition of one place:

  • Into a CSV, as of today (03/03/2025)

  • Added the sharepointclient that Khalim made. Need to prove it works
  • Read a file from what Khalim has shared

Add a local file:

  • Mount a local folder containing what Khalim has shared from SharePoint
  • Read files and file paths
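
A minimal sketch of the local-folder step, assuming the shared SharePoint content is already mounted or copied to a local directory. The function name `list_files` is made up for illustration and is not part of the existing code:

```python
from pathlib import Path


def list_files(root: str) -> list[Path]:
    """Return every regular file under `root`, sorted for stable output."""
    root_path = Path(root)
    if not root_path.is_dir():
        raise NotADirectoryError(f"{root} is not a directory")
    # rglob("*") walks the folder recursively; directories are filtered out.
    return sorted(p for p in root_path.rglob("*") if p.is_file())
```

Calling `list_files` on the mounted folder gives the file paths, which can then be opened one by one by whichever reader fits the file type.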

Once I have sharepoint api working:

  • [] Make validator for retro team
  • [] once validated, produce a csv file
  • [] show some cool productivity metric

Currently working on:

  • [] Validator

    • check names
    • [in progress, blocked until SharePoint works. Easy to add] check it has dates
  • [] Useful file reader:

    • [] Khalim showed me a useful PDF that I should try to extract some information from
  • With Khalim:

  • [] Check if I have access to sharepoint

  • [] Try and get his client API working and see if I can read files
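
A rough sketch of the validator checks listed above (names and dates). It assumes "check names" means each record has a non-empty name, and that dates are written day/month/year. The field names `surveyor_name` and `survey_date` are placeholders, since the real columns are not known yet:

```python
from datetime import datetime

# Hypothetical field names; replace with the real survey columns.
NAME_FIELD = "surveyor_name"
DATE_FIELD = "survey_date"
DATE_FORMAT = "%d/%m/%Y"  # assumed day/month/year, e.g. 03/03/2025


def validate_record(record: dict) -> list[str]:
    """Return the problems found in one record (an empty list means valid)."""
    problems = []

    name = (record.get(NAME_FIELD) or "").strip()
    if not name:
        problems.append("missing name")

    raw_date = (record.get(DATE_FIELD) or "").strip()
    if not raw_date:
        problems.append("missing date")
    else:
        try:
            datetime.strptime(raw_date, DATE_FORMAT)
        except ValueError:
            problems.append(f"unparseable date: {raw_date!r}")

    return problems
```

Returning a list of problems rather than a plain True/False makes it easier to report why a record was rejected.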

MVP: Script we can run that will:

  • Go to SharePoint
  • Fetch all the data
  • Provide some form of output that shows the number of surveys done
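
A possible shape for the MVP output step, assuming the fetched data ends up as a list of dicts, one per survey. The `source` key and the function names are illustrative only. The SharePoint fetch itself is left out because the client API is not confirmed yet:

```python
import csv
import io
from collections import Counter


def count_surveys(records: list[dict], key: str = "source") -> Counter:
    """Count records per value of `key` (a hypothetical column name)."""
    return Counter(r.get(key, "unknown") for r in records)


def summary_csv(counts: Counter) -> str:
    """Render the counts as CSV text with a header row and a total row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["source", "surveys_done"])
    for source, n in sorted(counts.items()):
        writer.writerow([source, n])
    writer.writerow(["total", sum(counts.values())])
    return buf.getvalue()
```

The CSV text can be printed or written to a file, which matches the "one place is a CSV" definition for now.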

Flat table

Billing: Billing table, left join
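
A small sketch of what the left join could look like in plain Python, assuming the survey rows are on the left, the billing rows are on the right, and both share a key column. Which table sits on which side, and the column names in the usage note, are guesses:

```python
def left_join(left: list[dict], right: list[dict], key: str) -> list[dict]:
    """Left-join two lists of row dicts on `key`.

    Every left row is kept. Right-hand columns are filled with None when
    there is no match. Several matching right rows give one output row each.
    A right-hand column with the same name as a left-hand one overwrites it.
    """
    right_columns = sorted({c for row in right for c in row if c != key})

    # Index the right-hand rows by key so each lookup is a dict access.
    index: dict = {}
    for row in right:
        index.setdefault(row.get(key), []).append(row)

    joined = []
    for row in left:
        matches = index.get(row.get(key)) or [{}]
        for match in matches:
            merged = dict(row)
            for col in right_columns:
                merged[col] = match.get(col)
            joined.append(merged)
    return joined
```

For example, `left_join(surveys, billing, "survey_id")` would keep every survey row and add the billing columns where a matching `survey_id` exists, giving one flat table.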