survey-extraction

mirror of https://github.com/Hestia-Homes/survey-extraction.git synced 2026-06-08 11:17:29 +00:00

Jun-te Kim 2fbc330c7c use formula to get each section correctly		2025-03-12 13:20:19 +00:00
pdfReader	use formula to get each section correctly	2025-03-12 13:20:19 +00:00
scraper	added new files to allow data extraction	2025-03-11 11:06:43 +00:00
tests	added basic poetry set up	2025-03-03 12:00:25 +00:00
utils	save scraped files	2025-03-10 18:55:17 +00:00
validator	no more etl/src	2025-03-08 06:38:48 +00:00
__init__.py	moved files	2025-03-08 06:29:07 +00:00
main.py	made into a function for two column things	2025-03-12 12:54:48 +00:00
README.md	added a scraper class to do some calculation outside of script	2025-03-05 14:00:56 +00:00

README.md

ETL

Extract, transform and load data.

We get data from multiple places and merge them into one place.

Definition of multiple places:
- Retro Team SharePoint
- Future Osmosis SharePoint

Definition of one place:
- a CSV file, as of today (03/03/2025)

Added the sharepointclient that Khalim made. Need to prove it works:
Read a file from what Khalim has shared

Add a local file:

Mount a local folder with the SharePoint directory that Khalim has shared
Read the files and their file paths
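
The local-folder step above could be sketched roughly like this. It assumes the shared SharePoint directory is already mounted or synced to a local path; the function name `list_survey_files` and the suffix list are hypothetical and not taken from this repo.

```python
from pathlib import Path


def list_survey_files(root, suffixes=(".pdf", ".xlsx", ".csv")):
    """Return sorted paths of files under `root` that have one of `suffixes`.

    `root` is assumed to be the locally mounted copy of the shared folder.
    The suffix list is a guess at the file types we care about.
    """
    root = Path(root)
    if not root.is_dir():
        raise NotADirectoryError(f"{root} is not a directory (is the share mounted?)")
    # rglob("*") walks sub-folders too; suffix matching is case-insensitive.
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() in suffixes
    )
```

Each returned item is a `Path`, so both the file path and the file contents (via `path.read_bytes()`) are available to later steps.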

Once I have sharepoint api working:

[] Make validator for retro team
[] Once validated, produce a CSV file
[] Show some cool productivity metric

Currently working on:

[On hold until I get SharePoint working] Validator:
- check names
- [in progress, blocked until SharePoint works; easy to add] check it has dates
Useful file reader:
- Khalim showed me a useful PDF that I should try to extract some information from
[] SharePoint connection: figure out how to use the SharePoint connector
With Khalim:
Check if I have access to SharePoint
[] Try to get his client API working and see if I can read files
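
A minimal sketch of the two validator checks listed above (names and dates). It assumes each survey arrives as a dict, that "check names" means a non-empty name field, and that dates are written as dd/mm/yyyy. The field names `surveyor_name` and `survey_date` are hypothetical placeholders, not the real column names.

```python
from datetime import datetime


def validate_record(record, name_field="surveyor_name", date_field="survey_date"):
    """Return a list of problems found in one record; an empty list means it passed."""
    problems = []

    # Check names: the field must exist and not be blank.
    name = (record.get(name_field) or "").strip()
    if not name:
        problems.append(f"missing {name_field}")

    # Check dates: the field must exist and parse as dd/mm/yyyy (assumed format).
    raw_date = (record.get(date_field) or "").strip()
    if not raw_date:
        problems.append(f"missing {date_field}")
    else:
        try:
            datetime.strptime(raw_date, "%d/%m/%Y")
        except ValueError:
            problems.append(f"unparseable {date_field}: {raw_date!r}")

    return problems
```

Returning a list of problems instead of raising on the first one makes it easy to report everything wrong with a file in one pass.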

MVP: a script we can run that will:
- go to SharePoint
- fetch all the data (in progress)
- provide some form of output that shows the number of surveys done (get this information!)

Flat table

Billing: Billing table, left join
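
A rough sketch of the left join noted above, using plain lists of dicts so it runs without extra dependencies. It assumes the flat survey table is the left side and billing rows are joined onto it; the key `property_id` and the other column names in the usage example are made up for illustration.

```python
def left_join(left_rows, right_rows, key):
    """Left join two lists of dicts on `key`.

    Every left row is kept. Left rows with no match get None for the
    right-hand columns. On a column-name clash the right value wins.
    """
    # Index right rows by key so each lookup is a dict access.
    index = {}
    for row in right_rows:
        index.setdefault(row[key], []).append(row)

    # All right-hand column names except the join key, in first-seen order.
    right_columns = list(
        dict.fromkeys(col for row in right_rows for col in row if col != key)
    )

    joined = []
    for left in left_rows:
        matches = index.get(left[key], [])
        if not matches:
            row = dict(left)
            for col in right_columns:
                row.setdefault(col, None)
            joined.append(row)
        for match in matches:
            joined.append({**left, **match})
    return joined
```

Usage, with hypothetical columns:

```python
surveys = [{"property_id": "P1", "surveyor": "A"}, {"property_id": "P2", "surveyor": "B"}]
billing = [{"property_id": "P1", "amount": 120}]
flat = left_join(surveys, billing, "property_id")
```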