Simulation System

Starter Readme: Steps for pipeline:

(WIP) Set up the training development environment
- Change directory to this folder (simulation_system)
- Run the following command: make env PYTHON_VERSION=3.10.12
- This will install the specified Python version using pyenv and select it as the global Python version
- It will install all training packages as specified in the training-dev.txt requirements file, including the pre-commit hooks
- Run source .training_env/bin/activate to use this environment
(WIP) Use the Makefile to start up a mock S3 service
- Running make init runs docker-compose build and docker-compose up -d, which spins up an S3 service
- Docker Compose runs in detached mode (-d), so it will not output anything to the terminal
Once the MinIO service is running, you can run the training.py file to start a model build process
- This will output a model for a given target column, with a model name composed of some of the hyperparameters
- An example of running this file is:
  - python3 training.py --train-filepath ./model_build_data/change_data/rdsap_full/train_validation_data.parquet --test-filepath ./model_build_data/change_data/rdsap_full/test_data.parquet
- Outputs of the pipeline are:
  - A model directory bucket
  - A target variable prefix (e.g. RDSAP_CHANGE or HEAT_DEMAND_CHANGE)
  - A model type prefix (e.g. autogluon, tensorflow)
  - A model name prefix (e.g. rdsap_change_medium_quality_60_TIMESTAMP)
    - This model name is made up of the target variable, quality, time spent training and a timestamp
    - Within this prefix, there are three folders:
      - model
        
        The model path that can be loaded in the codebase
      - deployment
        
        The optimised model that can be deployed (may or may not need this)
      - metrics
        
        The metrics generated from the model (may or may not need this, as they can be stored in the registry)
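The naming scheme above can be sketched as follows. This is a minimal illustration, not the code in training.py: the helper names build_model_name and build_model_prefixes, the timestamp format, and the lower-casing of the target variable are all assumptions, inferred only from the example name rdsap_change_medium_quality_60_TIMESTAMP.

```python
from datetime import datetime, timezone


def build_model_name(target, quality, time_limit, timestamp=None):
    """Join target, quality, training time and a timestamp into one name.

    Hypothetical helper: the timestamp format used here is an assumption.
    """
    if timestamp is None:
        timestamp = datetime.now(timezone.utc).strftime("%Y%m%d%H%M%S")
    return f"{target.lower()}_{quality}_{time_limit}_{timestamp}"


def build_model_prefixes(target, model_type, model_name):
    """Return the three folder prefixes that sit under the model name prefix."""
    base = f"{target}/{model_type}/{model_name}"
    return {
        folder: f"{base}/{folder}/"
        for folder in ("model", "deployment", "metrics")
    }


# A literal "TIMESTAMP" is passed so the result matches the example above.
name = build_model_name("RDSAP_CHANGE", "medium_quality", 60, timestamp="TIMESTAMP")
prefixes = build_model_prefixes("RDSAP_CHANGE", "autogluon", name)
print(name)
print(prefixes["model"])
```

Keeping the name builder separate from the prefix builder means the same name can be reused when writing to the registry, if that turns out to be needed.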
Once the model build has finished, you can run the predictions.py file to generate predictions
- By default, the prediction pipeline will select the best model based on mean absolute error from the model registry
  - This can be overridden by specifying a model_path, which will load an alternative model
- There are two ways of getting data into the pipeline:
  - Using the --data argument:
    - This is a JSON string which can be passed as python3 predictions.py --data '{"TOTAL_FLOOR_AREA": 1}'
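      - Note the single quotation marks around the whole string and the double quotation marks inside it: the outer single quotes stop the shell from stripping the inner double quotes, which JSON requires around keys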
  - Using the --data-path argument:
    - This is a filepath (in future we might want to pull data from S3 or a database instead)
- An example of running the file is:
  - python3 predictions.py --data-path ../simulation_system/model_build_data/change_data/rdsap_full/test_data.parquet
- Outputs of the pipeline are:
  - A prediction bucket
  - A target variable prefix (e.g. RDSAP_CHANGE or HEAT_DEMAND_CHANGE)
  - A UPRN prefix (e.g. 0123456789)
  - A prediction.json
  - A metadata.json
    - This contains all the metadata from the model (this can be changed if needed)
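The model selection and --data handling described above might look like the sketch below. It is not the code in predictions.py: the --model-path flag spelling, the choice to make --data and --data-path mutually exclusive, the function names, and the registry entries (including the mean_absolute_error key) are all hypothetical and only there to illustrate the behaviour.

```python
import argparse
import json


def parse_args(argv=None):
    """Parse the prediction arguments (flag layout is an assumption)."""
    parser = argparse.ArgumentParser(description="Prediction pipeline sketch")
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--data", help="JSON string of input features")
    source.add_argument("--data-path", help="Path to a file of input rows")
    parser.add_argument("--model-path", default=None,
                        help="Load this model instead of the registry's best")
    return parser.parse_args(argv)


def select_model_path(registry, model_path=None):
    """Return model_path if given, else the entry with the lowest MAE."""
    if model_path is not None:
        return model_path
    best = min(registry, key=lambda entry: entry["mean_absolute_error"])
    return best["model_path"]


# Hypothetical registry entries, for illustration only.
registry = [
    {"model_path": "model_a", "mean_absolute_error": 4.2},
    {"model_path": "model_b", "mean_absolute_error": 3.1},
]

# Equivalent to: python3 predictions.py --data '{"TOTAL_FLOOR_AREA": 1}'
# The shell removes the outer single quotes, leaving valid JSON.
args = parse_args(["--data", '{"TOTAL_FLOOR_AREA": 1}'])
record = json.loads(args.data)
chosen = select_model_path(registry, args.model_path)
print(record, chosen)
```

If the quotes are swapped (double outside, single inside), json.loads raises a JSONDecodeError, because JSON does not accept single-quoted keys.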
NOTE: If you wish to change any settings, these are currently all in the Settings.py file
- They will be separated out eventually, but for now this file keeps track of anything that we might want to respecify.
  - E.g. the model hyperparameters are in here but will move into a separate configuration file

TODO:

Structure/MLOps:
- Add configuration files (dev, staging, prod), including hyperparameters
- Add pre-commit hooks (linters, branch names, etc.)
- Sphinx documentation
- Sort out local mock up services
- Sort out Model Registry
- Sort out Data version control
- pre-commit hooks:
  - The types of hooks that we want (safety, bandit, iso8 etc)
  - The customisations we require
Data Science:
- Implement a metrics class to hold all metrics
- Rebuild metrics script (could be a one-off, but good to have)
- Determine metrics
- Implement and test a custom model (TensorFlow Decision Trees etc.)
Orchestration:
- Lambda handler for the pipeline
