Add extra steps to readme and makefile

Michael Duong 2023-08-29 18:16:46 +01:00
parent b2f606055f
commit 469938bb25
2 changed files with 21 additions and 5 deletions


@@ -13,8 +13,13 @@ env:
	. .training_env/bin/activate
	pip install --upgrade pip
	pip install -r requirements/training/training-dev.txt && pre-commit install
	echo " --- TO ACTIVATE THE ENVIRONMENT --- "
	echo "Run source .training_env/bin/activate to activate the virtual environment"

.PHONY: check-all
check-all:
	pre-commit run -a

.PHONY: build
build:
	docker-compose build


@@ -3,6 +3,13 @@
Starter Readme:
Steps for pipeline:
- (WIP) Set up the training development environment
- Change directory to this folder (simulation_system)
- Run the following command: `make env PYTHON_VERSION=3.10.12`
- This will install the specified Python version using `pyenv` and select it as the global Python version
- It will install all training packages specified in the `training-dev.txt` requirements file, including the pre-commit hooks
- Run `source .training_env/bin/activate` to use this environment
- (WIP) Use the Makefile to start up the mock S3 service
- Running `make init` runs `docker-compose build` and `docker-compose up -d`, which spins up an S3 service
- Docker Compose runs in detached mode (`-d`), so it will not output anything to the terminal
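
To confirm that the activation step above took effect, a small check can be run from Python. This is a minimal sketch and not part of the repository; the function name `in_virtualenv` is hypothetical. It assumes the environment was created with `venv` or a recent `virtualenv`, where `sys.prefix` differs from `sys.base_prefix` while the environment is active.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a virtual environment."""
    # Inside a virtual environment, sys.prefix points at the environment
    # directory, while sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    if in_virtualenv():
        print(f"Virtual environment active: {sys.prefix}")
    else:
        print("No virtual environment active; run `source .training_env/bin/activate` first.")
```
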
@@ -27,7 +34,7 @@ Steps for pipeline:
- Once the model build is finished, you can run the `prediction.py` file to generate predictions
- By default, the prediction pipeline will select the best model based on **mean absolute error** from the model registry
- This can be overridden by specifying a model_path, which loads an alternative model
- There are two ways of getting data into the pipeline:
- Using the `--data` argument:
- This is a JSON string which can be passed as `python3 predictions.py --data '{"TOTAL_FLOOR_AREA": 1}'`
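
The behaviour described above (a JSON string passed via `--data`, and a default model chosen by lowest mean absolute error unless a model path is given) could be sketched as follows. This is an illustration under stated assumptions, not the code in `prediction.py`: the `--model_path` flag spelling, the function names, and the registry layout (a list of dicts with `path` and `mean_absolute_error` keys) are all hypothetical.

```python
import argparse
import json
from typing import Optional


def parse_args(argv: Optional[list] = None) -> argparse.Namespace:
    """Parse the command line; --data is decoded from a JSON string."""
    parser = argparse.ArgumentParser(description="Prediction pipeline (sketch)")
    # json.loads raises a ValueError subclass on bad input, which argparse
    # turns into a normal usage error.
    parser.add_argument("--data", type=json.loads, default=None,
                        help="JSON string of input features")
    # Hypothetical flag name; the README only mentions "a model_path".
    parser.add_argument("--model_path", default=None,
                        help="Optional path to an alternative model")
    return parser.parse_args(argv)


def select_model(registry: list, model_path: Optional[str] = None) -> str:
    """Return model_path if given, else the registry entry with the lowest MAE."""
    if model_path is not None:
        return model_path
    best = min(registry, key=lambda entry: entry["mean_absolute_error"])
    return best["path"]


if __name__ == "__main__":
    # Hypothetical registry contents, for illustration only.
    registry = [
        {"path": "models/a.pkl", "mean_absolute_error": 2.0},
        {"path": "models/b.pkl", "mean_absolute_error": 1.0},
    ]
    args = parse_args(["--data", '{"TOTAL_FLOOR_AREA": 1}'])
    print(args.data)
    print(select_model(registry, args.model_path))
```

Using `type=json.loads` keeps the decoding next to the argument definition, so malformed JSON is reported as a usage error before any model is loaded.
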
@@ -55,12 +62,16 @@ Steps for pipeline:
- Add pre-commit hooks (linters, branch names, etc)
- Sphinx documentation
- Sort out local mock up services
- Sort out Model Registry
- Sort out Data version control
- pre-commit hooks:
- The types of hooks that we want (safety, bandit, iso8 etc)
- The customisations we require
- Add sphinx documentation
- Data Science:
- Implement a metrics class, to hold all metrics
- Rebuild metrics script (could be a one-off, but good to have)
- Determine metrics
- Implement and test custom model (TensorFlow Decision Trees etc)
- Orchestration:
- Lambda handler for the pipeline