mirror of
https://github.com/Hestia-Homes/ML.git
synced 2026-06-08 11:17:25 +00:00
# ML Toolkit

A reusable ML toolkit with two parts:

- ML pipeline:
    - A generic pipeline that has data version control, experiment tracking and a model registry

- ML monitoring:
    - A bolt-on service that can implement model monitoring

There are multiple protected branches which adapt the generic pipeline to produce different models:
- sap_change-**
- heat_change-**
- carbon_change-**

These branches differ in the configuration files that define the data used and the outputs of the ML pipeline.
- Each branch can have its own additional logic, but the pipeline itself is the same.
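To illustrate the idea of one shared pipeline driven by per-branch configuration, here is a minimal Python sketch. Every name in it (PipelineConfig, run_pipeline, the field names and the example column names) is hypothetical and not taken from this repository; the real configuration files may be structured quite differently.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Row = Dict[str, float]


@dataclass(frozen=True)
class PipelineConfig:
    """Hypothetical per-branch configuration (illustrative fields only)."""
    model_name: str              # which model this branch produces
    feature_columns: List[str]   # which input columns the model uses
    target_column: str           # which output column the model predicts


def run_pipeline(
    config: PipelineConfig,
    rows: List[Row],
    extra_step: Optional[Callable[[Row], Row]] = None,
) -> Dict[str, object]:
    """Generic pipeline: the steps are the same for every branch.

    Only `config` and the optional branch-specific `extra_step` change.
    """
    if extra_step is not None:
        # Branch-specific additional logic, applied before the shared steps.
        rows = [extra_step(row) for row in rows]
    features = [[row[col] for col in config.feature_columns] for row in rows]
    targets = [row[config.target_column] for row in rows]
    return {"model": config.model_name, "features": features, "targets": targets}


# Two hypothetical branch configurations sharing the same pipeline code.
sap_config = PipelineConfig("sap_change", ["x1", "x2"], "y_sap")
heat_config = PipelineConfig("heat_change", ["x1"], "y_heat")
```

In this sketch, switching from one model to another means passing a different config object (and optionally a different extra_step), while run_pipeline stays untouched.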
# Deployment

Scripts associated with deployment can be found in the deployment/ folder.

Deployment is automated via GitHub Actions: a deployment is triggered by a push to one of the
protected branches. The branch suffix, either dev or prod, identifies the target environment.

The GitHub Actions workflow builds a Docker image, pushes it to ECR and then deploys a Lambda
which produces predictions for the relevant model.

For this to work, some key environment variables need to be added to GitHub
secrets. Each model and protected branch has its own set of secrets, which allows for flexibility
between different pipelines.

For example, for the branch sap_change-dev the prefix is SAP_CHANGE_DEV, and the following secrets are required:
- {prefix}_ECR_URI: the URI of the ECR repository to push to. For example, for the sap change model this is the lambda-sap-prediction-dev repository.
- {prefix}_DOMAIN_NAME: the custom domain name. This is likely to be the same across the different models, but is still included in the secrets for flexibility.
- {prefix}_DATA_BUCKET: the name of the S3 bucket where the data to be scored by the model is stored.
- {prefix}_MODEL_BUCKET: the name of the S3 bucket where the model is stored.
- {prefix}_PREDICTIONS_BUCKET: the name of the S3 bucket where the predictions are stored.
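The naming convention above can be sketched in Python as a quick way to list the secrets a given branch needs. The function name parse_branch is hypothetical, and the prefix rule (upper-case the branch name and replace the hyphen with an underscore) is an assumption inferred from the single sap_change-dev example; the actual workflow may derive the prefix differently.

```python
from typing import Dict, List

# Secret suffixes as listed in this README.
SECRET_SUFFIXES: List[str] = [
    "ECR_URI",
    "DOMAIN_NAME",
    "DATA_BUCKET",
    "MODEL_BUCKET",
    "PREDICTIONS_BUCKET",
]

# Target environments named in this README.
ENVIRONMENTS = ("dev", "prod")


def parse_branch(branch: str) -> Dict[str, object]:
    """Derive model, environment, prefix and secret names from a branch name.

    Assumes the branch looks like "<model>-<env>", e.g. "sap_change-dev".
    """
    # Split on the last hyphen so the model part keeps its underscores.
    model, sep, environment = branch.rpartition("-")
    if not sep or not model or environment not in ENVIRONMENTS:
        raise ValueError(f"expected '<model>-dev' or '<model>-prod', got {branch!r}")
    # Assumed rule: upper-case the branch name and replace "-" with "_".
    prefix = branch.upper().replace("-", "_")
    return {
        "model": model,
        "environment": environment,
        "prefix": prefix,
        "secrets": [f"{prefix}_{suffix}" for suffix in SECRET_SUFFIXES],
    }
```

For the branch sap_change-dev, parse_branch returns the prefix SAP_CHANGE_DEV and five secret names, starting with SAP_CHANGE_DEV_ECR_URI. A branch without a dev or prod suffix raises a ValueError.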