# plugins-integrations

Hugo Evers

12/11/2023, 3:44 PM
Question about `kedro-mlflow`: I run my training pipeline on AWS Batch, where every node is executed separately as a run of a Docker container. I've managed to enforce that these are all part of the same MLflow run by using the `TemplatedConfigLoader` and a parameter in `mlflow.yml`; the MLflow run ID is passed as a runtime param to the container command. To illustrate, the container command is `kedro run --node=…. --params:mlflow_run_id:jhdsfkjhsdkfjhskjfh`. This works fine on its own; however, every run of a node overwrites the contents of the MLflow experiment instead of appending to it, so only the information from the last node run is stored. If I train two models in one run (one is pretraining, the second is downstream finetuning for a specific client's task), I would still like to be able to access the pretraining artefacts and metrics, but those are being overwritten. Is there some setting in MLflow which allows me to prevent overwriting the contents of MLflow runs, and append to them instead?
Basically, every node execution is a nested run, where the parent process is the `AWSBatchRunner`. So maybe that is a way to ensure appending?