# plugins-integrations
h
Another `kedro-mlflow` question: I train multiple Transformer models in a single pipeline, and both `kedro_mlflow` and the Transformers `Trainer` are logging to MLflow. When I train a single Transformer this is no problem, but when I train multiple Transformers in the same pipeline with different settings, it becomes an issue and I get:
```
RestException: INVALID_PARAMETER_VALUE: Changing param values is not allowed. Params were already logged='[{'key': 'num_train_epochs', 'old_value': '3', 'new_value': '10'},
```
This is an example of where the issue pops up when I train a Transformer on one dataset and then tune it further on another. I can imagine this can be solved by making `kedro_mlflow` and `Trainer` aware of the fact that these are actually different models (since they live inside their own namespace), by prepending some prefix to the param names. But at this point it becomes quite difficult to debug (since the logging code is nested very deep) whether this should be implemented for the `Trainer`, `kedro-mlflow`, or both. Any thoughts?
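(For context: the error itself comes from MLflow's rule that params are immutable within a run, independent of kedro-mlflow or the Trainer. A minimal sketch of the constraint being hit, with illustrative values:)
```python
import mlflow

# MLflow treats params as immutable within a run: logging the same key
# again with a different value is rejected by the tracking backend.
with mlflow.start_run():
    mlflow.log_param("num_train_epochs", 3)   # first training node logs 3
    mlflow.log_param("num_train_epochs", 10)  # second node, same run -> INVALID_PARAMETER_VALUE
```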
y
The point is that you have two parameters with the exact same name, so mlflow (and not kedro-mlflow) raises an error. If I understand correctly, this is because you have duplicated a pipeline with a different namespace, and kedro-mlflow should log the parameter as "namespace.num_train_epochs" instead of "num_train_epochs", is that correct? Could you provide some code to show what your pipeline and parameters.yaml look like?
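(A setup of the shape being described might look roughly like the sketch below; module, node, and parameter names are illustrative, not the actual project code.)
```python
# pipeline_registry.py -- illustrative sketch, not the poster's project
from kedro.pipeline import node, pipeline

def train_transformer(trainer_params):
    """Builds a transformers Trainer from `trainer_params` and trains it."""
    ...

base = pipeline(
    [node(train_transformer, inputs="params:trainer", outputs="model", name="train")]
)

def register_pipelines():
    # The same pipeline is instantiated twice under different namespaces,
    # e.g. with pretrain.trainer.num_train_epochs: 3 and
    # finetune.trainer.num_train_epochs: 10 in parameters.yaml.
    # Both instances end up logging `num_train_epochs` into the same MLflow run.
    return {
        "__default__": pipeline(base, namespace="pretrain")
        + pipeline(base, namespace="finetune"),
    }
```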
h
Yeah, the issue is with the MLflow logger in the Transformers `Trainer`, so basically I need to find a way to make that logger aware of my Kedro namespace.
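(One pragmatic workaround, though not the namespace-aware fix being asked for, is to let kedro-mlflow be the only thing that logs params by switching off the Trainer's own MLflow reporting. A sketch assuming a recent transformers version, with `model` and `train_ds` standing in for whatever the node builds:)
```python
from transformers import Trainer, TrainingArguments
from transformers.integrations import MLflowCallback

# `model` and `train_ds` stand in for whatever this pipeline node constructs.
args = TrainingArguments(
    output_dir="outputs",
    num_train_epochs=10,
    report_to=[],  # disable the Trainer's built-in MLflow/W&B/etc. reporting
)
trainer = Trainer(model=model, args=args, train_dataset=train_ds)

# Alternatively, if the callback is already attached, drop it explicitly:
trainer.remove_callback(MLflowCallback)
```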
y
OK, it really sounds like a bug. Can you provide a minimal example so that I can reproduce it easily?
It seems to be linked only to parameters and namespacing, not the `Trainer`, can you confirm?