question about kedro mlflow regarding the use of `MlflowMode Kedro #plugins-integrations


Hugo Evers

11/27/2023, 2:15 PM

question about kedro-mlflow, regarding the use of `MlflowModelRegistryDataSet` in the kedro-mlflow integration for logging models to MLflow's model registry, as documented in the kedro-mlflow Python Objects section. Initially, I followed the documentation for using `MlflowModelLoggerDataSet` in the `catalog.yml` file, which I implemented successfully. However, I ran into confusion with `MlflowModelRegistryDataSet`. My initial attempt was based on the following configuration:


my_transformer_model:
  type: kedro_mlflow.io.models.MlflowModelRegistryDataSet
  flavor: mlflow.transformers
  model_name: my_transformer_model_name
  stage_or_version: staging

When trying to save a model using `catalog.save("my_transformer_model", model)`, I received a `DatasetError` indicating that the 'save' method is not implemented for `MlflowModelRegistryDataSet`. The documentation lists the parameters for this dataset but lacks a clear example of how to use it to save and register a model to MLflow. Moving forward, I found a working solution for logging the transformer model with the YAML API:


my_transformer_model:
    type: kedro_mlflow.io.models.MlflowModelLoggerDataSet
    flavor: mlflow.transformers
    save_args:
        registered_model_name: "my_transformer_model_name"

This allowed me to save the model to MLflow and load it back successfully. However, this usage is not documented as such. For model loading, I could indeed use the initial catalog entry to load specific versions directly. Yet I still have two unresolved questions:

Model staging/versioning: how do I stage or version the model directly through the API, instead of using the MLflow UI? That is, using `MlflowModelLoggerDataSet` to save, but also specifying a version/stage.

Metrics: how can I view the metrics associated with the model's training run in the MLflow model UI, so that I can efficiently promote the best model to staging?

I think that including practical examples in the official documentation would significantly enhance the user experience.
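For the staging and metrics questions, one possible approach outside the UI is to go through MLflow's own client rather than a Kedro dataset. The sketch below is not taken from the kedro-mlflow documentation: it assumes the model was registered by a tracked run (so each version carries a `run_id`), the helper name `promote_best_version` is hypothetical, and the `eval_loss` metric in the usage comment is a placeholder. The client is passed in as an argument, so the function itself does not import `mlflow`.

```python
def promote_best_version(client, model_name, metric, stage="Staging", higher_is_better=True):
    """Move the registered version whose training run scored best on `metric` to `stage`.

    `client` is expected to behave like `mlflow.tracking.MlflowClient`.
    Returns a (version, metric_value) tuple for the promoted version.
    """
    # Every registered version of the model; each one records the run that logged it.
    versions = client.search_model_versions(f"name='{model_name}'")

    scored = []
    for mv in versions:
        if not mv.run_id:
            continue  # version was not created from a tracked run
        run_metrics = client.get_run(mv.run_id).data.metrics
        if metric in run_metrics:
            scored.append((run_metrics[metric], mv.version))

    if not scored:
        raise ValueError(f"no version of {model_name!r} has a run with metric {metric!r}")

    pick = max if higher_is_better else min
    best_value, best_version = pick(scored, key=lambda pair: pair[0])

    # Stage transition through the API instead of the UI.
    client.transition_model_version_stage(name=model_name, version=best_version, stage=stage)
    return best_version, best_value


# Usage against a real tracking server (requires `mlflow`; metric name is a placeholder):
#   from mlflow.tracking import MlflowClient
#   promote_best_version(MlflowClient(), "my_transformer_model_name",
#                        metric="eval_loss", higher_is_better=False)
```

Reading the metrics from the run linked to each version is what stands in for "seeing metrics in the model UI" here. Note that recent MLflow releases appear to be moving from stages toward model version aliases, so it is worth checking which of the two your MLflow version favours.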

Yolan Honoré-Rougé

11/28/2023, 8:04 AM

Hello, I was indeed going to suggest going with `MlflowModelLoggerDataSet`, but I understand it is confusing. The rationale here is that the model registry dataset aims at transitioning an existing model to the registry. Someone already mentioned that I should document it better, or even consider merging the two datasets.
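Putting the two entries from earlier in the thread side by side, the split between the datasets could be sketched in `catalog.yml` roughly as follows. This only combines what was reported above as working. The entry names `my_transformer_model_logger` and `my_transformer_model_registry` are hypothetical, chosen so that both entries can coexist in one catalog.

```yaml
# Save side: logs the model in the current run and registers it under the given name.
my_transformer_model_logger:
  type: kedro_mlflow.io.models.MlflowModelLoggerDataSet
  flavor: mlflow.transformers
  save_args:
    registered_model_name: "my_transformer_model_name"

# Load side: fetches an already registered model by stage or version.
# Calling save on this entry raised the DatasetError described above.
my_transformer_model_registry:
  type: kedro_mlflow.io.models.MlflowModelRegistryDataSet
  flavor: mlflow.transformers
  model_name: my_transformer_model_name
  stage_or_version: staging
```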

Yolan Honoré-Rougé

11/28/2023, 8:05 AM

Would you mind opening an issue in the kedro-mlflow repo?

Hugo Evers

11/28/2023, 2:39 PM

Yes, I'll open an issue, thanks!
