Hugo Evers
11/27/2023, 2:15 PM
I have a question about MlflowModelRegistryDataSet in the Kedro-MLflow integration for logging models to MLflow’s model registry, as documented in the Kedro-MLflow Python Objects section.
Initially, I followed the documentation for using MlflowModelLoggerDataSet in the catalog.yml file, which I implemented successfully. However, I encountered confusion with MlflowModelRegistryDataSet. My initial attempt was based on the following configuration:
my_transformer_model:
  type: kedro_mlflow.io.models.MlflowModelRegistryDataSet
  flavor: mlflow.transformers
  model_name: my_transformer_model_name
  stage_or_version: staging
When trying to save a model using catalog.save("my_transformer_model", model), I received a DatasetError indicating that the ‘save’ method is not implemented for MlflowModelRegistryDataSet. The documentation provides parameters for this dataset but lacks a clear example for its correct usage in saving and registering a model to MLflow.
Moving forward, I found a working solution for logging the transformer model with the YAML API:
my_transformer_model:
  type: kedro_mlflow.io.models.MlflowModelLoggerDataSet
  flavor: mlflow.transformers
  save_args:
    registered_model_name: "my_transformer_model_name"
This allowed me to save the model to MLflow and load it back successfully. However, this is not documented as such. For model loading, I could indeed use the initial catalog entry to load specific versions directly. Yet I still have unresolved questions regarding model staging/versioning: how can I stage or version the model directly through the API instead of using the MLflow UI? In other words, I would like to use MlflowModelLoggerDataSet to save, but also specify a version/stage.
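On the staging question, one possible route outside the catalog is MLflow's own client API. The sketch below is not taken from the kedro-mlflow documentation: it assumes the model has already been registered (for example by the MlflowModelLoggerDataSet entry above), and the helper name promote_latest_version is hypothetical. It relies on MlflowClient.search_model_versions and MlflowClient.transition_model_version_stage; note that newer MLflow releases deprecate stages in favour of aliases, so this may need adapting.

```python
def promote_latest_version(client, model_name, stage="Staging"):
    """Move the newest registered version of `model_name` to `stage`.

    `client` is expected to behave like mlflow.tracking.MlflowClient.
    `promote_latest_version` is a hypothetical helper, not a kedro-mlflow API.
    """
    versions = client.search_model_versions(f"name='{model_name}'")
    if not versions:
        raise ValueError(f"No registered versions found for {model_name!r}")
    # Version numbers may come back as strings, so compare them as integers.
    latest = max(versions, key=lambda mv: int(mv.version))
    client.transition_model_version_stage(
        name=model_name, version=latest.version, stage=stage
    )
    return latest.version


# Usage (assumes mlflow is installed and a tracking URI is configured):
# from mlflow.tracking import MlflowClient
# promote_latest_version(MlflowClient(), "my_transformer_model_name")
```

This would run after the Kedro pipeline has saved the model, for instance in a small script or a hook; whether that fits a given project layout is an open design choice.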
In addition, I was wondering how to view the metrics associated with the model training run in the MLflow model UI, so that I can efficiently promote the best model to staging.
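As an alternative to comparing metrics in the UI, each registered model version records the run_id of the run that produced it, so the run's metrics can be looked up programmatically. A minimal sketch under stated assumptions: the helper name best_version_by_metric is hypothetical, the metric name in the usage comment is a placeholder, and `client` is expected to behave like mlflow.tracking.MlflowClient (using its search_model_versions and get_run methods).

```python
def best_version_by_metric(client, model_name, metric, higher_is_better=True):
    """Return (version, value) for the registered version whose training run
    logged the best value of `metric`.

    Hypothetical helper; `client` should behave like MlflowClient.
    """
    scored = []
    for mv in client.search_model_versions(f"name='{model_name}'"):
        if not mv.run_id:
            # Versions registered without a source run have no metrics to compare.
            continue
        metrics = client.get_run(mv.run_id).data.metrics
        if metric in metrics:
            scored.append((mv.version, metrics[metric]))
    if not scored:
        raise ValueError(f"No version of {model_name!r} has metric {metric!r}")
    pick = max if higher_is_better else min
    return pick(scored, key=lambda pair: pair[1])


# Usage (assumes mlflow is installed; "eval_loss" is a placeholder metric name):
# from mlflow.tracking import MlflowClient
# version, value = best_version_by_metric(
#     MlflowClient(), "my_transformer_model_name", "eval_loss",
#     higher_is_better=False,
# )
```

The returned version number could then be passed to a stage transition call to promote that version.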
I can imagine that including practical examples in the official documentation would significantly enhance the user experience.
Yolan Honoré-Rougé
11/28/2023, 8:04 AM
MlflowModelLoggerDataSet, but I understand it is confusing. The rationale here is that the model registry dataset aims at transitioning an existing model to the registry. Someone already mentioned that I should document it better, or even consider merging the two datasets.
Yolan Honoré-Rougé
11/28/2023, 8:05 AM
Hugo Evers
11/28/2023, 2:39 PM