Following up on my previous question talking with Rashida I Kedro #questions

Thiago Valejo

08/12/2025, 3:14 PM

Following up on my previous question: after talking with Rashida, I found that the problem is a little different, so I'm reposting. I’m failing to load the champion version of a wrapper SklearnPipeline model registered in MLflow. I want to save many experiments to MLflow and be able to load the champion version for other downstream pipelines. My catalog.yml looks like this:

model:
 type: kedro_mlflow.io.models.MlflowModelTrackingDataset
 flavor: mlflow.sklearn
 save_args:
  registered_model_name: model

model_loader:
 type: kedro_mlflow.io.models.MlflowModelRegistryDataset
 flavor: mlflow.sklearn
 model_name: "model"
 alias: "champion"

If I try to load the model in a new Kedro session, it demands a run_id. If I try to use the model_loader, it complains that the model (the wrapper SklearnPipeline object) doesn’t have a metadata attribute, giving this error message:

│ /opt/anaconda3/envs/topazDS_2/lib/python3.11/site-packages/kedro_mlflow/io/models/mlflow_model_r │
│ egistry_dataset.py:98 in _load                                                                   │
│                                                                                                  │
│    95 │   │   # because the same run can be registered under several different names             │
│    96 │   │   #  in the registry. See https://github.com/Galileo-Galilei/kedro-mlflow/issues/5   │
│    97 │   │   import pdb; pdb.set_trace()                                                        │
│ ❱  98 │   │   self._logger.info(f"Loading model from run_id='{model.metadata.run_id}'")          │
│    99 │   │   return model                                                                       │
│   100 │                                                                                          │
│   101 │   def _save(self, model: Any) -> None:                                                   │
╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
AttributeError: 'SklearnPipeline' object has no attribute 'metadata'

DatasetError: Failed while loading data from dataset 
kedro_mlflow.io.models.mlflow_model_registry_dataset.MlflowModelRegistryDataset(model_uri='models:/mill1_west_no_we
nco_st_model@champion', model_name='mill1_west_no_wenco_st_model', alias='champion', flavor='mlflow.sklearn', 
pyfunc_workflow='python_model').
'SklearnPipeline' object has no attribute 'metadata'

I think the MlflowModelRegistryDataset class wasn't expecting the model to be a sklearn object. There is probably a difference between how I'm saving the model (MlflowModelTrackingDataset) and how I'm loading it (MlflowModelRegistryDataset). How could I load the champion model? @Rashida Kanchwala @Ravi Kumar Pilla

👀 1
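
Note for readers debugging the same error: the DatasetError above shows that the dataset resolves to a registry URI of the form models:/<name>@<alias>. A minimal sketch of loading that URI with MLflow directly, bypassing the kedro-mlflow dataset, might look like the following. The helper names registry_uri and load_champion are made up for illustration, and the sketch assumes an MLflow version that supports model aliases and a tracking URI that is already configured.

```python
def registry_uri(model_name: str, alias: str) -> str:
    """Build a registry URI of the form shown in the DatasetError message."""
    return f"models:/{model_name}@{alias}"


def load_champion(model_name: str, alias: str = "champion"):
    """Load the aliased model version with the sklearn flavor (hypothetical helper)."""
    # Imported lazily so the URI helper stays usable without MLflow installed.
    import mlflow.sklearn

    return mlflow.sklearn.load_model(registry_uri(model_name, alias))


print(registry_uri("model", "champion"))
```

Calling load_champion("model") would return whatever object the sklearn flavor stored, without touching the logging line that fails in the dataset.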

Ravi Kumar Pilla

08/12/2025, 3:18 PM

As discussed, I will look at this and get back to you. In the meantime, @Yolan Honoré-Rougé if you have any suggestions, please let us know. Thank you

Ravi Kumar Pilla

08/12/2025, 4:43 PM

Hi @Thiago Valejo, the whole issue comes from the saved model not having a metadata attribute. If I comment out the logger call in MlflowModelRegistryDataset, it works fine. This call was introduced in the 0.13.3 release. I am not sure if there is a schema for a model saved using MlflowModelTrackingDataset.

self._logger.info(f"Loading model from run_id='{model.metadata.run_id}'")

👍 1

Ravi Kumar Pilla

08/12/2025, 4:45 PM

@Yolan Honoré-Rougé thank you for responding. Is there a schema expected from the model? I think @Thiago Valejo is using a custom model.

Yolan Honoré-Rougé

08/12/2025, 4:47 PM

Can you open an issue on GitHub to keep track? I'll publish a fix in 2 weeks, but if someone can open a PR I can review and release. The simplest short term solution is to comment out logging, but I'll dig deeper when I have time

Ravi Kumar Pilla

08/12/2025, 4:48 PM

Okay, I can open an issue and also a short-term fix PR that logs the run_id conditionally, if there is no schema constraint for the model.

👍 1
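
The conditional logging idea mentioned above could be sketched roughly as follows. This is not the code from the actual PR, only an illustration of guarding the attribute access. The function names and the stand-in classes are hypothetical.

```python
import logging
from typing import Any, Optional

logger = logging.getLogger(__name__)


def run_id_of(model: Any) -> Optional[str]:
    """Return model.metadata.run_id when both attributes exist, else None."""
    metadata = getattr(model, "metadata", None)
    return getattr(metadata, "run_id", None)


def log_loaded_model(model: Any) -> Any:
    """Log the run_id only when the loaded object exposes one."""
    run_id = run_id_of(model)
    if run_id is not None:
        logger.info("Loading model from run_id='%s'", run_id)
    else:
        logger.info("Loaded model has no metadata.run_id; skipping run_id logging")
    return model


# Stand-ins for illustration only: one object without metadata, one with it.
class PlainPipeline:
    pass


class Metadata:
    run_id = "abc123"


class ModelWithMetadata:
    metadata = Metadata()
```

With this guard, an object like the SklearnPipeline wrapper is returned unchanged instead of raising AttributeError, while models that do carry metadata still get their run_id logged.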

Ravi Kumar Pilla

08/12/2025, 4:52 PM

Hi @Thiago Valejo, which Python version are you using?

Thiago Valejo

08/12/2025, 4:53 PM

3.11

Ravi Kumar Pilla

08/12/2025, 4:53 PM

From the logs it looks like py311. Can you install kedro-mlflow 0.13.2 as a short-term workaround and try testing?
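
To check whether an environment is on a release from before the logging call was added, something like the sketch below could be used. It relies only on the statement in this thread that the call first appeared in 0.13.3. The function names are hypothetical, and the version parsing is deliberately crude (it keeps the first three numeric components only).

```python
import re
from importlib import metadata


def parse_release(version: str) -> tuple:
    """Turn '0.13.2' into (0, 13, 2), keeping at most three numeric parts."""
    return tuple(int(n) for n in re.findall(r"\d+", version)[:3])


def predates_run_id_logging(version: str) -> bool:
    """True for releases before 0.13.3, where the thread says the log call was added."""
    return parse_release(version) < (0, 13, 3)


try:
    installed = metadata.version("kedro-mlflow")
    print(installed, predates_run_id_logging(installed))
except metadata.PackageNotFoundError:
    print("kedro-mlflow is not installed in this environment")
```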

Ravi Kumar Pilla

08/12/2025, 7:46 PM

For tracking: PR https://github.com/Galileo-Galilei/kedro-mlflow/pull/671, issue https://github.com/Galileo-Galilei/kedro-mlflow/issues/670. @Yolan Honoré-Rougé, whenever you have time. Thank you.

🥳 1

Yolan Honoré-Rougé

08/12/2025, 7:56 PM

Thank you very much for the PR @Ravi Kumar Pilla . Can you upgrade to kedro 1.0 @Thiago Valejo? The fix will only be available for kedro-mlflow>=1.0.0 which is only compatible with kedro>=1.0 unfortunately :/

extreme teamwork 1

Ravi Kumar Pilla

08/12/2025, 7:57 PM

He was on Kedro 1.0 when we spoke today.

👍 1

Yolan Honoré-Rougé

08/13/2025, 8:34 PM

The fix is on pypi, you can upgrade: https://pypi.org/project/kedro-mlflow/

🥳 1

👍 1

Ravi Kumar Pilla

08/13/2025, 9:16 PM