# questions
m
Hello, I am using kedro-mlflow and trying to namespace a pipeline at the same time to do a bunch of runs together. When trying to save a metric, if I use the namespace's names explicitly in the catalog, it works. i.e.:
model_1.mae:
  type: kedro_mlflow.io.metrics.MlflowMetricDataset
model_2.mae:
  type: kedro_mlflow.io.metrics.MlflowMetricDataset
If, however, I try to template the name in the catalog, it fails:
"{model_name}.mae":
  type: kedro_mlflow.io.metrics.MlflowMetricDataset
I get the error message: DatasetError: Failed while saving data to dataset MlflowMetricDataset(run_id=...). Invalid value null for parameter 'name' supplied: Metric name cannot be None. A key name must be provided. Do I just have to avoid templating in the catalog when it comes to mlflow-related entries?
d
Looks like a problem with dataset factory resolution. I will have a look. If you have time, could you please test two options: 1)
"{model_name}.mae":
  type: kedro_mlflow.io.metrics.MlflowMetricDataset
  name: "{model_name}.mae"
2) Try with the latest version, kedro 1.0.0rc1.
m
Hi Dmitry, thanks for looking into this. I did test 1. I got the following error: DatasetError: MlflowMetricDataset.__init__() got an unexpected keyword argument 'name'. Dataset 'test.ml_model_mae' must only contain arguments valid for the constructor of 'kedro_mlflow.io.metrics.mlflow_metric_dataset.MlflowMetricDataset'. I'll let you know when I've done test 2.
I set up a new project with kedro 1.0.0rc1, but I get an error when I try "pip install kedro-mlflow":
kedro-mlflow 0.9.0 depends on kedro<0.19.0 and >=0.18.0
To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict
ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts
So it looks like it's not compatible at the moment.
d
Yes, that's right. Thanks for trying, and sorry for the trouble. I'm currently looking into it.
e
Hi @minesh, which kedro version did you use?
m
I was using 0.19.12.
d
I thought we probably didn't support using namespaces with dataset factories, but they work well together. The issue seems to be related to kedro-mlflow. I'll keep looking into it.
y
Sorry for the bug. Does it happen with other MLflow datasets like MlflowArtifactDataset or MlflowModelDataset? I suspect the cause is the special treatment for metric datasets: I need to use the name of the dataset as the metric name, so an after_catalog_created hook modifies the dataset in place, and this likely happens before the factories are resolved. I need to find a better solution.
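To make the suspected mechanism concrete, here is a minimal toy sketch in plain Python. It is not the real kedro or kedro-mlflow code: `ToyMetricDataset`, `ToyCatalog`, and the toy `after_catalog_created` function are hypothetical stand-ins, and the pattern matching is a simplified imitation of dataset factories. It shows how a hook that only sees datasets existing at catalog creation would miss datasets that a factory pattern builds later. The last part assumes the metric dataset accepts an explicit `key` argument (suggested by the "A key name must be provided" message, but please check the kedro-mlflow docs for your version), in which case templating the key alongside the name might work as a stopgap.

```python
# Toy model only: none of these classes are the real kedro or kedro-mlflow code.
import re


class ToyMetricDataset:
    """Stand-in for a metric dataset whose metric name is stored in `key`."""

    def __init__(self, key=None):
        self.key = key

    def save(self, value):
        if self.key is None:
            raise ValueError("Metric name cannot be None. A key name must be provided.")
        return (self.key, value)


class ToyCatalog:
    """Explicit entries are built eagerly; patterns are resolved lazily on first use."""

    def __init__(self, explicit, patterns):
        self.datasets = {name: ToyMetricDataset(**cfg) for name, cfg in explicit.items()}
        self.patterns = patterns

    def get(self, name):
        if name in self.datasets:
            return self.datasets[name]
        for pattern, cfg in self.patterns.items():
            # Turn "{model_name}.mae" into the regex "(?P<model_name>.+)\.mae".
            regex = re.sub(r"\\\{(\w+)\\\}", r"(?P<\1>.+)", re.escape(pattern))
            match = re.fullmatch(regex, name)
            if match:
                # Fill placeholders in the config values, then build the dataset.
                resolved = {k: v.format(**match.groupdict()) for k, v in cfg.items()}
                self.datasets[name] = ToyMetricDataset(**resolved)
                return self.datasets[name]
        raise KeyError(name)


def after_catalog_created(catalog):
    """Toy hook: copy the dataset name into `key` for datasets that already exist."""
    for name, dataset in catalog.datasets.items():
        if dataset.key is None:
            dataset.key = name


# Case 1: one explicit entry and one pattern; the hook runs before any resolution.
catalog = ToyCatalog(explicit={"model_1.mae": {}}, patterns={"{model_name}.mae": {}})
after_catalog_created(catalog)
explicit_result = catalog.get("model_1.mae").save(0.5)  # key was set by the hook

try:
    catalog.get("model_2.mae").save(0.7)  # built after the hook ran, so key is None
    factory_error = None
except ValueError as exc:
    factory_error = str(exc)

# Case 2 (assumes a `key` argument exists): template the key together with the name.
catalog_with_key = ToyCatalog(
    explicit={}, patterns={"{model_name}.mae": {"key": "{model_name}.mae"}}
)
after_catalog_created(catalog_with_key)
templated_result = catalog_with_key.get("model_2.mae").save(0.7)
```

In this toy, the explicit entry saves fine, the pattern-resolved entry fails with the same "key name must be provided" message, and the entry with a templated key saves under the resolved name. Whether the real plugin behaves the same way still needs to be confirmed.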
m
Hi Yolan, 1) There is no need to say sorry! 2) I have tested with kedro_mlflow.io.artifacts.MlflowArtifactDataset, and it worked without any issues. I managed to save two different graphs. 3) With regard to saving two models with MlflowModelTrackingDataset: the pipeline ran. However, when I open MLflow I only see one model in the Artifacts section, in the model folder. I am very new to MLflow, so I have no idea whether this is a feature or a bug. When I run it again with the models explicitly stated in the catalog, i.e. two different entries, "model1.regressor" and "model2.regressor", I get the same result: one model in the MLflow dashboard. Hope that helps!