Hi I have been happily using Kedro for a Data Science projec Kedro #questions

Pietro Peterlongo

03/14/2025, 6:41 AM

Hi, I have been happily using Kedro for a Data Science project (still in the PoC phase and working locally). I was wondering what would be the easiest way to do some lightweight experiment tracking (being able to save artifacts and parameters of a specific pipeline without overwriting them, so that I can compare results from different runs). As far as I understand, using versioned: true in the catalog would not be helpful, since it does not come with a way to track metadata (although I could add a node to materialize them), and I do not think there is an easy way to load specific versions or list all timestamps of a versioned dataset (I could not find the catalog.load API reference in the docs). I am aware there is an mlflow plugin, but it might be overkill for my use case (also, my model would be a custom one and would require additional effort to set up). I was thinking of maybe just creating a new configuration for each new experiment, overriding the relevant parameters and copy-pasting the relevant parts of the catalog while changing the path names of the files I want replicated. Is this something that people do, or are there other options? Thanks and have a great day!
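
For the part of the question about listing all timestamps of a versioned dataset, here is a minimal standard-library sketch. It assumes the on-disk layout Kedro documents for versioned file datasets, `<filepath>/<version>/<filename>`, where each version folder is named after its save timestamp. The helper names `list_versions` and `version_file` are hypothetical and not part of Kedro's API.

```python
from pathlib import Path


def list_versions(dataset_path):
    """Return the version folder names under a versioned dataset path, oldest first.

    Assumes the layout <filepath>/<version>/<filename>, e.g.
    data/06_models/model.pkl/2025-03-14T06.41.00.000Z/model.pkl
    """
    root = Path(dataset_path)
    if not root.is_dir():
        return []
    # Each version is a sub-directory named after its save timestamp;
    # this timestamp format sorts chronologically when sorted as text.
    return sorted(p.name for p in root.iterdir() if p.is_dir())


def version_file(dataset_path, version):
    """Build the path of the file stored for one specific version."""
    root = Path(dataset_path)
    return root / version / root.name
```

With something like this, comparing runs amounts to looping over `list_versions(...)` and reading each file returned by `version_file(...)`; it does not record parameters or other metadata, which is the gap the replies below address.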

Hall

03/14/2025, 6:41 AM

Someone will reply to you shortly. In the meantime, this might help:

Pietro Peterlongo

03/14/2025, 6:45 AM

Ok, thanks to the Ask AI button I was sent to the experiment tracking page, which might do what I want (I kind of knew it existed but somehow did not take it into account). I will read up on it and see if I can mark this solved already: https://docs.kedro.org/en/0.18.11/experiment_tracking/index.html

Pietro Peterlongo

03/14/2025, 10:24 AM

https://docs.kedro.org/en/stable/integrations/mlflow.html

Yolan Honoré-Rougé

03/14/2025, 10:25 AM

Just for the record, the kedro-mlflow plugin is likely easier to set up (if that's the blocking point; I understand the desire not to be tied to mlflow): just `pip install kedro-mlflow` will track a lot of stuff automatically with no configuration required. You can even try it and, if you are not satisfied, just uninstall it and remove the `mlruns/` folder at the root of your project.

👍 1

thankyou 1

💡 1

Merel

03/14/2025, 12:29 PM

I echo what Yolan is saying and would also recommend using `kedro-mlflow`. The AI bot has unfortunately sent you to an old docs page, and native experiment tracking has been removed in later versions of Kedro Viz.

thankyou 1

Pietro Peterlongo

03/14/2025, 12:40 PM

Thanks for the feedback @Merel and @Yolan Honoré-Rougé, now I remember why I overlooked the native experiment tracking: it was deprecated :) I will also take a closer look at kedro-mlflow and try it.

👍 1

Yolan Honoré-Rougé

03/14/2025, 12:47 PM

(The official doc seems very slightly outdated: you don't have to specify the artifact URI in the mlflow.yml, but you can; this is often used when people have a remote server in an enterprise setup. If you want to create the yml file, use `kedro mlflow init --env <your-env>` instead of doing it manually.)

👍 1

Rashida Kanchwala

03/14/2025, 1:18 PM

We also have a youtube video on Kedro Experiment Tracking and MLflow - so you can have a look

https://www.youtube.com/watch?v=Az_6UKqbznw&t=967s

👍 1

🙏 1

Pietro Peterlongo

03/21/2025, 2:04 PM

As a follow-up, we happily proceeded with MLflow and the video was helpful. Thanks for your feedback! (Edited a message that was put in this thread by mistake, the link somehow stayed…)
