Hello slightly smiling face I had 2 kedro mlflow questions 1 Kedro #plugins-integrations

Laure Vancau

06/05/2024, 12:51 PM

Hello 🙂 I had 2 kedro-mlflow questions:
1. Is it possible to log YAMLDatasets as artifacts in kedro-mlflow?
2. How do you recommend changing the run id dynamically (setting its value to a timestamp)? In the current mlflow.yaml, if the run id is null, a new run is created with a random id. If it has a value (from an environment variable, for example), then kedro-mlflow searches for an existing run. We want to create a new run with a unique id of our own: what is the best practice to do so? Thanks a bunch! ☀️

👀 1

marrrcin

06/05/2024, 1:20 PM

1. See https://kedro-mlflow.readthedocs.io/en/stable/source/04_experimentation_tracking/03_version_datasets.html#how-to-version-data-in-a-kedro-project Example:

my_dataset_to_version:
    type: kedro_mlflow.io.artifacts.MlflowArtifactDataset
    dataset:
        type: yaml.YAMLDataset  # or any valid kedro Dataset
        filepath: /path/to/a/LOCAL/destination/file.yaml # must be a local file, wherever you want to log the data in the end

2. The unique identifiers are managed by MLflow itself, you can have influence on when the generation happens, but I'm pretty sure you cannot influence the run id format because it's created by the MLflow Server / API.

👍 1

Laure Vancau

06/05/2024, 1:24 PM

Hi, thanks a bunch @marrrcin for the very quick response!
2. Works, I thought it was possible to set custom run_ids but I guess not 😊
1. I have exactly this structure but my YAMLs don't appear in the artifacts (and I get no errors).

Laure Vancau

06/05/2024, 1:26 PM

(we are using kedro-mlflow==0.11.9)

marrrcin

06/05/2024, 1:46 PM

Are they appearing in the specified path locally?

Laure Vancau

06/05/2024, 2:03 PM

yes, they are there 🙂

Yolan Honoré-Rougé

06/05/2024, 5:17 PM

Yes, marrrcin's syntax is correct, but keep in mind it only works for output artifacts. For datasets that are inputs of a node, there is no out-of-the-box solution, although it is something that is often needed.

👍 1

☀️ 1

Yolan Honoré-Rougé

06/05/2024, 5:21 PM

For the run id, I don't think it's possible in MLflow, and MLflow already has a timestamp. You can customize the run name, though.
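The run-name suggestion could look roughly like this with the plain MLflow API. This is only a sketch: the helper names `timestamped_run_name` and `start_timestamped_run` and the `train` prefix are hypothetical, and the only MLflow call used is `mlflow.start_run(run_name=...)`. The run id itself is still generated by MLflow.

```python
from datetime import datetime, timezone


def timestamped_run_name(prefix="run", now=None):
    """Build a run name such as 'train-20240605T125100Z' (hypothetical helper).

    Assumes `now`, when given, is already expressed in UTC.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return f"{prefix}-{now:%Y%m%dT%H%M%SZ}"


def start_timestamped_run(prefix="run"):
    """Start a new MLflow run whose *name* (not its id) carries a timestamp."""
    import mlflow  # imported lazily so the name helper works without MLflow

    return mlflow.start_run(run_name=timestamped_run_name(prefix))
```

In a kedro-mlflow project the run is normally started by the plugin, so the name would more likely be set through the run name entry of the mlflow configuration file (`tracking.run.name`, as far as I know) than by calling `mlflow.start_run` directly.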

👍 1

Laure Vancau

06/06/2024, 7:42 AM

Ah OK! So for it to be logged, it needs to be the output of a node! That explains it: all our YAMLs are inputs.
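Since the artifact dataset only logs node outputs, one possible workaround for input files is to log them explicitly with `mlflow.log_artifact`. This is a minimal sketch, not a kedro-mlflow feature: `log_input_files`, its parameters, and the `inputs` artifact folder are hypothetical, and it assumes it is called while the pipeline's MLflow run is active (otherwise MLflow starts a run of its own).

```python
from pathlib import Path


def log_input_files(paths, artifact_path="inputs", log_fn=None):
    """Log existing local files as MLflow artifacts; return the paths logged."""
    if log_fn is None:
        import mlflow  # lazy import so the helper can be tested without MLflow

        log_fn = mlflow.log_artifact
    logged = []
    for path in map(Path, paths):
        if not path.is_file():
            continue  # skip missing files rather than failing the pipeline
        log_fn(str(path), artifact_path)
        logged.append(path)
    return logged
```

It could be called from a node or a hook, for example `log_input_files(["conf/base/parameters.yml"])`, where the path is only an illustration.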
