# questions
d
Hey team, two questions on `kedro-mlflow`: 1. Is there a way to log the git commit tag/SHA through `kedro-mlflow`? 2. Is there a good way to save input datasets without needing to create separate MLflow artifact datasets and a node to read and save the datasets? Appreciate any help/guidance on this 🙏
m
@Yolan Honoré-Rougé could you help here?
@Daniel Kirel You could also have a look at #plugins-integrations, where there's some more activity and questions around `kedro-mlflow`.
m
1. You can use a hook for that to keep it in the "Kedro way of doing things", or you can just create a node that calls `mlflow.log_param("git_sha", <value of git sha>)`. The usual place to do this is `before_pipeline_run` or `after_pipeline_run`.
2. Again hooks, assuming you don't want to read and serialize the data just for the sake of logging it as an artifact to MLflow. The downside is that it's not really the "Kedro way of doing things", because you would have to access the `_filepath` (or similar) property of the dataset object, which is "private". You can use `before_node_run` for that: https://docs.kedro.org/en/stable/kedro.framework.hooks.specs.NodeSpecs.html#kedro.framework.hooks.specs.NodeSpecs.before_node_run, which has access to the node, catalog, and inputs in one place. Rough sketches of both below.
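A minimal sketch of option 1 as a hook. The class name, the `git rev-parse` call, and the `git_sha` parameter name are illustrative choices, and it assumes the project is a git checkout and that kedro-mlflow has already started the active MLflow run by the time `before_pipeline_run` fires:

```python
# hooks.py -- illustrative sketch, not from the thread
import subprocess

import mlflow
from kedro.framework.hooks import hook_impl


class GitShaLoggingHook:
    """Log the current git commit SHA as an MLflow parameter."""

    @hook_impl
    def before_pipeline_run(self, run_params, pipeline, catalog):
        # Assumes `git` is on PATH and the project lives in a git repository.
        git_sha = (
            subprocess.check_output(["git", "rev-parse", "HEAD"]).decode().strip()
        )
        # Relies on kedro-mlflow having already opened the run; otherwise
        # mlflow.log_param would start a new, unrelated run.
        mlflow.log_param("git_sha", git_sha)
```

Register it in `settings.py` with `HOOKS = (GitShaLoggingHook(),)` so it runs alongside the kedro-mlflow hooks.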
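And a sketch of option 2 with `before_node_run`, logging every file-backed node input as an artifact. Both `catalog._get_dataset` and `_filepath` are private APIs (the caveat mentioned above), so this may need adjusting across Kedro versions:

```python
# hooks.py -- illustrative sketch, relies on private Kedro APIs
import mlflow
from kedro.framework.hooks import hook_impl


class InputArtifactLoggingHook:
    """Log file-backed node inputs to MLflow as artifacts."""

    @hook_impl
    def before_node_run(self, node, catalog, inputs, is_async, session_id):
        for name in node.inputs:
            try:
                dataset = catalog._get_dataset(name)  # private API
            except Exception:
                continue
            # Only datasets backed by a local file expose _filepath;
            # in-memory inputs and parameters are skipped.
            filepath = getattr(dataset, "_filepath", None)
            if filepath is not None:
                mlflow.log_artifact(str(filepath), artifact_path="inputs")
```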
👍 1
d
Thank you! Very helpful