# questions
d
Hey team, two questions on `kedro-mlflow`: 1. Is there a way to log the git commit tag/SHA through `kedro-mlflow`? 2. Is there a good way to save input datasets without needing to create separate MLflow artifact datasets and a node to read and save the datasets? Appreciate any help/guidance on this 🙏
m
@Yolan Honoré-Rougé could you help here?
@Daniel Kirel You could also have a look at #plugins-integrations, where there's some more activity and questions around `kedro-mlflow`.
m
1. You can use a hook for that to keep it in the "Kedro way of doing things", or you can just create a node that calls `mlflow.log_param("git_sha", <value of git sha>)`. The usual place to do this is `before_pipeline_run` or `after_pipeline_run`.
2. Again hooks, assuming you don't want to read and serialize the data just for the sake of logging it as an artifact to MLflow. The downside is that it's not really the "Kedro way of doing things", because you would have to access the `_filepath` (or similar) property of the dataset object, which is "private". You can use `before_node_run` for that: https://docs.kedro.org/en/stable/kedro.framework.hooks.specs.NodeSpecs.html#kedro.framework.hooks.specs.NodeSpecs.before_node_run, which has access to the node, catalog, and inputs in one place. Rough sketches of both below.
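A minimal sketch of option 1 as a hook. The class name, the `git rev-parse` call, and the `git_sha` parameter name are illustrative choices, and it assumes the project is a git checkout and that kedro-mlflow has already started the active MLflow run by the time `before_pipeline_run` fires:

```python
# hooks.py -- illustrative sketch, not from the thread
import subprocess

import mlflow
from kedro.framework.hooks import hook_impl


class GitShaLoggingHook:
    """Log the current git commit SHA as an MLflow parameter."""

    @hook_impl
    def before_pipeline_run(self, run_params, pipeline, catalog):
        # Assumes `git` is on PATH and the project lives in a git repository.
        git_sha = (
            subprocess.check_output(["git", "rev-parse", "HEAD"]).decode().strip()
        )
        # Relies on kedro-mlflow having already opened the run; otherwise
        # mlflow.log_param would start a new, unrelated run.
        mlflow.log_param("git_sha", git_sha)
```

Register it in `settings.py` with `HOOKS = (GitShaLoggingHook(),)` so it runs alongside the kedro-mlflow hooks.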
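And a sketch of option 2 with `before_node_run`, logging every file-backed node input as an artifact. Both `catalog._get_dataset` and `_filepath` are private APIs (the caveat mentioned above), so this may need adjusting across Kedro versions:

```python
# hooks.py -- illustrative sketch, relies on private Kedro APIs
import mlflow
from kedro.framework.hooks import hook_impl


class InputArtifactLoggingHook:
    """Log file-backed node inputs to MLflow as artifacts."""

    @hook_impl
    def before_node_run(self, node, catalog, inputs, is_async, session_id):
        for name in node.inputs:
            try:
                dataset = catalog._get_dataset(name)  # private API
            except Exception:
                continue
            # Only datasets backed by a local file expose _filepath;
            # in-memory inputs and parameters are skipped.
            filepath = getattr(dataset, "_filepath", None)
            if filepath is not None:
                mlflow.log_artifact(str(filepath), artifact_path="inputs")
```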
👍 1
d
Thank you! Very helpful