# questions
j
Hi! I would like to be able to track certain parameters of my pipeline with kedro experiment tracking. I want them always to be tracked with each run, even if I just run a subset of the complete pipeline. What would be the best approach to do this? If I understand correctly we need to run a node "track_parameters" that tracks those parameters and saves them to a metrics dataset every time. I thought about hooks but I don't know how to modify the pipeline in the before_pipeline_run hook so that the node "track_parameters" is added if not present yet.
e
Hi Jan, your case is a common application of hooks. You can use them to inject additional behaviour at certain lifecycle points in Kedro’s main execution, so you do not need to create any additional nodes, and the pipeline doesn’t require any modifications. Please see the execution order here: https://docs.kedro.org/en/stable/hooks/introduction.html You can use the
after_node_run
hook to log your metrics upon node/nodes execution as in the example: https://docs.kedro.org/en/stable/hooks/examples.html#add-metrics-tracking-to-your-model
j
Hi Elena, thanks for the hint. How exactly could I then create the tracked metric for Kedro experiment tracking? I tried kedro-mlflow and that works well for tracking the parameters, but they are then tracked via MLflow. However, in MLflow I cannot directly compare artifacts (i.e. plots). Thus, I would like to use Kedro-Viz to compare two runs directly, showing the plots and the parameters used. What I am not sure about is how to log the metrics (to Kedro, not to MLflow) to the corresponding current run during the hook execution.
e
To enable experiment tracking with Kedro-Viz you should:
• Set up a session store to capture experiment metadata
• Set up experiment tracking datasets to list the metrics to track
• Modify your nodes and pipelines to output those metrics
Here is a description of these steps with an example: https://docs.kedro.org/projects/kedro-viz/en/stable/experiment_tracking.html#when-should-i-use-experiment-tracking-in-kedro
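The third step above could look roughly like the following for the "track_parameters" node mentioned at the start of the thread. This is only a sketch: the function name and the dotted-key naming are assumptions, and it relies on the tracking metrics dataset accepting a flat dictionary of numeric values, so non-numeric parameters are skipped.

```python
from typing import Any


def track_parameters(parameters: dict[str, Any], prefix: str = "") -> dict[str, float]:
    """Flatten nested parameters into {"dotted.name": float}, keeping numeric values only."""
    metrics: dict[str, float] = {}
    for key, value in parameters.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            # Recurse into nested parameter groups, e.g. model.learning_rate
            metrics.update(track_parameters(value, prefix=f"{name}."))
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            metrics[name] = float(value)
        # Strings, booleans, lists etc. are skipped in this sketch
    return metrics
```

Used as a node, the function would take "parameters" as its input and write to a dataset declared as a tracking dataset in the catalog.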
j
Thanks, I did set up the experiment tracking with Kedro-Viz already. The sticking point for me is how to track the parameters as a metric via a hook. If I just create a node that tracks the parameters, it is not guaranteed that it will run each time I run a certain pipeline.
e
You can then move experiment tracking nodes into a separate pipeline and then run it with the subset of the target pipeline, aka
kedro run experiment_tracking_pipeline + target_subset
Edit: the syntax above is not possible - it’s just to give an idea of splitting the pipeline. But you can sum pipeline objects, see an example below.
Otherwise, you probably will need to modify the data catalog in the
after_node_run
hook to save your metrics rather than doing it in a separate node
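A hook-based variant might look like the sketch below. It swaps in before_pipeline_run instead of after_node_run so that it fires once per run regardless of which nodes are selected, and it saves to an existing catalog entry rather than modifying the catalog. The dataset name "tracked_parameters" and the tracked keys are hypothetical, the dataset is assumed to be declared in the catalog already, and whether Kedro-Viz attaches the saved values to the current run should be verified in your setup.

```python
from typing import Any

try:
    from kedro.framework.hooks import hook_impl
except ImportError:  # fallback so the sketch can be read and run without Kedro installed
    def hook_impl(func):
        return func


class ParameterTrackingHooks:
    """Save selected parameters to a tracking dataset at the start of every run."""

    # Hypothetical names: adjust to your own catalog entry and parameters.
    DATASET_NAME = "tracked_parameters"
    TRACKED_KEYS = ("learning_rate", "n_estimators")

    @hook_impl
    def before_pipeline_run(self, run_params: dict[str, Any], pipeline: Any, catalog: Any) -> None:
        # "parameters" is the catalog entry holding the full parameters dict
        params = catalog.load("parameters")
        tracked = {key: float(params[key]) for key in self.TRACKED_KEYS if key in params}
        # The hook returns None; the save itself is the side effect that persists
        catalog.save(self.DATASET_NAME, tracked)
```

The hook would be registered in settings.py with HOOKS = (ParameterTrackingHooks(),).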
j
Thanks, would the first option still be run if I use a filter like
--from-nodes
? For the second option I don't think this is possible. Or will modifications to the catalog object be reflected in the rest of the execution? Because the return type of the function is None?
e
1. You can use
--from-nodes
as long as your metrics tracking nodes remain in the pipeline after the slice. You do not need to split your pipeline for this.
2. The solution I mean is to split your pipeline into two and sum them:
custom_pipeline = (
    experiment_tracking_pipeline() + main_pipeline()
)
To further filter nodes you can apply tags: https://docs.kedro.org/en/stable/nodes_and_pipelines/nodes.html#how-to-tag-a-node So you run
kedro run -p custom_pipeline -t tracking,tag_a
Catalog modification at runtime is possible, but it’s not straightforward and we do not recommend it. If you decide to go this way, here is what you will need to do: https://kedro-org.slack.com/archives/C03RKP2LW64/p1719401040476289?thread_ts=1719389062.377319&cid=C03RKP2LW64
j
Alright, understood. Used a tag indeed but the node was thrown out after the
--from-nodes
filter. Putting that one in another pipeline will probably fix that indeed. Thanks a lot for the support 🙂