#plugins-integrations

hamiddimyati

03/28/2024, 10:59 AM
[SOLVED] Hi folks, any ideas on how to log metrics during neural network (NN) training following the `kedro-mlflow` concept? In my case I use PyTorch Lightning, which provides automatic logging through MLflow. With this logger, the metrics are automatically saved into a new folder. I'm wondering how to save these training metrics as an artifact of the experiment that can also be tracked in the Kedro catalog. Please help :)

marrrcin

03/28/2024, 1:01 PM
It's possible. In lightning you have
from lightning.pytorch import Trainer
from lightning.pytorch.loggers import MLFlowLogger

mlf_logger = MLFlowLogger(experiment_name="lightning_logs", tracking_uri="file:./ml-runs")
trainer = Trainer(logger=mlf_logger)
So you need to bind the Lightning logger to the current MLflow configuration. It's just a matter of getting the `run_id` from the current kedro-mlflow run:
import mlflow

def get_mlflow_run_id():
    # Return the ID of the currently active MLflow run, or None if no run is active.
    run_id = None
    if (ar := mlflow.active_run()) and ar.info.run_id:
        run_id = ar.info.run_id
    return run_id
And similarly for other parameters. You can also use Kedro hooks to extract `experiment_name`/`tracking_uri` from `kedro-mlflow`'s config, as it attaches itself to the Kedro context as `context.mlflow`.
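A minimal sketch of that binding, assuming `lightning` and `mlflow` are installed and that kedro-mlflow has already started a run by the time the training node executes. The names `lightning_logger_kwargs` and `make_lightning_logger` are hypothetical helpers, not part of either library; the imports of `mlflow` and `lightning` are kept inside the function so the pure helper can be tried on its own.

```python
from types import SimpleNamespace
from typing import Any, Optional


def lightning_logger_kwargs(
    active_run: Optional[Any], tracking_uri: str, experiment_name: str
) -> dict:
    """Build keyword arguments for Lightning's MLFlowLogger (hypothetical helper)."""
    kwargs = {"experiment_name": experiment_name, "tracking_uri": tracking_uri}
    # Pass run_id only when a run is active; without it the logger creates a new run.
    if active_run is not None and active_run.info.run_id:
        kwargs["run_id"] = active_run.info.run_id
    return kwargs


def make_lightning_logger():
    """Create an MLFlowLogger that reuses the currently active MLflow run, if any."""
    import mlflow
    from lightning.pytorch.loggers import MLFlowLogger

    run = mlflow.active_run()
    experiment_name = "lightning_logs"  # fallback name when no run is active
    if run is not None:
        experiment_name = mlflow.get_experiment(run.info.experiment_id).name
    return MLFlowLogger(
        **lightning_logger_kwargs(run, mlflow.get_tracking_uri(), experiment_name)
    )


# Stand-in for mlflow.active_run(), so the helper can be checked without MLflow.
fake_run = SimpleNamespace(info=SimpleNamespace(run_id="abc123"))
demo_kwargs = lightning_logger_kwargs(fake_run, "file:./ml-runs", "lightning_logs")
```

Inside the training node you would then call something like `Trainer(logger=make_lightning_logger())`, so the Lightning metrics land in the same run that kedro-mlflow opened.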

hamiddimyati

03/28/2024, 1:26 PM
Hi @marrrcin thanks for your help. I will try it :)

Yolan Honoré-Rougé

03/29/2024, 7:42 AM
Honestly, it is weird that it does not work out of the box. kedro-mlflow sets most of the config globally (with `mlflow.set_tracking_uri` and `mlflow.set_experiment`), so the `MLFlowLogger` of PyTorch Lightning should default to this. If it doesn't, it means that some configuration override is happening under the hood, so we should check the source code.
Of course you can use @marrrcin's solution, but you will need to handle a bunch of things manually, with a risk of errors and maintenance issues. It is the best way for now, I guess.

marrrcin

03/29/2024, 8:05 AM
The behaviour is simple there: if `run_id` is not set, it creates a new run. https://lightning.ai/docs/pytorch/stable/_modules/lightning/pytorch/loggers/mlflow.html#MLFlowLogger
I worked with this setup many times 😉

hamiddimyati

04/01/2024, 7:33 AM
Hi @marrrcin, I need some clarification on your suggestion. Does extracting `experiment_name` and `tracking_uri` using Kedro hooks mean I should use `after_context_created`? Then how do I bring that info into the training node? My idea is to add them to the catalog, so that I can access those two as inputs in the training node. Is that a good approach, or is there a better way? Thanks.
Thanks for the help. My issue is now solved 🙂

marrrcin

04/02/2024, 7:20 AM
I would rather go with a single hook class that implements both `after_context_created` and `before_node_run`: the first one captures a reference to the Kedro context and the second one injects `experiment_name`/`tracking_uri` etc. into the node. Example (of this idea, not exactly your case): https://linen-slack.kedro.org/t/10362890/hi-all-another-question-slightly-smiling-face-i-am-creating-#ce99c8ab-3307-4932-9ff8-1e416e7f48ac
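A rough sketch of such a hook class, under several assumptions: the class name `MlflowSettingsHook` and the input names `params:mlflow_tracking_uri` / `params:mlflow_experiment_name` are hypothetical, the training node is assumed to declare those names as inputs (with placeholder values in the project parameters), and the attribute paths on `context.mlflow` are a guess at kedro-mlflow's config object that should be checked against the installed version. The `ImportError` fallback exists only so the sketch can be tried without Kedro.

```python
from types import SimpleNamespace

try:
    from kedro.framework.hooks import hook_impl
except ImportError:
    # Fallback so the sketch can be exercised without Kedro installed.
    def hook_impl(func):
        return func


class MlflowSettingsHook:
    """Hypothetical hook that passes kedro-mlflow settings to nodes as inputs."""

    def __init__(self):
        self._context = None

    @hook_impl
    def after_context_created(self, context):
        # Store only a reference; context.mlflow is read lazily in
        # before_node_run, so it does not need to exist yet at this point.
        self._context = context

    @hook_impl
    def before_node_run(self, node):
        cfg = getattr(self._context, "mlflow", None)
        if cfg is None:
            return None
        values = {
            # ASSUMPTION: attribute paths on kedro-mlflow's config object.
            "params:mlflow_tracking_uri": cfg.server.mlflow_tracking_uri,
            "params:mlflow_experiment_name": cfg.tracking.experiment.name,
        }
        # Overwrite only inputs the node actually declares; the returned
        # dictionary is used by Kedro to update the node inputs.
        injected = {k: v for k, v in values.items() if k in node.inputs}
        return injected or None


# Stand-ins for the Kedro context and node, to show the behaviour in isolation.
fake_context = SimpleNamespace(
    mlflow=SimpleNamespace(
        server=SimpleNamespace(mlflow_tracking_uri="file:./ml-runs"),
        tracking=SimpleNamespace(experiment=SimpleNamespace(name="lightning_logs")),
    )
)
fake_node = SimpleNamespace(inputs=["train_data", "params:mlflow_tracking_uri"])

hook = MlflowSettingsHook()
hook.after_context_created(fake_context)
injected = hook.before_node_run(fake_node)
```

In a real project the hook would be registered in `settings.py` with `HOOKS = (MlflowSettingsHook(),)`, and the training node would receive the injected values through its declared inputs.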