Gregor Höhne
11/10/2023, 8:03 AM
distributed manner (multiple GPUs) using the ThreadRunner, how can I save the metrics as versioned kedro_datasets.tracking.MetricsDataset and some graphs with matplotlib.MatplotlibWriter in just one file (something like sync_dist=True) without creating a versioned file for each sub-process?
marrrcin
11/10/2023, 10:37 AM
kedro-azureml - our solution was to save only on master node:
https://github.com/getindata/kedro-azureml/blob/8e5979f5040e03032215e9db25af51538ec6a26a/kedro_azureml/datasets/runner_dataset.py#L82
Gregor Höhne
11/12/2023, 3:20 PM
marrrcin
11/13/2023, 7:49 AM
.save on the dataset. If you’ve added the guardrails (is_distributed_master_node), then the warning can probably be ignored.
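
For readers of this thread, here is a rough sketch of the "save only on master node" idea described above. It is not the kedro-azureml implementation. It assumes the process rank is exposed through a RANK environment variable (as torchrun does), and MasterOnlyDataset and InMemoryDataset are hypothetical names made up for illustration. Only the name is_distributed_master_node comes from the thread, and its body here is a guess.

```python
import os


def is_distributed_master_node(environ=None):
    """Return True on rank 0, or when no rank is set (non-distributed run).

    Assumption: the launcher exports RANK, as torchrun does. Other
    launchers may use a different variable.
    """
    env = os.environ if environ is None else environ
    return int(env.get("RANK", "0")) == 0


class MasterOnlyDataset:
    """Hypothetical wrapper that forwards save() only on the master node."""

    def __init__(self, wrapped, environ=None):
        self._wrapped = wrapped    # any object with load() and save(data)
        self._environ = environ    # injectable for testing; defaults to os.environ

    def load(self):
        return self._wrapped.load()

    def save(self, data):
        # Non-master ranks skip the write, so only one versioned file is created.
        if is_distributed_master_node(self._environ):
            self._wrapped.save(data)


class InMemoryDataset:
    """Stand-in for a real dataset such as a metrics or plot writer."""

    def __init__(self):
        self.saved = []

    def load(self):
        return self.saved[-1]

    def save(self, data):
        self.saved.append(data)


if __name__ == "__main__":
    target = InMemoryDataset()
    # Simulate rank 1: the save is skipped and nothing is written.
    MasterOnlyDataset(target, environ={"RANK": "1"}).save({"accuracy": 0.9})
    # Simulate rank 0: the save goes through.
    MasterOnlyDataset(target, environ={"RANK": "0"}).save({"accuracy": 0.9})
    print(len(target.saved))
```

In an actual Kedro project the wrapper would have to implement Kedro's dataset interface and be registered in the catalog. The sketch only shows the rank guard around save.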