Hello Kedro community,
I'm using Kedro to train a PyTorch model with PyTorch Lightning.
I know there is a plugin (kedro-azureml) that allows distributed training, but since I'm currently not working in Azure, that's not an option for me (correct me if I'm wrong).
My problem occurs when I use torch.utils.data.DataLoader with num_workers > 0: Kedro seems to block the worker processes somehow, and pl.Trainer.fit() freezes.
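For context, here is a minimal sketch of the pattern that triggers it (hypothetical names, plain PyTorch without Lightning or Kedro, just to show the num_workers > 0 setup):

```python
import torch
from torch.utils.data import DataLoader, Dataset


class ToyDataset(Dataset):
    """Tiny in-memory dataset standing in for the real training data."""

    def __init__(self, n: int = 8):
        self.data = torch.arange(n, dtype=torch.float32)

    def __len__(self) -> int:
        return len(self.data)

    def __getitem__(self, idx: int) -> torch.Tensor:
        return self.data[idx]


def make_loader() -> DataLoader:
    # num_workers > 0 spawns worker processes; inside a Kedro node this
    # is the point where the freeze shows up for me.
    return DataLoader(ToyDataset(), batch_size=4, num_workers=2)


if __name__ == "__main__":
    # The __main__ guard matters here: with the "spawn" start method,
    # each worker process re-imports this module.
    total = sum(batch.sum().item() for batch in make_loader())
    print(total)  # 0 + 1 + ... + 7 = 28.0
```

Run as a standalone script this iterates fine; the same loader inside a Kedro node is where fit() hangs.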
Is there a way to use distributed training without computing on an Azure machine?
@marrrcin Maybe you can help me? 🙂
10/21/2023, 6:25 PM
Yes, there is. I would love to help you, but I'm currently off until 2023-11-02. Maybe someone else will chip in, or you can wait until then.
10/22/2023, 8:45 AM
Enjoy your time off! If no one else is able to help, I will contact you again.
11/02/2023, 10:03 AM
OK, I’m back - do you still need assistance with that @Gregor Höhne?