Piotr Grabowski
10/04/2023, 3:30 PM
[...] `next()` [...] what it needs. Now the problem is that when we pass that generator between nodes, Kedro seems to internally call `next()` on every element of the generator and load everything into RAM, spending a lot of time on it, like this:
[10/04/23 15:51:26] INFO Saving data to 'configured_graph_generator' (MemoryDataset)... data_catalog.py:531
[10/04/23 15:53:58] INFO Saving data to 'configured_graph_generator' (MemoryDataset)... data_catalog.py:531
[10/04/23 15:56:28] INFO Saving data to 'configured_graph_generator' (MemoryDataset)... data_catalog.py:531
[10/04/23 15:58:58] INFO Saving data to 'configured_graph_generator' (MemoryDataset)... data_catalog.py:531
[10/04/23 16:01:28] INFO Saving data to 'configured_graph_generator' (MemoryDataset)... data_catalog.py:531
[10/04/23 16:03:57] INFO Saving data to 'configured_graph_generator' (MemoryDataset)... data_catalog.py:531
[10/04/23 16:06:26] INFO Saving data to 'configured_graph_generator' (MemoryDataset)... data_catalog.py:531
[10/04/23 16:08:56] INFO Saving data to 'configured_graph_generator' (MemoryDataset)... data_catalog.py:531
Is there a way to avoid this? Is this a bug or a consequence of misusing Kedro? 🙂
Nok Lam Chan
10/04/2023, 3:35 PM