Elior Cohen
11/14/2023, 2:02 PM
text.TextDataSet for this purpose, but this means that my node should return a string with the entire contents of this RDF file. Because the file is going to be so big, I want to write it in batches: create 10% of the contents, write them, then create another 10% and write them, and so on until reaching 100%.
Is there a way in Kedro to achieve something like that? I looked at IncrementalDataSet but it seems it has nothing to do with this use case.
Deepyaman Datta
11/14/2023, 2:05 PM
PartitionedDataset, and lazily save each partition.
Elior Cohen
11/14/2023, 2:06 PM
datajoely
11/14/2023, 2:12 PM
text.TextDataSet and get it to write chunks using a generator
Elior Cohen
11/14/2023, 2:14 PM
_save would accept a generator instead of a string?
datajoely
11/14/2023, 2:15 PM
Elior Cohen
11/14/2023, 2:17 PM
_save should still accept str, and Kedro will unpack the generator, if I understand correctly
marrrcin
11/14/2023, 2:21 PM
_save for each item - your dataset implementation can just append to a single file.
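A minimal sketch of the idea discussed in this thread, assuming one _save call per item yielded by a generator node. Everything here is hypothetical and dependency-free: AppendingTextDataset stands in for a custom dataset (in a real project it would subclass Kedro's abstract dataset base class), build_rdf_chunks stands in for the node, and run_generator_node only imitates the per-item save loop rather than using Kedro's runner.

```python
from pathlib import Path
from typing import Iterator


class AppendingTextDataset:
    """Hypothetical stand-in for a custom Kedro dataset (not a Kedro class)."""

    def __init__(self, filepath: str) -> None:
        self._filepath = Path(filepath)

    def _save(self, data: str) -> None:
        # Each call receives one chunk as a plain str and appends it,
        # so the full file contents never sit in memory at once.
        with self._filepath.open("a", encoding="utf-8") as f:
            f.write(data)

    def _load(self) -> str:
        return self._filepath.read_text(encoding="utf-8")


def build_rdf_chunks(n_chunks: int = 10) -> Iterator[str]:
    """Hypothetical generator node: yields the output one chunk at a time."""
    for i in range(n_chunks):
        # Placeholder content; a real node would serialise roughly
        # 1/n_chunks of the RDF triples here.
        yield f'<urn:example:s{i}> <urn:example:p> "chunk {i}" .\n'


def run_generator_node(dataset: AppendingTextDataset, n_chunks: int = 10) -> None:
    """Imitates the behaviour described above: one save per yielded item."""
    for chunk in build_rdf_chunks(n_chunks):
        dataset._save(chunk)
```

Because the dataset appends, the target file would need to be removed or truncated before each run; otherwise a rerun adds new chunks after the old output.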
Example: https://kedro-org.slack.com/archives/C03RKP2LW64/p1699443585183569?thread_ts=1699440250.364219&cid=C03RKP2LW64
Elior Cohen
11/14/2023, 2:22 PM