Hi all, I have several datasets of daily partitioned data in Delta tables and want to process the data for a given day. For example, I want to run my pipeline for 2024-08-01, read the corresponding partitions from the datasets in the raw layer, and create new partitions for that day in all other layers. Any advice on how to do that?
Jitendra Gundaniya
09/03/2024, 9:05 AM
Hi Paul,
Thank you for your question.
I will look into it.
Jitendra Gundaniya
09/03/2024, 3:30 PM
Do you already have a pipeline that expects one partition? I.e., does the node expect a single partition or a batch?
Have you considered the PartitionedDataset or IncrementalDataset?
Paul Weiss
09/04/2024, 8:04 AM
IncrementalDataset might provide what I need, thanks
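For anyone finding this thread later, here is a rough sketch of what a node that handles a single day could look like. It relies on the fact that Kedro's PartitionedDataset passes a node a dict mapping partition ids to load functions, and accepts a dict of partition id to data on save. Everything else is an assumption for illustration: the function name `process_day`, the `date=YYYY-MM-DD/...` partition id layout, the placeholder transformation, and passing the day in as a plain string (in a real project it might come from a Kedro parameter). Whether this maps cleanly onto Delta tables depends on the underlying dataset and is not covered here.

```python
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


def process_day(
    partitions: Dict[str, Callable[[], List[Row]]],
    day: str,
) -> Dict[str, List[Row]]:
    """Hypothetical node: process only the partitions belonging to `day`.

    `partitions` mimics what a PartitionedDataset hands to a node:
    partition id -> zero-argument function that loads that partition.
    """
    # Assumption: the day appears in the partition id, e.g. "date=2024-08-01/part-0".
    selected = {pid: load for pid, load in partitions.items() if day in pid}
    if not selected:
        raise KeyError(f"no partition found for {day}")

    output: Dict[str, List[Row]] = {}
    for pid, load in sorted(selected.items()):
        rows = load()  # only the selected partitions are actually loaded
        # Placeholder transformation; replace with the real per-day logic.
        output[pid] = [{**row, "processed_for": day} for row in rows]
    # Returning a dict keyed by partition id lets a PartitionedDataset output
    # write one partition per key.
    return output


# Stand-in for the dict a PartitionedDataset would provide (made-up data).
raw = {
    "date=2024-07-31/part-0": lambda: [{"value": 1}],
    "date=2024-08-01/part-0": lambda: [{"value": 2}],
}
result = process_day(raw, "2024-08-01")
```

Because the loaders are only called for the selected ids, the other days are never read. IncrementalDataset takes a different approach: it tracks a checkpoint and only hands over partitions that have not been processed yet, which fits a "process whatever is new" run better than a "process this specific day" run.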