Deepyaman Datta 10/22/2023, 1:32 PM
`PartitionedDataset` users out there! We have a question for you, related to enabling versioning for `PartitionedDataset`--which of the options below makes the most sense to you?

1. https://github.com/kedro-org/kedro/pull/521 proposes to enable versioning of the underlying dataset by specifying `versioned: true` in the dataset config:

    station_data:
      type: PartitionedDataset
      path: data/03_primary/station_data
      dataset:
        type: pandas.CSVDataset
        versioned: true

On the plus side, having the `versioned` config on the `dataset` config makes it clear that the versioning is applied to the underlying dataset, not to the `PartitionedDataset`. However, there are some edge cases (see https://github.com/kedro-org/kedro/pull/521#issuecomment-744653023, if you're keen).

2. Alternatively, we can move the `versioned` flag to the top level:

    station_data:
      type: PartitionedDataset
      path: data/03_primary/station_data
      versioned: true
      dataset:
        type: pandas.CSVDataset

Note that the versioning is still of the underlying dataset, even though the config is at the top level.

3. Neither of these options makes sense; what you really need is versioning of the top-level dataset. (Note that we don't have a solution designed for this case, but it would be great to know nonetheless!)

Please feel free to vote using 1️⃣2️⃣3️⃣, and elaborate further on your thoughts in the thread below!
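To make the difference between options 1 and 2 concrete, here is a minimal sketch in plain Python. It is not Kedro's implementation: `underlying_versioned` is a hypothetical helper, and the dicts are just the two catalog entries above written out by hand. It only illustrates that both layouts would be read as "version the underlying dataset", with the flag living in a different place.

```python
from typing import Any


def underlying_versioned(entry: dict[str, Any]) -> bool:
    """Hypothetical helper (not a Kedro API): should the underlying dataset be versioned?"""
    nested = entry.get("dataset", {})
    # Option 1: the flag sits inside the nested `dataset` config.
    nested_flag = isinstance(nested, dict) and bool(nested.get("versioned", False))
    # Option 2: the flag sits at the top level of the catalog entry.
    top_level_flag = bool(entry.get("versioned", False))
    return nested_flag or top_level_flag


# Option 1: `versioned` on the underlying dataset config.
option_1 = {
    "type": "PartitionedDataset",
    "path": "data/03_primary/station_data",
    "dataset": {"type": "pandas.CSVDataset", "versioned": True},
}

# Option 2: `versioned` at the top level of the entry.
option_2 = {
    "type": "PartitionedDataset",
    "path": "data/03_primary/station_data",
    "versioned": True,
    "dataset": {"type": "pandas.CSVDataset"},
}

print(underlying_versioned(option_1), underlying_versioned(option_2))
```

Under this assumed reading, both entries mean the same thing; the choice is purely about which spelling is less surprising in the catalog.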