# questions
Hi team, under `DataCatalog` I would like to use `pandas.ParquetDataset` to partition by the date in the dataset and save into different folders by date in Parquet, like we can do with `spark.SparkDataSet`. Is there a way we could partition using pandas?
When using `ParquetDataSet`, I advise using whatever partitioning Parquet itself offers, because it's a native implementation and you often get better predicate pushdown for performance. As for pandas, any dataset that does not offer partitioning can be partitioned with `PartitionedDataSet`:
https://docs.kedro.org/en/stable/kedro.io.PartitionedDataset.html#kedro.io.PartitionedDataset
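For illustration, here is a rough sketch of the `PartitionedDataSet` route. It assumes a node that returns a dictionary mapping partition ids to DataFrames, which `PartitionedDataSet` then writes out as one file per key. The function name `split_by_date`, the `date` column, and the `date=<day>/data` key layout are made up for this example, and the catalog entry described in the comment is an assumption rather than a tested configuration.

```python
from typing import Dict

import pandas as pd


def split_by_date(df: pd.DataFrame, date_col: str = "date") -> Dict[str, pd.DataFrame]:
    """Split a DataFrame into one DataFrame per calendar day.

    The keys are partition ids. Assuming the node output is registered in the
    catalog as a PartitionedDataSet whose underlying dataset is
    pandas.ParquetDataSet and whose filename_suffix is ".parquet", each key
    should become one file, e.g. <path>/date=2023-01-01/data.parquet.
    """
    # Normalise the column to YYYY-MM-DD strings so they are safe in folder names.
    days = pd.to_datetime(df[date_col]).dt.strftime("%Y-%m-%d")
    return {
        f"date={day}/data": part.reset_index(drop=True)
        for day, part in df.groupby(days)
    }
```

A node built from this function would take the full DataFrame as input and have the partitioned catalog entry as its output, so each date ends up in its own folder.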