FlorianGD
02/14/2023, 4:32 PM
pandas.ParquetDataSet does not use pandas all the time? I would like to use it for partitioned data, and I want to use the filters that pandas.read_parquet provides, but it is not available for pyarrow.parquet.ParquetDataset.read. Doing a quick test and using pd.read_parquet every time seems to work OK, even though it does not behave exactly the same.
datajoely
02/14/2023, 5:06 PM
pandas.GenericDataSet if you definitely want to use the pandas API
FlorianGD
02/14/2023, 5:20 PM
GenericDataSet, I was not aware of it!
John Melendowski
02/14/2023, 10:58 PM
FlorianGD
02/15/2023, 8:30 AM
load_args consistent and not depend on whether we try to read a folder or a file