# questions
a
Hi Team! :kedro: I am unable to read a partitioned parquet file from GCS using pandas.ParquetDataset:
```yaml
master_table@spark:
  <<: *pq
  filepath: "gs://my_bucket/05_model_input/master_table"

master_table@pandas:
  type: pandas.ParquetDataset
  filepath: "gs://my_bucket/05_model_input/master_table"
```
Spark reading works, but pandas reading throws an error:
```
FileNotFoundError: my_bucket/05_model_input/master_table/
```
The same setup works locally.
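A quick way to narrow this down (a minimal sketch, assuming gcsfs and pyarrow are installed and GCP credentials are configured; the path is the one from the catalog above) is to read the same path with plain pandas outside Kedro, which shows whether the failure comes from Kedro or from pandas/gcsfs itself:

```python
# Minimal repro outside Kedro: pandas delegates gs:// paths to pyarrow + gcsfs,
# so this exercises the same stack that pandas.ParquetDataset uses for loading.
import pandas as pd

df = pd.read_parquet("gs://my_bucket/05_model_input/master_table")
print(df.head())
```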
r
Hi @Abhishek Bhatia, thank you for raising the issue. Could you please let me know the Kedro, kedro-datasets, and Python versions you are using?
a
```
kedro            0.19.6
kedro-datasets   3.0.1
python           3.10.6
```
r
Hi @Abhishek Bhatia, could you check whether you are hitting a similar issue to https://github.com/pandas-dev/pandas/issues/55230? The error message you shared resembles the one reported there. I also found an old cache issue with gcsfs which might be worth looking into. I would also like to know whether you have tried this on S3 (or any other cloud provider) and see similar failures. Thank you!
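One way to test the cache hypothesis (a minimal sketch, assuming gcsfs is installed and credentials are configured; the path is the one from the catalog above): gcsfs keeps a listings/metadata cache whose lifetime is controlled by its cache_timeout parameter, setting it to 0 disables caching, and invalidate_cache() drops anything already cached:

```python
# Rule out a stale gcsfs listings cache (see the cache issue mentioned above).
import gcsfs

# cache_timeout <= 0 disables gcsfs's listings/metadata cache
fs = gcsfs.GCSFileSystem(cache_timeout=0)
fs.invalidate_cache()  # drop any listings cached earlier in the process

# If the partition directories show up here, the objects are reachable and
# the failure is more likely in how pandas/pyarrow resolves the path.
print(fs.ls("my_bucket/05_model_input/master_table"))
```

If that turns out to matter, the same options can presumably be forwarded to the dataset through its fs_args entry in the catalog.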
a
That's correct. Isn't it unusual, though, that it happens on all object storage providers?