# questions
a
Hi Team! :kedro: I am unable to read a partitioned parquet file from GCS using pandas.ParquetDataset:
```yaml
master_table@spark:
  <<: *pq
  filepath: "gs://my_bucket/05_model_input/master_table"

master_table@pandas:
  type: pandas.ParquetDataset
  filepath: "gs://my_bucket/05_model_input/master_table"
```
Spark reading works, but pandas reading throws an error:
```
FileNotFoundError: my_bucket/05_model_input/master_table/
```
The same setup works locally.
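A quick way to narrow this down (a minimal sketch, assuming gcsfs and pyarrow are installed and GCP credentials are configured; the path is the one from the catalog above) is to read the same path with plain pandas outside Kedro, which shows whether the failure comes from Kedro or from pandas/gcsfs itself:

```python
# Minimal repro outside Kedro: pandas delegates gs:// paths to pyarrow + gcsfs,
# so this exercises the same stack that pandas.ParquetDataset uses for loading.
import pandas as pd

df = pd.read_parquet("gs://my_bucket/05_model_input/master_table")
print(df.head())
```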
r
Hi @Abhishek Bhatia, thank you for raising the issue. Could you please let me know the Kedro, kedro-datasets, and Python versions you are using?
a
```
kedro            0.19.6
kedro-datasets   3.0.1
python           3.10.6
```
r
Hi @Abhishek Bhatia, could you check whether you are hitting a similar issue to https://github.com/pandas-dev/pandas/issues/55230? The error message you shared resembles the one reported there. I also found an old cache issue with gcsfs which might be worth looking into. I would also like to know whether you have tried this on S3 (or any other cloud provider) and see similar failures. Thank you!
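One way to test the cache hypothesis (a minimal sketch, assuming gcsfs is installed and credentials are configured; the path is the one from the catalog above): gcsfs keeps a listings/metadata cache whose lifetime is controlled by its cache_timeout parameter, setting it to 0 disables caching, and invalidate_cache() drops anything already cached:

```python
# Rule out a stale gcsfs listings cache (see the cache issue mentioned above).
import gcsfs

# cache_timeout <= 0 disables gcsfs's listings/metadata cache
fs = gcsfs.GCSFileSystem(cache_timeout=0)
fs.invalidate_cache()  # drop any listings cached earlier in the process

# If the partition directories show up here, the objects are reachable and
# the failure is more likely in how pandas/pyarrow resolves the path.
print(fs.ls("my_bucket/05_model_input/master_table"))
```

If that turns out to matter, the same options can presumably be forwarded to the dataset through its fs_args entry in the catalog.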
a
That's correct. Isn't it unusual, though, that it happens on all object storage providers?