# questions
Sorry for the spam. I'm having issues with polars.GenericDataset; I was going to open an issue but thought I'd bring it up here first in case I'm being dumb. Currently I load Azure Blob parquet files like this:
    storage_options = {
        "account_name": os.environ["AZURE_STORAGE_ACCOUNT_DATA_NAME"],
        "anon": False,
    }
However, using the following credentials and catalog entry:
  account_name: ${oc.env:AZURE_STORAGE_ACCOUNT_DATA_NAME}
  anon: false

  type: polars.GenericDataset
  file_format: parquet
  filepath: az://${oc.env:CONTAINER_NAME_ENV_KEY}/data/blabla.parquet
  credentials: azure_blob
results in the following error when I try to load
--> 153 with, **self._fs_open_args_load) as fs_file:
    154     return load_method(fs_file, **self._load_args)

File ~/.venv/lib/python3.10/site-packages/fsspec/, in, path, mode, block_size, cache_options, compression, **kwargs)
   1240 ac = kwargs.pop("autocommit", not self._intrans)
-> 1241 f = self._open(
   1242     path,
   1243     mode=mode,
   1244     block_size=block_size,
   1245     autocommit=ac,
   1246     cache_options=cache_options,
    201     )
--> 202     raise DatasetError(message) from exc

DatasetError: Failed while loading data from data set GenericDataset(file_format=parquet, filepath=/data/blablabla.parquet, load_args={}, protocol=az, save_args={}).
[Errno 2] No such file or directory: 'data/blablabla.parquet'
(Please ignore any inconsistencies in container names, filenames etc. I tried to remove some information when pasting into Slack but I probably wasn't super thorough.) The container name is being stripped from the filepath, which I assume is being supplied to fsspec somewhere else, but I'm not entirely sure why the load is failing when the pure polars call is working. I know polars recently did away with fsspec and implemented their own native support for cloud storage, but I'm not sure if it has anything to do with that.
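For illustration, here is a minimal sketch of one way a container name can get dropped from a URL like this. It uses only the standard library's `urlsplit`, and it is an assumption about the kind of parsing involved, not a confirmed description of what the dataset or fsspec does internally. The helper `split_cloud_url` and the container name `my-container` are hypothetical.

```python
from urllib.parse import urlsplit


def split_cloud_url(url: str) -> tuple[str, str, str]:
    """Split a URL into (protocol, container, path within the container)."""
    parts = urlsplit(url)
    return parts.scheme, parts.netloc, parts.path


# Hypothetical container name, for illustration only.
protocol, container, path = split_cloud_url("az://my-container/data/blabla.parquet")

# The container is parsed as the network location, not as part of the path,
# so code that keeps only `path` ends up without the container.
path_only = path

# Keeping the container means joining the two pieces back together.
full_path = container + path

print(protocol, path_only, full_path)
```

If the path handed to the filesystem looks like `path_only` rather than `full_path`, that would match the `filepath=/data/...` shown in the error above.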
I have verified that the environment variable in the filepath is being correctly interpolated to give the proper filepath
Ok changing
fixes it, but
works with polars, and should work with fsspec too through
@Juan Luis I'd appreciate it if you could have a look at this when you have a chance, specifically the change on polars' side, which has done away with the need for fsspec. It should simplify the polars dataset implementation. Let me know if this would be better suited as an issue :)
There's no blocker from my side, but I thought it was a cool addition in polars, as it opens up capabilities like filter pushdown that weren't available through fsspec before.