# questions
Hi everyone, I need some help understanding how to define filters in load_args when loading a ParquetDataset with Dask from the catalog. My catalog entry would be something like:
```yaml
data:
  type: dask.ParquetDataset
  filepath: data/
  load_args:
    filters: [('filter_1', '==', 1) or
                ('filter_2', '==', 1) or
                ('filter_3', '==', 1) or
                ('filter_4', '==', 1) ]
```
I tested this exact syntax for `filters` in the Python API and it works there, but I cannot find a way to make it work through the catalog, since it raises this error:
```
kedro.io.core.DatasetError: Failed while loading data from data set
An error occurred while calling the read_parquet method registered to the pandas backend.
Original Message: too many values to unpack (expected 3)
```
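A side note on why the "exact syntax" may have appeared to work in the Python API: in Python, `or` between non-empty tuples short-circuits to the first operand, so the expression silently collapses to a single condition rather than an OR of four. PyArrow-style `filters` express OR as an outer list of AND-groups (disjunctive normal form). A minimal sketch of both points, not specific to Kedro:

```python
# `or` between non-empty tuples short-circuits to the first truthy operand,
# so this "filter" silently collapses to a single condition:
collapsed = ('filter_1', '==', 1) or ('filter_2', '==', 1)
print(collapsed)  # ('filter_1', '==', 1)

# PyArrow-style filters express OR as an outer list of AND-groups
# (disjunctive normal form): [[cond], [cond], ...] means cond OR cond OR ...
filters = [
    [('filter_1', '==', 1)],
    [('filter_2', '==', 1)],
    [('filter_3', '==', 1)],
    [('filter_4', '==', 1)],
]
print(filters)
```

This also explains the traceback above: once the value goes through YAML, the string with `or` in it no longer unpacks into `(column, op, value)` triples.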
n
Is this actual code or a string literal filter?
If you need some exotic way to build a Python object from YAML (e.g. a Python datatype, as with Polars), you may want to check out custom resolvers in docs.kedro.org, where you can provide a custom expression.
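As a sketch of what such a resolver would essentially do (the helper name here is hypothetical, not part of Kedro's API): YAML has no tuple literal, so the loaded nested lists need converting into the `(column, op, value)` tuples that `read_parquet` expects.

```python
# Hypothetical helper a custom resolver could call; not part of Kedro's API.
# YAML parses the filters as nested lists; pyarrow wants nested tuples.
def to_filter_tuples(groups):
    """Convert [[['col', '==', 1]], ...] into [[('col', '==', 1)], ...]."""
    return [[tuple(cond) for cond in group] for group in groups]

yaml_loaded = [[["filter_1", "==", 1]], [["filter_2", "==", 1]]]
print(to_filter_tuples(yaml_loaded))
# [[('filter_1', '==', 1)], [('filter_2', '==', 1)]]
```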