#questions

Eluard Camota

11/26/2023, 12:09 PM
Hi everyone, is there a way to cache an SQLQueryDataSet so that it does not take time to fetch the same data every time the pipeline runs? Thanks in advance.

marrrcin

11/27/2023, 8:04 AM
Between separate pipeline runs there is probably no built-in way, but within a single pipeline run you can use CachedDataset: https://docs.kedro.org/en/stable/kedro.io.CachedDataset.html#kedro.io.CachedDataset If you're interested in building functionality to cache the results of SQLQueryDataSet, e.g. to disk, you can extend this class. Keep in mind that you will then expose yourself to all kinds of problems related to cache invalidation 🙂
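A minimal sketch of the disk-caching idea mentioned above, written in plain Python rather than as a Kedro subclass. The names DiskCachedLoader, SlowQuery, cache_path and max_age_seconds are hypothetical and not part of Kedro; the only assumption about the wrapped object is that it has a load() method. The optional maximum age is one simple (and incomplete) answer to the cache-invalidation problem.

```python
import os
import pickle
import tempfile
import time
from pathlib import Path
from typing import Any, Optional


class DiskCachedLoader:
    """Hypothetical wrapper (not a Kedro class): caches source.load() on disk."""

    def __init__(self, source: Any, cache_path: str,
                 max_age_seconds: Optional[float] = None):
        self._source = source
        self._cache_path = Path(cache_path)
        self._max_age = max_age_seconds  # None means "never expires"

    def _cache_is_fresh(self) -> bool:
        if not self._cache_path.exists():
            return False
        if self._max_age is None:
            return True
        age = time.time() - self._cache_path.stat().st_mtime
        return age <= self._max_age

    def load(self) -> Any:
        # Serve from disk when a fresh cache file exists.
        if self._cache_is_fresh():
            with self._cache_path.open("rb") as f:
                return pickle.load(f)
        # Otherwise hit the real source and store the result for later runs.
        data = self._source.load()
        self._cache_path.parent.mkdir(parents=True, exist_ok=True)
        with self._cache_path.open("wb") as f:
            pickle.dump(data, f)
        return data

    def invalidate(self) -> None:
        # Manual invalidation: delete the cache file if it is there.
        if self._cache_path.exists():
            self._cache_path.unlink()


class SlowQuery:
    """Stand-in for an expensive query; counts how often it really runs."""

    def __init__(self) -> None:
        self.calls = 0

    def load(self) -> list:
        self.calls += 1
        return [("alice", 1), ("bob", 2)]


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        query = SlowQuery()
        cached = DiskCachedLoader(query, os.path.join(tmp, "query.pkl"))
        first = cached.load()    # runs the query and writes the cache file
        second = cached.load()   # served from disk, query is not run
        cached.invalidate()
        third = cached.load()    # cache was removed, so the query runs again
        return first, second, third, query.calls
```

Because the cache lives in a file, it survives between separate processes, which is what distinguishes it from an in-memory cache. Deciding when the file is stale (schema changes, new rows, a changed query string) is left to the caller here.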

Eluard Camota

11/28/2023, 3:11 AM
Thank you, will check this out. ☺️
Does it come with kedro, or do I need to install it manually? I'm asking because I'm getting this error:
SQLQueryDataSet.__init__() got an unexpected keyword argument 'layer'.
Dataset '_cached' must only contain arguments valid for the constructor of 'kedro_datasets.pandas.sql_dataset.SQLQueryDataSet'..
Or can only CSVDataSet be cached?

marrrcin

11/28/2023, 8:20 AM
The error is self-explanatory… Don't use layer.
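To illustrate the failure mode, here is a small sketch in plain Python. FakeSQLQueryDataset and build are hypothetical stand-ins, not Kedro code. The sketch assumes only what the error message says: the wrapped dataset's configuration must contain only arguments valid for the constructor, and layer is not one of them.

```python
class FakeSQLQueryDataset:
    """Hypothetical stand-in whose constructor has no `layer` parameter."""

    def __init__(self, sql: str, credentials: str):
        self.sql = sql
        self.credentials = credentials


def build(config: dict) -> FakeSQLQueryDataset:
    # Every key in the config is forwarded to the constructor as a keyword.
    return FakeSQLQueryDataset(**config)


bad_config = {"sql": "SELECT 1", "credentials": "db_credentials", "layer": "raw"}
# Same config with the unsupported key removed.
good_config = {k: v for k, v in bad_config.items() if k != "layer"}

try:
    build(bad_config)
    error_message = None
except TypeError as exc:
    error_message = str(exc)  # mentions the unexpected keyword 'layer'

dataset = build(good_config)  # succeeds once `layer` is gone
```

The fix is the same in the real catalog entry: remove layer from the configuration of the dataset being wrapped.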

Eluard Camota

11/29/2023, 1:53 AM
Oh yeah, thanks.