Hi everyone, is there a way to cache an SQLQueryD...
# questions
e
Hi everyone, is there a way to cache an SQLQueryDataSet so that it doesn't take time to fetch the same data every time the pipeline runs? Thanks in advance.
m
Between separate pipeline runs there's probably no built-in way, but within a single pipeline run you can use CachedDataset: https://docs.kedro.org/en/stable/kedro.io.CachedDataset.html#kedro.io.CachedDataset If you're interested in building functionality to cache the results of SQLQueryDataSet, e.g. to disk, you can extend this class. Keep in mind that you will then expose yourself to all kinds of problems related to cache invalidation 🙂
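For anyone reading later, here is a rough sketch of what caching query results to disk could look like. It is plain Python and does not use Kedro's API: `DiskCachedQuery`, `run_query`, and the `max_age_seconds` setting are hypothetical names made up for illustration, and the in-memory SQLite table only stands in for a real database. The age check is one crude answer to the cache invalidation problem mentioned above.

```python
import hashlib
import pickle
import sqlite3
import tempfile
import time
from pathlib import Path


class DiskCachedQuery:
    """Hypothetical helper (not part of Kedro): caches query results on disk."""

    def __init__(self, sql, load_fn, cache_dir, max_age_seconds=3600):
        self._sql = sql
        self._load_fn = load_fn          # callable that actually runs the query
        self._max_age = max_age_seconds  # crude time-based invalidation
        # The cache file name is derived from the query text, so a changed
        # query never reuses a stale result.
        key = hashlib.sha256(sql.encode("utf-8")).hexdigest()[:16]
        self._path = Path(cache_dir) / f"{key}.pkl"

    def _is_fresh(self):
        if not self._path.exists():
            return False
        age = time.time() - self._path.stat().st_mtime
        return age < self._max_age

    def load(self):
        # Serve from disk while the cache file is young enough.
        if self._is_fresh():
            with self._path.open("rb") as fh:
                return pickle.load(fh)
        # Otherwise hit the database and refresh the cache file.
        data = self._load_fn(self._sql)
        self._path.parent.mkdir(parents=True, exist_ok=True)
        with self._path.open("wb") as fh:
            pickle.dump(data, fh)
        return data


# Demo: an in-memory SQLite table stands in for the real database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (x INTEGER)")
conn.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])

query_calls = []  # records every time the database is really queried


def run_query(sql):
    query_calls.append(sql)
    return conn.execute(sql).fetchall()


with tempfile.TemporaryDirectory() as cache_dir:
    cached = DiskCachedQuery("SELECT x FROM t ORDER BY x", run_query, cache_dir)
    first = cached.load()   # runs the query and writes the cache file
    second = cached.load()  # served from disk, no database access
```

The second `load()` never touches the database, which is the behaviour asked about. The catch is that rows changed in the table after the first load stay invisible until the cache file expires or is deleted.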
e
Thank you, will check this out. ☺️
Does it come with kedro or do I need to install it manually? I'm asking because I'm having a problem:
SQLQueryDataSet.__init__() got an unexpected keyword argument 'layer'.
Dataset '_cached' must only contain arguments valid for the constructor of 'kedro_datasets.pandas.sql_dataset.SQLQueryDataSet'..
Or can only CSVDataSet be cached?
m
The error is self-explanatory… Don't use layer there; the SQLQueryDataSet constructor doesn't accept it.
e
oh yeah, thanks