# questions
i
Hi all! I have a question about Kedro's data catalogue. I want to be able to overwrite the `WHERE` clause in Kedro dataset queries. Let's say I have the following catalogue entry:
some_table.raw:
    type: pandas.GBQQueryDataset
    sql: SELECT * FROM database.table WHERE date >= {start_date}
Then, in the code, I want to overwrite it with something like:
catalog.load("some_table.raw", query={"start_date": "2024-01-01"})
I know it's impossible since the `load` method supports no arguments except for the dataset name. But perhaps there are some workarounds I don't know about. I would really appreciate your help here! I know there's a suggestion to use `TemplatedConfigLoader`, but that will only allow me to load a dataset with one value of `start_date`. But I want to be able to pass different values in different parts of the code.
a
`TemplatedConfigLoader` has been removed in Kedro 0.19.x, so maybe using the `runtime_params:` resolver with `OmegaConfigLoader` will be useful - https://docs.kedro.org/en/stable/configuration/advanced_configuration.html#how-to-override-configuration-with-[…]rameters-with-the-omegaconfigloader
i
Hopefully soon we will have the Ibis dataset, which is meant to address exactly this use case. It's compatible with BigQuery. @Deepyaman Datta @Juan Luis
n
It should be possible. Are you doing some interactive stuff from a notebook? It wouldn't be through `catalog`, though, because it is always the `config_loader` that controls the dataset definitions. Otherwise you can always create the dataset directly in code without going through the YAML.
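A minimal sketch of the "create the dataset directly in code" route, not a confirmed recipe. `render_query` and `load_since` are hypothetical helper names, and the template adds quotes around `{start_date}`, which the original catalogue entry does not have. It assumes `kedro-datasets` is installed with the pandas/BigQuery extras and that BigQuery credentials and project are picked up from the environment.

```python
from datetime import date

# Same query as the catalogue entry in the question, with the placeholder quoted
# so that the rendered value is a SQL string literal (an assumption of this sketch).
SQL_TEMPLATE = "SELECT * FROM database.table WHERE date >= '{start_date}'"


def render_query(start_date: str) -> str:
    """Validate the date and fill it into the SQL template."""
    # date.fromisoformat raises ValueError for anything that is not a valid
    # ISO date, which also keeps arbitrary strings out of the SQL text.
    parsed = date.fromisoformat(start_date)
    return SQL_TEMPLATE.format(start_date=parsed.isoformat())


def load_since(start_date: str):
    """Hypothetical helper: build the dataset in code and load it."""
    # Imported lazily so that render_query works without kedro-datasets installed.
    from kedro_datasets.pandas import GBQQueryDataset

    dataset = GBQQueryDataset(sql=render_query(start_date))
    return dataset.load()
```

With this, different parts of the code can call `load_since("2024-01-01")` or `load_since("2024-06-01")` independently, at the cost of bypassing the catalogue for this one dataset.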
i
I just noticed you might be talking about interactive development. This is a bit of a workaround, but you might be able to leverage the `params` argument in `pd.read_sql_query`: https://pandas.pydata.org/docs/reference/api/pandas.read_sql_query.html We have some workarounds where we modify the catalog dataset definition before calling the load, but it's quite dirty, and I would like to eventually move to using Ibis instead. The `params` approach is database-engine dependent, though: https://peps.python.org/pep-0249/#paramstyle
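To illustrate the `params` idea, here is a small sketch against an in-memory SQLite database rather than BigQuery. The `events` table, its rows, and the `load_since` helper are made up for the example. SQLite accepts the `:name` placeholder style; other drivers may expect `%s`, `%(name)s`, or `?` instead, which is the engine dependence mentioned above.

```python
import sqlite3

import pandas as pd

# Throwaway in-memory database with a few illustrative rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (date TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [("2023-12-31", 1), ("2024-01-01", 2), ("2024-02-15", 3)],
)
conn.commit()

# The placeholder is bound by the driver, so the SQL text itself never changes.
QUERY = "SELECT * FROM events WHERE date >= :start_date ORDER BY date"


def load_since(start_date: str) -> pd.DataFrame:
    """Run the same query with a different start_date on each call."""
    return pd.read_sql_query(QUERY, conn, params={"start_date": start_date})


recent = load_since("2024-01-01")
later = load_since("2024-02-01")
```

The same query string is reused with different values in different places, which is the behaviour asked for in the question, just outside the catalogue.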
n
@Iñigo Hidalgo I may have missed something, but why is this related to the Ibis dataset? I thought this was more about parametrising the SQL query.
i
Ibis would allow for creating these filters through code, which decouples the filtering logic from the catalog dataset logic. https://github.com/kedro-org/kedro/issues/2374
(Huge meandering thread, but that thread marked the start of integrating ibis into kedro, and my motivation for doing so)
i
Thank you all very much for your help!🙏 Now I have plenty of stuff to read and try out 😄