Ivan Konovalov
04/05/2024, 2:32 PM
`WHERE` clause in Kedro dataset queries. Let's say I have the following catalogue entry:
some_table.raw:
  type: pandas.GBQQueryDataset
  sql: SELECT * FROM database.table WHERE date >= {start_date}
Then, in the code, I want to overwrite it with something like:
catalog.load("some_table.raw", query={"start_date": "2024-01-01"})
I know it's impossible since the `load` method supports no arguments except for the dataset name. But perhaps there are some workarounds I don't know about.
I would really appreciate your help here!
I know there's a suggestion to use `TemplatedConfigLoader`, but that will only allow me to load a dataset with one value of `start_date`. But I want to be able to pass different values in different parts of the code.
Ankita Katiyar
04/05/2024, 2:45 PM
`TemplatedConfigLoader` has been removed in Kedro 0.19.x; maybe using the `runtime_params:` resolver with `OmegaConfigLoader` will be useful - https://docs.kedro.org/en/stable/configuration/advanced_configuration.html#how-to-override-configuration-with-[…]rameters-with-the-omegaconfigloader
Ankita Katiyar
04/05/2024, 2:46 PM
Iñigo Hidalgo
04/05/2024, 2:58 PM
Nok Lam Chan
04/05/2024, 3:05 PM
`catalog` though, because it is always the `config_loader` that controls the dataset definitions. Otherwise you can always create the dataset directly in code without going through the YAML.
Iñigo Hidalgo
04/05/2024, 3:10 PM
`params` argument in `pd.read_sql_query`: https://pandas.pydata.org/docs/reference/api/pandas.read_sql_query.html
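A minimal sketch of what bound parameters look like, using the standard-library `sqlite3` driver rather than pandas or BigQuery. The `events` table, its rows and the `load_since` helper are made up for illustration. `pd.read_sql_query` hands its `params` argument to the driver's execute call, so the placeholder syntax there also depends on the driver in use.

```python
import sqlite3

# Hypothetical in-memory table standing in for `database.table`.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)",
    [(1, "2023-12-31"), (2, "2024-01-01"), (3, "2024-02-15")],
)

# sqlite3 accepts the "named" paramstyle (:name); other drivers may expect
# a different placeholder syntax for the same query.
QUERY = "SELECT id, date FROM events WHERE date >= :start_date ORDER BY id"


def load_since(start_date: str) -> list[tuple[int, str]]:
    """Run the same query text with a different bound value on each call."""
    return conn.execute(QUERY, {"start_date": start_date}).fetchall()


rows_january = load_since("2024-01-01")
rows_february = load_since("2024-02-01")
```

The query text stays fixed and only the bound value changes between calls, which is the behaviour asked for at the top of the thread.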
We have some workarounds where we modify the catalog dataset definition before calling the load, but it's quite dirty, and I would like to eventually move to using Ibis instead.
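One possible shape for a workaround of this kind, and for building the dataset in code instead of YAML, is to keep the query as a template and render it before each load. This is only a sketch, not necessarily the workaround referred to above: `SQL_TEMPLATE` and `render_query` are hypothetical names, and the hand-off to `GBQQueryDataset` is only indicated in a comment, on the assumption that its constructor accepts a `sql` string.

```python
from datetime import date

# Same query text as the catalog entry, kept as a Python template.
SQL_TEMPLATE = "SELECT * FROM database.table WHERE date >= {start_date}"


def render_query(template: str, start_date: str) -> str:
    """Fill the placeholder with a validated, quoted ISO date."""
    # fromisoformat raises ValueError on anything that is not a date,
    # which keeps arbitrary SQL out of the formatted string.
    parsed = date.fromisoformat(start_date)
    return template.format(start_date=f"'{parsed.isoformat()}'")


sql = render_query(SQL_TEMPLATE, "2024-01-01")

# Assumption: the rendered string would then go to a dataset built in code,
# e.g. GBQQueryDataset(sql=sql, ...) from kedro_datasets.pandas, followed by
# .load(). That part is not executed here.
```

Validating the value before formatting matters because plain string substitution, unlike bound parameters, offers no protection against malformed or malicious input.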
Passing `params` is database-engine dependent though: https://peps.python.org/pep-0249/#paramstyle
Nok Lam Chan
04/05/2024, 3:29 PM
Iñigo Hidalgo
04/05/2024, 3:30 PM
Iñigo Hidalgo
04/05/2024, 3:31 PM
Ivan Konovalov
04/08/2024, 8:22 AM