# questions
p
Hi, we're using Kedro to run pipelines that start by loading data from a database via the catalog. To stay flexible, we use global variables in the queries to load parts of the data 'dynamically'. The Kedro pipeline is created and run from another Python script, which lets us expose it through a FastAPI interface. Kedro loads these globals via the CONFIG_LOADER_CLASS in settings.py. However, we noticed that on the first run of `KedroSession.create()` the global variables are loaded correctly, but on a second run they are not: the whole settings.py is not executed. If I'm not mistaken, the global variables are treated as 'Project Settings' and are loaded once in the lifetime of a project. When creating a new session, these are not reloaded. Is there a way to reset/reload the global variables when creating a second session?
f
This is a bit of a sideways answer, but you could use `runtime_params` (from the `OmegaConfigLoader`) instead of global variables.
y
This is exactly the kind of problem kedro-boot is solving, feel free to try and give feedback: https://github.com/takikadiri/kedro-boot
f
You can pass those to `KedroSession.create` as `extra_params`.
p
@FlorianGD: but then you can't load them as globals, can you? Only as parameters, which you can't use as variables in your catalog.
f
Yes, but if you want to load parts and those parts change at each run, it looks to me more like runtime params than globals (I may be missing something though). We manage to build some queries with a mix of env variables and runtime params, depending on the use case.
p
Nice. How do I get a query like
SELECT * FROM Table WHERE col = ${variable}
using parameters and not globals?
f
You can use the `OmegaConfigLoader` and the built-in resolver `runtime_params`: https://docs.kedro.org/en/stable/configuration/advanced_configuration.html#how-to-override-configuration-with-[…]rameters-with-the-omegaconfigloader So the query would look like
SELECT * FROM Table WHERE col = ${runtime_params:variable}
And then
kedro run --params variable:my_col
p
Can this also be done without the CLI, starting Kedro directly from within Python?
@app.get("...")
async def run_project(var: str):
    params = GetFeatures(var)
    with KedroSession.create(
        package_name=bootstrap_project(Path.cwd()).package_name,
        env=env,
        extra_params=params,
    ) as session:
        session.run(pipeline_name=pipeline)
So we pass parameters to the project without any issue, but the globals are picked up from there and added to the project only once. The second time this runs, that part of the Kedro code is not executed.
f
Yes, add it to `extra_params`:
params = {**params, "variable": "my_col"}
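A minimal sketch of how this could look when the session is started from Python, so that every call gets its own runtime value. It assumes a Kedro 0.19.x project using `OmegaConfigLoader`; the names `build_extra_params` and `run_once` are hypothetical helpers, not part of Kedro, and the key `variable` has to match the `${runtime_params:variable}` placeholder in the catalog.

```python
from pathlib import Path
from typing import Any, Optional


def build_extra_params(base: dict[str, Any], variable: str) -> dict[str, Any]:
    """Return a copy of the base parameters with the runtime value added."""
    return {**base, "variable": variable}


def run_once(
    project_path: Path,
    variable: str,
    base_params: Optional[dict[str, Any]] = None,
    env: Optional[str] = None,
    pipeline_name: Optional[str] = None,
) -> Any:
    """Create a fresh session for one run (hypothetical helper)."""
    # Imported lazily so build_extra_params can be used without Kedro installed.
    from kedro.framework.session import KedroSession
    from kedro.framework.startup import bootstrap_project

    bootstrap_project(project_path)
    extra_params = build_extra_params(base_params or {}, variable)
    # Assumption: extra_params are handed to the config loader as runtime
    # parameters each time a session is created.
    with KedroSession.create(
        project_path=project_path,
        env=env,
        extra_params=extra_params,
    ) as session:
        return session.run(pipeline_name=pipeline_name)
```

Calling `run_once(Path.cwd(), "my_col")` and then `run_once(Path.cwd(), "other_col")` would create two separate sessions, each with its own value for `variable`.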
p
Then I get the following error straight away:
ValueError: Failed to format pattern '${variable}': no config value found, no  default provided
f
The syntax is `${runtime_params:variable}`; otherwise it would look for an entry named `variable` in the YAML file. And you need to change `settings.py` to use `OmegaConfigLoader` if you use Kedro pre-0.19.
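For reference, the `settings.py` change mentioned above could look roughly like this. It is a sketch that assumes a Kedro version that already ships `OmegaConfigLoader` in `kedro.config`; any other settings in the project's file stay as they are.

```python
# settings.py (fragment)
from kedro.config import OmegaConfigLoader

# Tell Kedro to use the OmegaConf-based loader, which provides the
# runtime_params resolver used in ${runtime_params:variable}.
CONFIG_LOADER_CLASS = OmegaConfigLoader
```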
n
Are you suggesting doing `KedroSession.create()` twice, resulting in different `${globals}`? `settings` are indeed loaded once at startup, and `bootstrap_project` will reload them. However, I don't think this is related to the problem you are describing, which seems to be related to the config loader? https://docs.kedro.org/en/latest/kedro_project_setup/session.html#bootstrap-project-and-configure-project
p
Nok, I think you're indeed on the right path. I want to create two (or more) consecutive sessions, each using different global parameters. We got a working solution by reimporting `from kedro.framework.startup import bootstrap_project`, calling `session.load_context()` after creating and before running a session, and implementing an `after_context_created(self)` hook that takes the variables and adds them to the project settings (`from kedro.framework.project import settings`).