Question about the newly released `kedro-boot`: On...
# plugins-integrations
h
Question about the newly released `kedro-boot`: One of my workflows relies on serving the result of a pipeline given an input parameter to a dataset. Specifically, the API gets a list of ids and uses them in a database query for filtering. The query is executed in a dataset, but the ids are only known at runtime. I could add a dataset, set the ids to some preset, and then overwrite them using `OmegaConfigLoader` or `TemplatedConfigLoader` and pass `extra_params` to the `KedroSession`. However, since `kedro-boot` allows one to override pipeline inputs dynamically, maybe it's also possible to overwrite arguments passed to datasets? I'm specifically asking because `kedro-boot` seems to be the answer to using `kedro` with something like `FastAPI`, but `FastAPI` is routinely used for CRUD on a database, and kedro has this nice mechanism for handling credentials and such. So actually passing the credentials into a node to access a database is quite ugly.
👍 1
m
@Takieddine Kadiri / @Yolan Honoré-Rougé
👍 1
h
I do see references to catalog templates/views in the tests and source code, so maybe that could be a route to achieve this?
👍 1
t
Hello Hugo! `kedro-boot` introduces a new kind of parameters called `template params` that are resolved at each run iteration to cover exactly your use case. You can leverage `pandas.SQLQueryDataSet`. Let's say you have this dataset:
your_dataset:
  type: pandas.SQLQueryDataSet
  sql: SELECT * from TABLE WHERE SOME_COLUMN=[[ column_value ]]
Note that `column_value` is now a `template param` that will be resolved at iteration time. `template params` are defined with the `[[ ]]` Jinja template syntax. Then, in your FastAPI code, you can render the `column_value` template param with a FastAPI path or query parameter. You'll have something like:
@app.get("/your_endpoint/{your_parameter}")
def your_endpoint(your_parameter):
    # Render the `column_value` template param with the path parameter
    return kedro_boot_session.run(name="your_pipeline_view", template_params={"column_value": your_parameter})
You can adapt this to your exact case. You can also make your SQL dataset more secure by using parametrized queries instead of directly injecting the `column_value` received from the web. This lets you leverage kedro's amazing capabilities for handling backend IO, business logic, and even the application lifecycle (if you opt for the embedded mode), while a full FastAPI app handles the controller and serving part. Hope this helps!
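To illustrate the parametrized-query point for the "list of ids" case from the original question, here is a minimal sketch using the standard-library `sqlite3` module as a stand-in for the real database. The `items` table, its columns, and the `fetch_rows_by_ids` helper are hypothetical names made up for this example. It only shows the general technique of binding values instead of formatting them into the SQL string; whether and how the same binding can be wired through the dataset definition is not covered in this thread.

```python
import sqlite3


def fetch_rows_by_ids(conn, ids):
    """Return rows whose id is in `ids`, binding the values instead of
    formatting them into the SQL text (hypothetical helper)."""
    if not ids:
        return []
    # One "?" placeholder per id; the values themselves never touch the SQL string.
    placeholders = ", ".join("?" for _ in ids)
    query = f"SELECT id, name FROM items WHERE id IN ({placeholders}) ORDER BY id"
    return conn.execute(query, list(ids)).fetchall()


# In-memory database with a made-up table, just for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])
conn.commit()

rows = fetch_rows_by_ids(conn, [1, 3])
# A malicious "id" is treated as a plain value, not as SQL.
injected = fetch_rows_by_ids(conn, ["1 OR 1=1"])
```

With string injection, the second call would have matched every row; with bound parameters it matches none.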
👍 2
h
Nice, thanks! One follow-up question: let's say the result of a node is needed as a `template_params` input. Do you need to make sure those are run as separate pipelines, or does `template_params` also allow one to connect outputs to template params?
t
If I understand your question correctly, your HTTP request does not directly contain the parameter needed to render the template param; you want to run a node/pipeline and use its output as the parameter to render a template param of an input dataset of another pipeline? If that's the case, you can use two pipeline views in your kedro app. Your FastAPI app:
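@app.get("/your_endpoint/{your_parameter}")
def your_endpoint(your_parameter):
    # First view: compute the value needed by the second view's template param
    outputs_data = kedro_boot_session.run(name="your_first_pipeline_view")
    # Second view: render `column_value` with the first view's output
    return kedro_boot_session.run(name="your_second_pipeline_view", template_params={"column_value": outputs_data})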
Your pipeline_registry.py:
from kedro.pipeline import node
from kedro.pipeline.modular_pipeline import pipeline
from kedro_boot.pipeline import app_pipeline


def register_pipelines():
    # `your_function` is your own node function, imported from your project
    your_first_pipeline = pipeline([node(your_function, inputs="your_inputs", outputs="your_output")])
    your_second_pipeline = pipeline([...])  # placeholder: nodes of your second pipeline

    # Expose `your_output` so the FastAPI app gets it back from `run`
    app_first_pipeline = app_pipeline(
        your_first_pipeline,
        name="your_first_pipeline_view",
        outputs="your_output",
    )

    app_second_pipeline = app_pipeline(
        your_second_pipeline,
        name="your_second_pipeline_view",
    )

    return {"__default__": app_first_pipeline + app_second_pipeline}
Correct me if I misunderstood the question; otherwise, let me know if it works for you.
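For readers who want to see the two-view flow run without a kedro project, here is a self-contained sketch. It is an illustration only: `StubSession` is a hypothetical stand-in for `kedro_boot_session`, not the kedro-boot API, and the regex-based `render_template` merely mimics resolving a `[[ name ]]` placeholder (the thread says kedro-boot uses Jinja for this). The value `42` returned by the first view is made up.

```python
import re

# Matches placeholders such as [[ column_value ]]
TEMPLATE_PATTERN = re.compile(r"\[\[\s*(\w+)\s*\]\]")


def render_template(text, template_params):
    """Replace each [[ name ]] placeholder with its value from template_params."""
    def substitute(match):
        # Raises KeyError if a placeholder has no matching template param.
        return str(template_params[match.group(1)])
    return TEMPLATE_PATTERN.sub(substitute, text)


class StubSession:
    """Hypothetical stand-in for kedro_boot_session (NOT the real API)."""

    def __init__(self, views):
        self._views = views  # view name -> callable(template_params) -> result

    def run(self, name, template_params=None):
        return self._views[name](template_params or {})


SQL = "SELECT * from TABLE WHERE SOME_COLUMN=[[ column_value ]]"

session = StubSession({
    # First view: pretend a node computed the value 42.
    "your_first_pipeline_view": lambda params: 42,
    # Second view: pretend the dataset query is rendered at run time.
    "your_second_pipeline_view": lambda params: render_template(SQL, params),
})


def your_endpoint():
    outputs_data = session.run(name="your_first_pipeline_view")
    return session.run(
        name="your_second_pipeline_view",
        template_params={"column_value": outputs_data},
    )


result = your_endpoint()
```

The point of the sketch is the ordering: the second run can only be rendered once the first run has returned, which is why the two views are called one after the other in the endpoint.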
h
Yes, but it was a hypothetical question, since in my use case the request does contain the parameters. But in the future we might need to tackle that issue. Your answer is in line with my expectations, thanks!
👍 1