Hugo Evers — 11/08/2023, 3:13 PM
kedro-boot allows one to override pipeline inputs dynamically; is it perhaps also possible to override arguments passed to datasets?
I'm asking specifically because kedro-boot seems to be the answer to using kedro with something like FastAPI, but FastAPI is routinely used for CRUD on a database, and kedro has a nice mechanism for handling credentials and such. Actually passing the credentials into a node to access a database is quite ugly.

Merel — 11/08/2023, 3:19 PM

Hugo Evers — 11/08/2023, 3:48 PM

Takieddine Kadiri — 11/08/2023, 4:16 PM
kedro-boot introduces a new kind of parameter, called template params, that are resolved at each run iteration to cover exactly your use case.
You can leverage SQLQueryDataSet. Let's say you have this dataset:

your_dataset:
  type: pandas.SQLQueryDataSet
  sql: SELECT * FROM TABLE WHERE SOME_COLUMN=[[ column_value ]]

Note that column_value is now a template param that will be resolved at iteration time. Template params are defined with [[ ]] Jinja-style delimiters.
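To make the [[ ]] resolution concrete, here is a quick self-contained sketch of how such a template param could be rendered with Jinja2 configured with custom delimiters. This is illustrative only, not kedro-boot's internal code; the SQL string and parameter name come from the example above.

```python
# Sketch: resolving a [[ ]] template param with Jinja2 custom delimiters
# (illustrative only, not kedro-boot's actual implementation).
from jinja2 import Environment

env = Environment(variable_start_string="[[", variable_end_string="]]")
sql_template = "SELECT * FROM TABLE WHERE SOME_COLUMN=[[ column_value ]]"

# At "iteration time", the template param is rendered into the query
rendered = env.from_string(sql_template).render(column_value="42")
print(rendered)  # SELECT * FROM TABLE WHERE SOME_COLUMN=42
```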
Then in your FastAPI code, you can render the column_value template param with a FastAPI path or query parameter. You'll have something like:

@app.get("/your_endpoint/{your_parameter}")
def your_endpoint(your_parameter):
    return kedro_boot_session.run(name="<your_pipeline_view>", template_params={"column_value": your_parameter})
You can adapt this to your exact case. You can also make your SQL dataset more secure by using parametrized queries instead of injecting column_value directly from the web.
This lets you leverage kedro's capabilities for handling backend IO, business logic, and even the application lifecycle (if you opt for the embedded mode), while using a full FastAPI app that handles the controller and serving parts.
Hope this helps :simple_smile:

Hugo Evers — 11/09/2023, 2:15 PM

Takieddine Kadiri — 11/09/2023, 2:56 PM
@app.get("/your_endpoint/{your_parameter}")
def your_endpoint(your_parameter):
    outputs_data = kedro_boot_session.run(name="your_first_pipeline_view")
    return kedro_boot_session.run(name="your_second_pipeline_view", template_params={"column_value": outputs_data})
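On the parametrized-query suggestion above: instead of rendering untrusted web input into the SQL string, bind it as a driver-level query parameter so the value is escaped for you. A minimal sketch using stdlib sqlite3 as a stand-in for the project's database (table and column names are made up):

```python
# Sketch of a parametrized query; sqlite3 stands in for the real database,
# and the table/column names are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (some_column TEXT, payload TEXT)")
conn.execute("INSERT INTO some_table VALUES ('a', 'row-a'), ('b', 'row-b')")

user_input = "a"  # e.g. a FastAPI path or query parameter
# The ? placeholder lets the driver escape the value, preventing SQL injection
rows = conn.execute(
    "SELECT payload FROM some_table WHERE some_column = ?", (user_input,)
).fetchall()
print(rows)  # [('row-a',)]
```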
Your pipeline_registry.py:

from kedro.pipeline import node
from kedro.pipeline.modular_pipeline import pipeline
from kedro_boot.pipeline import app_pipeline

def register_pipelines():
    your_first_pipeline = pipeline([node(your_function, inputs="your_inputs", outputs="your_output")])
    your_second_pipeline = pipeline(...)
    app_first_pipeline = app_pipeline(
        your_first_pipeline,
        name="your_first_pipeline_view",
        outputs="your_output",
    )
    app_second_pipeline = app_pipeline(
        your_second_pipeline,
        name="your_second_pipeline_view",
    )
    return {"__default__": app_first_pipeline + app_second_pipeline}
Correct me if I misunderstood the question; otherwise, let me know if it works for you.

Hugo Evers — 11/09/2023, 3:05 PM