Hugo Evers
11/08/2023, 3:13 PM
kedro-boot allows one to override pipeline inputs dynamically. Maybe it's also possible to overwrite arguments passed to datasets?
I'm specifically asking because kedro-boot seems to be the answer to using Kedro with something like FastAPI. But FastAPI is routinely used for CRUD on a database, and Kedro has this nice mechanism for handling credentials and such, so actually passing the credentials into a node to access a database is quite ugly.
Merel
11/08/2023, 3:19 PM
Hugo Evers
11/08/2023, 3:48 PM
Takieddine Kadiri
11/08/2023, 4:16 PM
kedro-boot introduces a new kind of parameter called template params, which are resolved at each run iteration to cover exactly your use case.
You can leverage SQLQueryDataSet. Let's say you have this dataset:
your_dataset:
  type: pandas.SQLQueryDataSet
  sql: SELECT * FROM TABLE WHERE SOME_COLUMN = [[ column_value ]]
Note that column_value is now a template param that will be resolved at iteration time. Template params are declared with [[ ]] Jinja-style delimiters.
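To tie this back to the credentials question: this dataset can pick up its database connection through Kedro's standard credentials mechanism, so nothing sensitive has to pass through a node. A minimal sketch, where the db_credentials name and the connection string are just placeholders:

# conf/base/catalog.yml
your_dataset:
  type: pandas.SQLQueryDataSet
  credentials: db_credentials
  sql: SELECT * FROM TABLE WHERE SOME_COLUMN = [[ column_value ]]

# conf/local/credentials.yml (kept out of version control)
db_credentials:
  con: postgresql://user:password@localhost:5432/your_database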
Then in your FastAPI code, you can render the column_value template param from a FastAPI path or query parameter. You'll have something like:
@app.get("/your_endpoint/{your_parameter}")
def your_endpoint(your_parameter):
    return kedro_boot_session.run(name="your_pipeline_view", template_params={"column_value": your_parameter})
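Calling that endpoint is then a plain HTTP request; for example (a sketch assuming the app is served locally on port 8000):

import requests

# the path parameter becomes the column_value template param server-side
response = requests.get("http://localhost:8000/your_endpoint/42")
print(response.json())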
You can adapt this to your exact case. You can also make your SQL dataset more secure by using parameterized queries instead of injecting the column_value directly from the web; a sketch follows below.
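One possible shape for that, as a sketch only: it assumes kedro-boot also resolves [[ ]] template params inside load_args, and the %(...)s placeholder style is driver-specific (this one matches psycopg2); check the kedro-boot and pandas.read_sql_query docs for your setup.

# conf/base/catalog.yml
your_dataset:
  type: pandas.SQLQueryDataSet
  credentials: db_credentials
  # the driver substitutes the value safely instead of it being pasted into the SQL string
  sql: SELECT * FROM TABLE WHERE SOME_COLUMN = %(column_value)s
  load_args:
    params:
      column_value: "[[ column_value ]]"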
This lets you leverage Kedro's amazing capabilities for handling backend IO, business logic, and even the application lifecycle (if you opt for the embedded mode), while using a full FastAPI app that handles the controller and the serving part.
Hope this helps 🙂
Hugo Evers
11/09/2023, 2:15 PM
Takieddine Kadiri
11/09/2023, 2:56 PM
@app.get("/your_endpoint/{your_parameter}")
def your_endpoint(your_parameter):
    outputs_data = kedro_boot_session.run(name="your_first_pipeline_view")
    return kedro_boot_session.run(name="your_second_pipeline_view", template_params={"column_value": outputs_data})
Your pipeline_registry.py:
from kedro.pipeline import node
from kedro.pipeline.modular_pipeline import pipeline
from kedro_boot.pipeline import app_pipeline


def register_pipelines():
    your_first_pipeline = pipeline([node(your_function, inputs="your_inputs", outputs="your_output")])
    your_second_pipeline = pipeline(.....)

    app_first_pipeline = app_pipeline(
        your_first_pipeline,
        name="your_first_pipeline_view",
        outputs="your_output",
    )
    app_second_pipeline = app_pipeline(
        your_second_pipeline,
        name="your_second_pipeline_view",
    )
    return {"__default__": app_first_pipeline + app_second_pipeline}
Correct me if I misunderstood the question; otherwise, let me know if it works for you.
Hugo Evers
11/09/2023, 3:05 PM