# questions
Hello, everyone! I’m working with three pipelines in Kedro: `data_processing`, `model_training`, and `model_validation`. The `model_validation` pipeline uses real-world data, loads the exported model from `model_training`, and then predicts and evaluates the model’s performance on this data. My question is about using kedro-airflow to create an Airflow DAG. Since it’s not necessary to retrain the model every time the task is scheduled (only preprocessing the data and predicting values are needed), how can I select which pipelines go into the DAG? Is it possible to manually declare the pipelines instead of using `find_pipelines()` in order to export them to a DAG, without having to create a separate project just for this prediction implementation? I’d really appreciate any guidance or suggestions. Thank you!
I'm not very familiar with the plugin, but a cursory look shows that you can specify `pipeline_name`. Then, you can add your own named pipeline (see https://docs.kedro.org/en/stable/nodes_and_pipelines/pipeline_registry.html). You can do this in conjunction with, or instead of, `find_pipelines()`. (On the off chance you can't figure out how to pass `pipeline_name`, you can also just register `data_processing + model_validation` as your `__default__` pipeline, but I'm 99.9% sure you can pass it.)
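To make the suggestion above concrete, here is a minimal `pipeline_registry.py` sketch. The pipeline name `"inference"` and the keys `"data_processing"` and `"model_validation"` are assumptions for illustration; they should match the names Kedro discovers from your project's pipeline folders.

```python
"""Hypothetical pipeline_registry.py sketch (names are assumptions)."""
from kedro.framework.project import find_pipelines
from kedro.pipeline import Pipeline


def register_pipelines() -> dict[str, Pipeline]:
    # Keep the auto-discovered pipelines so nothing else breaks.
    pipelines = find_pipelines()

    # Register an additional named pipeline that skips model training:
    # Kedro pipelines compose with "+", so this runs preprocessing
    # followed by prediction/evaluation only.
    pipelines["inference"] = (
        pipelines["data_processing"] + pipelines["model_validation"]
    )

    # Optionally, make it the default so a plain `kedro run` (and a DAG
    # generated without an explicit pipeline name) uses it:
    # pipelines["__default__"] = pipelines["inference"]
    return pipelines
```

With the named pipeline registered, you can point kedro-airflow at it when generating the DAG; check the plugin's documentation for the exact way to pass the pipeline name to its `create` command.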