# questions
Hello, everyone! I’m working with three pipelines in Kedro: `data_processing`, `model_training`, and `model_validation`. The `model_validation` pipeline uses real-world data, loads the exported model from `model_training`, and then predicts and evaluates the model’s performance on this data. My question is about using kedro-airflow to create an Airflow DAG. Since it’s not necessary to retrain the model every time the task is scheduled (only preprocessing the data and predicting values are needed), how can I select which pipelines go into the DAG? Is it possible to manually declare the pipelines instead of using `find_pipelines()` in order to export them to a DAG, without having to create a separate project just for this prediction implementation? I’d really appreciate any guidance or suggestions. Thank you!
I'm not very familiar with the plugin, but a cursory look shows that you can specify `pipeline_name`. Then, you can add your own named pipeline (see https://docs.kedro.org/en/stable/nodes_and_pipelines/pipeline_registry.html). You can do this in conjunction with, or instead of, `find_pipelines()`. (On the off chance you can't figure out how to pass `pipeline_name`, you can also just register `data_processing + model_validation` as your `__default__` pipeline, but I'm 99.9% sure you can pass it.)
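To make the suggestion above concrete, here is a minimal `pipeline_registry.py` sketch. The pipeline name `"inference"` and the keys `"data_processing"` and `"model_validation"` are assumptions for illustration; they should match the names Kedro discovers from your project's pipeline folders.

```python
"""Hypothetical pipeline_registry.py sketch (names are assumptions)."""
from kedro.framework.project import find_pipelines
from kedro.pipeline import Pipeline


def register_pipelines() -> dict[str, Pipeline]:
    # Keep the auto-discovered pipelines so nothing else breaks.
    pipelines = find_pipelines()

    # Register an additional named pipeline that skips model training:
    # Kedro pipelines compose with "+", so this runs preprocessing
    # followed by prediction/evaluation only.
    pipelines["inference"] = (
        pipelines["data_processing"] + pipelines["model_validation"]
    )

    # Optionally, make it the default so a plain `kedro run` (and a DAG
    # generated without an explicit pipeline name) uses it:
    # pipelines["__default__"] = pipelines["inference"]
    return pipelines
```

With the named pipeline registered, you can point kedro-airflow at it when generating the DAG; check the plugin's documentation for the exact way to pass the pipeline name to its `create` command.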