# questions
f
Hi everyone. I'm not sure if this is the right place to ask, but does anybody have experience with using Airflow vs Prefect `>= 2.0.0` to run Kedro pipelines? Currently, only Prefect 1.x is tested to work with `0.18.x` according to the docs, which is making us hesitate a bit on that end. We're currently evaluating both as a higher-level orchestration platform for our Kedro pipelines, and both seem great for generic workflows, so some community feedback would be much appreciated.
K 1
👍 2
The "overhead" to get started with Prefect as a wrapper seems quite a bit higher than with Airflow (in a Kedro context), but the linked docs above do a pretty good job of giving an example use case. Most of our pipelines right now start Dask clusters running on EC2 instances. We're still figuring out the best balance between running multiple pipelines one after another via orchestrated Airflow/Prefect nodes and running a single "big" pipeline which in turn calls each individual pipeline, but I imagine it'll end up being a hybrid approach at some level...
d
TBH I wouldn't look at the existing Kedro docs for Prefect deployment; Prefect 2.x should provide a much simpler way. How do you use Dask? Both the Kedro Dask deployment guide `DaskRunner` and Prefect's `DaskRunner` would be based on the idea of mapping nodes to tasks. If so, I think the Prefect 2.x Python API provides a pretty straightforward way to construct that when mapping a Kedro pipeline (but there's no deployment guide for it).
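To make the "mapping nodes to tasks" idea concrete, here is a rough, dependency-free sketch. It does not use the Kedro or Prefect APIs: `Node` and `run_nodes_as_tasks` are hypothetical names, and the standard library's `ThreadPoolExecutor` stands in for whatever task runner the orchestrator provides. Each node becomes one submitted task, and the dependencies are inferred from matching dataset names.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from graphlib import TopologicalSorter
from typing import Callable


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a pipeline node: a function plus named datasets."""
    name: str
    func: Callable
    inputs: tuple
    outputs: tuple


def run_nodes_as_tasks(nodes, initial_data, max_workers=4):
    """Submit one task per node; each task waits on the tasks producing its inputs."""
    by_name = {n.name: n for n in nodes}
    produced_by = {out: n.name for n in nodes for out in n.outputs}
    # Map each node name to the set of upstream node names it depends on.
    graph = {
        n.name: {produced_by[d] for d in n.inputs if d in produced_by}
        for n in nodes
    }
    futures = {}

    def load(dataset):
        # Datasets produced by another node come from that node's future;
        # everything else is read from the initial data.
        if dataset in produced_by:
            upstream = by_name[produced_by[dataset]]
            result = futures[upstream.name].result()
            return result[upstream.outputs.index(dataset)]
        return initial_data[dataset]

    def execute(node):
        result = node.func(*[load(d) for d in node.inputs])
        # Normalise to a tuple so outputs can be looked up by position.
        return result if len(node.outputs) > 1 else (result,)

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submitting in topological order means upstream futures always
        # exist (and are dequeued first) before a downstream task asks for them.
        for name in TopologicalSorter(graph).static_order():
            futures[name] = pool.submit(execute, by_name[name])

    data = dict(initial_data)
    for node in nodes:
        data.update(zip(node.outputs, futures[node.name].result()))
    return data


nodes = [
    Node("clean_node", lambda raw: [x for x in raw if x is not None], ("raw",), ("clean",)),
    Node("sum_node", sum, ("clean",), ("total",)),
    Node("count_node", len, ("clean",), ("count",)),
    Node("mean_node", lambda t, c: t / c, ("total", "count"), ("mean",)),
]
results = run_nodes_as_tasks(nodes, {"raw": [1, None, 2, 3]})
print(results["mean"])
```

With a real orchestrator, the `pool.submit` call would presumably be replaced by that tool's own task submission, and `load` by the Kedro data catalog; both substitutions are assumptions, not something covered by an existing deployment guide.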
👍 1
m
In my opinion, Kedro and Prefect overlap to some extent in what they do. So if you ask me, it is more of an OR question than an AND. I would never use Kedro and Prefect together! When I architect an ML system, my first decision is the choice of orchestration framework, and only then do I decide on additional frameworks. In that regard, I have two preferences: either go full Prefect (potentially with Hydra and maybe Kedro's standalone catalog) or Argo Workflows + Kedro (which only works in the context of Kubernetes).
👀 1