#plugins-integrations

Marc Gris

11/10/2023, 7:35 AM
Hi everyone, I’m looking into `kedro-airflow` and have a small concern. Please correct me if I’m wrong. Since individual nodes are turned into Airflow tasks as instances of the `KedroOperator`, whose `execute()` method creates a new session, if one were to use versioned datasets, one would be in for a little surprise: a single Airflow run would produce non-homogeneously timestamped artifacts… Correct? Thanks in advance for your inputs / comments. M.
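To make the concern concrete, here is a minimal sketch that does not use Kedro or Airflow at all. It assumes that each task generates its own save version when its session starts, and that the version is a UTC timestamp with millisecond precision (the format string below is an approximation of what Kedro uses). `run_task` and the node names are hypothetical stand-ins.

```python
import time
from datetime import datetime, timezone

# Assumption: approximates the timestamp format used for dataset versions.
VERSION_FORMAT = "%Y-%m-%dT%H.%M.%S.%fZ"


def generate_version() -> str:
    """Return the current UTC time as a version string, trimmed to milliseconds."""
    ts = datetime.now(tz=timezone.utc).strftime(VERSION_FORMAT)
    return ts[:-4] + ts[-1:]  # drop the last three microsecond digits, keep "Z"


def run_task(node_name: str) -> tuple[str, str]:
    """Hypothetical stand-in for one Airflow task that opens its own session."""
    save_version = generate_version()  # a fresh version per task
    return node_name, save_version


versions = []
for name in ["preprocess", "train", "evaluate"]:  # hypothetical node names
    versions.append(run_task(name))
    time.sleep(0.01)  # tasks never start at exactly the same instant

for name, version in versions:
    print(f"{name}: {version}")
```

Under these assumptions, every task ends up with a different version string, which is the non-homogeneous timestamping described above.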

marrrcin

11/10/2023, 10:07 AM
Actually, it mostly depends on the template you use to create the Airflow DAG. The default one is not so great imho (it leaves a lot of operational work to get it working). You can create a template that generates the version ID and then passes it to all downstream nodes, for example, to make the versioning feature usable. Also, for advanced use cases see https://getindata.com/blog/deploying-kedro-pipelines-gcp-composer-airflow-node-grouping-mlflow by @Artur Dobrogowski
(it does not have to be GCP Composer btw) ☝️ Any Airflow on k8s will do
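As a rough illustration of the "one version ID passed to all downstream nodes" idea, the sketch below derives a single version string from the run's logical date and hands it to every task. It is plain Python, not the kedro-airflow template itself: `version_from_run`, `run_task`, the node names, and the path layout are all hypothetical, and how the version would actually be wired into the DAG template and the operator is left open.

```python
from datetime import datetime, timezone


def version_from_run(logical_date: datetime) -> str:
    """Derive one version ID per DAG run from its logical date (hypothetical helper)."""
    ts = logical_date.astimezone(timezone.utc).strftime("%Y-%m-%dT%H.%M.%S.%fZ")
    return ts[:-4] + ts[-1:]  # trim microseconds to milliseconds, keep "Z"


def run_task(node_name: str, save_version: str) -> str:
    """Hypothetical task: writes its output under the shared version."""
    return f"data/{node_name}/{save_version}/{node_name}.parquet"


# In a real DAG this value would come from the scheduler; here it is hard-coded.
logical_date = datetime(2023, 11, 10, 7, 35, tzinfo=timezone.utc)
run_version = version_from_run(logical_date)

paths = [run_task(name, run_version) for name in ["preprocess", "train", "evaluate"]]
for path in paths:
    print(path)
```

Because the version is computed once per run and only passed along, all artifacts from that run share the same timestamp, and a retried task would reuse it too.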

Marc Gris

11/10/2023, 10:11 AM
Hi @marrrcin Thanks for your answer and suggestions. Will check it out / try it out and get back to you 🙂 Have a nice day. Cheers M.
👍 1

Juan Luis

11/10/2023, 11:26 AM
@Marc Gris one of the people who was contributing the most to `kedro-airflow` is not on this Slack, so for specific questions you might be luckier opening an issue on GitHub and tagging @sbrugman 🙂
👍🏼 1
(on top of what @marrrcin said)

Artur Dobrogowski

11/21/2023, 1:15 PM
@Marc Gris did you find the blog post about kedro-airflow useful?
👍🏼 1

Marc Gris

11/22/2023, 12:29 PM
@Artur Dobrogowski Yes. Very much so. Thx !!! We haven’t yet had the time / chance to experiment with this approach, as we’re still in the dev phase… Will post some news / feedback here once deployed.