Hi everyone, I’m looking into `kedro-airflow` and...
# plugins-integrations
m
Hi everyone, I’m looking into `kedro-airflow` and have a small concern. Please correct me if I’m wrong. Since individual nodes are turned into Airflow tasks as instances of the `KedroOperator`, whose `execute()` method creates a new `session`, anyone using versioned datasets would be in for a little surprise: a single Airflow run would produce non-homogeneously timestamped artifacts… Correct? Thanks in advance for your input / comments. M.
m
Actually, it mostly depends on the template you use to create the Airflow DAG. The default one is not so great imho (it leaves a lot of operational work to get it working). To make the versioning feature usable, you can for example write a template that creates the version ID once and passes it to all downstream nodes. Also, for advanced use cases see https://getindata.com/blog/deploying-kedro-pipelines-gcp-composer-airflow-node-grouping-mlflow by @Artur Dobrogowski
(it does not have to be GCP Composer btw) ☝️ Any Airflow on k8s will do
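To make the "one version ID for the whole run" idea concrete, here is a minimal sketch in plain Python. It deliberately imports neither Kedro nor Airflow: the two `run_task_*` functions are hypothetical stand-ins for what a task does, and the timestamp format is an assumption meant to resemble Kedro's millisecond-style version strings. It only illustrates why per-task sessions give different versions and why a value derived once per DAG run does not.

```python
from datetime import datetime, timezone

# Assumption: a millisecond-precision format resembling Kedro's version strings.
VERSION_FORMAT = "%Y-%m-%dT%H.%M.%S.%fZ"


def to_version(moment: datetime) -> str:
    """Format a datetime as a version string truncated to milliseconds."""
    stamp = moment.astimezone(timezone.utc).strftime(VERSION_FORMAT)
    return stamp[:-4] + stamp[-1:]  # drop the last 3 microsecond digits, keep "Z"


def run_task_with_own_session(node_name: str, started_at: datetime) -> str:
    """Hypothetical task: a fresh session derives the version from its own start time."""
    return to_version(started_at)


def run_task_with_shared_version(node_name: str, run_version: str) -> str:
    """Hypothetical task: the version is handed in by the DAG, so every task agrees."""
    return run_version


# One DAG run: a single logical date, but tasks start at different moments.
logical_date = datetime(2024, 1, 1, 0, 0, 0, tzinfo=timezone.utc)
task_starts = {
    "preprocess": datetime(2024, 1, 1, 0, 0, 5, tzinfo=timezone.utc),
    "train": datetime(2024, 1, 1, 0, 3, 10, 250000, tzinfo=timezone.utc),
}

# Per-task sessions: each artifact gets its own timestamp.
per_task = {
    name: run_task_with_own_session(name, start)
    for name, start in task_starts.items()
}

# Shared version: computed once per run, then passed to every task.
shared_version = to_version(logical_date)
shared = {
    name: run_task_with_shared_version(name, shared_version)
    for name in task_starts
}

print(per_task)
print(shared)
```

In a real DAG the shared value could come from something Airflow already fixes per run, such as the templated `{{ ts_nodash }}` or `{{ run_id }}`. How that value is then wired into the Kedro session depends on the template and the Kedro version, and is not shown here.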
m
Hi @marrrcin Thanks for your answer and suggestions. Will check it out / try it out and get back to you 🙂 Have a nice day. Cheers M.
👍 1
j
@Marc Gris one of the people who contributed the most to `kedro-airflow` is not on this Slack, so for specific questions you might be luckier opening an issue on GitHub and tagging `@sbrugman` 🙂
👍🏼 1
(on top of what @marrrcin said)
a
@Marc Gris did you find the blog post about kedro-airflow useful?
👍🏼 1
m
@Artur Dobrogowski Yes. Very much so. Thx!!! We haven’t yet had the time / chance to experiment with this approach, as we’re still in the dev phase… Will post some news / feedback here once deployed.