Mohamed El Guendouz (01/27/2025, 2:42 PM)

Hall (01/27/2025, 2:42 PM)

Dmitry Sorokin (01/27/2025, 3:38 PM)

Mohamed El Guendouz (01/27/2025, 3:53 PM)
userflow --conf-source=conf-userflow.tar.gz --pipeline=<pipeline_name> --from-nodes=<node_A> --to-nodes=<node_A> --async
userflow --conf-source=conf-userflow.tar.gz --pipeline=<pipeline_name> --from-nodes=<node_B> --to-nodes=<node_B> --async
• If Node A produces a text string as its output (kept only in memory), Kedro fails when Node B is run directly, because the dataset Node B needs only exists after Node A has executed beforehand.
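To make that failure mode concrete, here is a minimal pure-Python simulation (this is not Kedro itself; the node and dataset names are invented for illustration) of why running Node B in isolation fails when Node A's output only ever lived in memory:

```python
# Simulate a two-node pipeline where node_a's output is held only in memory.
# All names (node_a, node_b, "a_output") are illustrative, not from the thread.

def node_a() -> str:
    return "hello from node A"

def node_b(text: str) -> str:
    return text.upper()

# In-memory "catalog": populated only while the process that ran node_a is alive.
catalog: dict[str, str] = {}

def run_only(node_name: str) -> None:
    """Run a single node in isolation, like --from-nodes=X --to-nodes=X."""
    if node_name == "node_a":
        catalog["a_output"] = node_a()
    elif node_name == "node_b":
        if "a_output" not in catalog:
            # Mirrors the error you get when an input dataset was never persisted.
            raise ValueError("Dataset 'a_output' not found: node_a never ran")
        catalog["b_output"] = node_b(catalog["a_output"])

run_only("node_a")   # works; the output stays in this process's memory
catalog.clear()      # a fresh process (e.g. a separate Airflow task) starts empty
try:
    run_only("node_b")  # fails: the in-memory dataset is gone
except ValueError as err:
    print(err)
```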
1. Airflow DAG Generation:
◦ I am generating an Airflow DAG from the Kedro pipeline. I need to ensure that Node B depends on Node A, but this dependency should be inferred directly from Kedro rather than relying on the presence of dummy output datasets (e.g., MemoryDataSet). These outputs can complicate DAG generation and cause failures when nodes are executed in isolation.
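Inferring the Node A → Node B edge without dummy datasets is possible because the dependency is already implied by the declared inputs and outputs: whichever node produces a dataset must run before any node that consumes it. The sketch below re-implements that idea in plain Python (the node specs are invented; in a real project you would read them from the Kedro pipeline object):

```python
# Sketch: infer task dependencies for an Airflow DAG from declared
# inputs/outputs, the same way Kedro resolves node order. Specs are invented.

nodes = {
    "node_a": {"inputs": {"raw_text"}, "outputs": {"a_output"}},
    "node_b": {"inputs": {"a_output"}, "outputs": {"b_output"}},
}

def infer_dependencies(nodes: dict) -> dict:
    """Map each node name to the set of node names that must run before it."""
    producers = {}
    for name, spec in nodes.items():
        for out in spec["outputs"]:
            producers[out] = name  # who produces each dataset
    deps = {name: set() for name in nodes}
    for name, spec in nodes.items():
        for inp in spec["inputs"]:
            if inp in producers:
                deps[name].add(producers[inp])  # consumer depends on producer
    return deps

deps = infer_dependencies(nodes)
print(deps)  # node_b depends on node_a via the shared "a_output" dataset
# In an Airflow DAG factory you would then wire, for each edge:
#   tasks[upstream] >> tasks[downstream]
```

Because the edges come from the pipeline definition itself, no placeholder dataset has to exist on disk for the DAG shape to be correct.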
That said, if you have any other ideas or approaches that could help address this challenge, I’d love to hear them! 😊

Dmitry Sorokin (01/27/2025, 4:22 PM)
Define node A's output in the DataCatalog as a TextDataset. This will allow you to execute your pipeline starting from node B as well - just ensure the file is in place. As you can see, this is more of a workaround, since Kedro assumes it can automatically resolve the node order based on inputs and outputs. As far as I know, this is the only way to establish the dependencies between nodes.
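The workaround above amounts to persisting the intermediate result to disk so a later, separate run of Node B only needs the file to be in place. In Kedro this would be a catalog entry backed by a file (e.g. a `text.TextDataset` with a `filepath`); the sketch below simulates the same idea with plain file I/O, with all names invented for illustration:

```python
# Sketch of the suggested workaround: persist node A's output as a file so a
# fresh process running only node B can load it. Plain-Python stand-in for a
# file-backed catalog entry; paths and names are illustrative.
from pathlib import Path
import tempfile

a_output_path = Path(tempfile.gettempdir()) / "a_output.txt"

def run_node_a() -> None:
    # Persisted to disk instead of living only in memory.
    a_output_path.write_text("hello from node A")

def run_node_b() -> str:
    # A separate process can now start here - it only needs the file in place.
    return a_output_path.read_text().upper()

run_node_a()
print(run_node_b())  # prints "HELLO FROM NODE A"
```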
Mohamed El Guendouz (01/27/2025, 5:44 PM)

Mohamed El Guendouz (01/27/2025, 5:47 PM)

Dmitry Sorokin (01/27/2025, 6:44 PM)