Francis Duval
01/09/2024, 9:47 PM
Merel
01/10/2024, 2:35 PM
Francis Duval
01/10/2024, 2:46 PM
If you run `kedro run --pipeline pipeline_name`, then the results in your pipeline are up to date since you just ran it. But then you edit your pipeline, and some parts of it (those downstream of the node/dataset you modified) become outdated. For your pipeline to be up to date again, you have to rerun it. However, if you use `kedro run --pipeline pipeline_name` again, it will run everything, including nodes that are not impacted by (not downstream of) the node/dataset you modified. This can be a problem if your pipeline contains nodes that take a long time to compute.
I would like to have a tool that tells me which parts of my pipeline are up to date and which ones are outdated. Would be nice to have this in kedro viz indeed! A command like `kedro run --only_outdated` that would only run outdated nodes would be great too. 🙂
Merel
01/10/2024, 2:49 PM
Francis Duval
01/10/2024, 2:53 PM
Merel
01/10/2024, 2:57 PM
`CachedDataset` (https://docs.kedro.org/en/stable/_modules/kedro/io/cached_dataset.html#) can help. But the automatic detection of "outdated" nodes isn't something we support right now. `kedro run` does offer several options to run a pipeline only from a certain node. `kedro run --help` shows all of them.
Francis Duval
01/10/2024, 4:27 PM
Francis Duval
01/10/2024, 4:47 PM
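
The "outdated" idea discussed in this thread (a modified node plus everything downstream of it) can be sketched in plain Python. This is only an illustration, not a Kedro feature: the function name downstream_nodes, the edge list, and the node names are all hypothetical, and the pipeline is assumed to be given as simple (parent, child) pairs.

```python
from collections import deque


def downstream_nodes(edges, changed):
    """Return the changed node plus every node reachable from it.

    `edges` is a list of (parent, child) pairs describing the pipeline DAG.
    """
    # Build an adjacency map: parent -> list of direct children.
    children = {}
    for parent, child in edges:
        children.setdefault(parent, []).append(child)

    # Breadth-first walk starting at the modified node.
    outdated = {changed}
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for child in children.get(current, []):
            if child not in outdated:
                outdated.add(child)
                queue.append(child)
    return outdated


# Hypothetical pipeline: load -> clean -> train -> evaluate, and clean -> report.
edges = [
    ("load", "clean"),
    ("clean", "train"),
    ("train", "evaluate"),
    ("clean", "report"),
]

# Editing "train" leaves "load", "clean" and "report" untouched.
print(sorted(downstream_nodes(edges, "train")))  # ['evaluate', 'train']
```

Among the options listed by kedro run --help, --from-nodes is probably the closest built-in equivalent: kedro run --pipeline pipeline_name --from-nodes train should run that node and what follows it, but you still have to know yourself which node changed.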