Kedro is an open-source Python framework for creating maintainable and modular data science code.

Hi team, is there a way of running only the nodes needed to generate missing data for a given node or set of nodes? A bit like --to-nodes, but rather than running everything prior to the given nodes, it would check which inputs already exist and only run the nodes that produce the ones that don't.

So it doesn’t exist, but it’s relatively easy to build this via a hook or plug-in. You could go through the pipeline.inputs/outputs and call the `dataset.exists()` method on each to get the list of missing datasets, then construct the relevant run configuration.
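A rough sketch of that idea, in plain Python so it runs without Kedro installed. `NodeSpec` and `nodes_needed` are hypothetical names made up for illustration, not part of Kedro. In a real project the node name, inputs and outputs would presumably come from each node's `name`, `inputs` and `outputs`, and the `exists` check from something like `catalog.exists(name)`. Treat it as a starting point under those assumptions, not a finished plug-in.

```python
from typing import Callable, Iterable, NamedTuple


class NodeSpec(NamedTuple):
    """Hypothetical stand-in for a pipeline node: its name plus dataset names."""
    name: str
    inputs: tuple
    outputs: tuple


def nodes_needed(
    nodes: Iterable[NodeSpec],
    targets: Iterable[str],
    exists: Callable[[str], bool],
) -> list:
    """Return the names of the target nodes plus any upstream nodes
    whose outputs are required but not yet persisted."""
    nodes = list(nodes)
    by_name = {n.name: n for n in nodes}
    # Map each dataset name to the node that produces it.
    producer = {out: n for n in nodes for out in n.outputs}

    to_run = set()
    stack = [by_name[t] for t in targets]
    while stack:
        node = stack.pop()
        if node.name in to_run:
            continue
        to_run.add(node.name)
        for ds in node.inputs:
            upstream = producer.get(ds)
            # Free inputs (no producer) cannot be regenerated, so skip them.
            # Walk further upstream only when the dataset is missing.
            if upstream is not None and not exists(ds):
                stack.append(upstream)
    # Report in the order the nodes were given.
    return [n.name for n in nodes if n.name in to_run]


# Toy pipeline: load -> features -> train
example_nodes = [
    NodeSpec("load", ("raw",), ("clean",)),
    NodeSpec("features", ("clean",), ("feats",)),
    NodeSpec("train", ("feats", "params"), ("model",)),
]
persisted = {"raw", "clean"}  # pretend these already exist on disk

print(nodes_needed(example_nodes, ["train"], persisted.__contains__))
# ['features', 'train']
```

The resulting names could then be used to build the run configuration, for example by restricting the pipeline to just those nodes. Note this only checks existence, so it will happily reuse data that exists but is stale.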

Short answer is no, because what's necessary is unclear: the data may exist but be outdated.

If you have a failed run, Kedro will generate a suggestion for the nodes that need to be run again, accounting for persisted data.

<@U055RSRTHJ4> what did you end up implementing?

Haven't yet, sorry. It wasn't something needed, just desired. But I may implement it in the near future, will let you know if I do :slightly_smiling_face: