# questions
m
How can a predictive modelling project be designed so that it is easy to switch between steps, such as missing-value imputation methods, class-balancing methods, model types, and so on? Should nodes be used to dispatch data to different implementations based on parameters, or should nodes contain the concrete logic directly? Alternatively, would a pipeline factory that produces a pipeline made of concrete nodes be more suitable?
d
> Should nodes be used to dispatch data to different implementations based on parameters?
I wouldn't generally recommend this, even though I've seen it (especially from some power users within QuantumBlack), because you end up with a lot of logic in YAML config, and the nodes become fairly meaningless wrappers that just call whatever function the YAML points to, with whatever parameters the YAML defines.
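
To make that concern concrete, here is a minimal sketch of what the dispatch-by-parameters pattern tends to look like in Kedro. The function names and the `imputation_method` parameter key are made up for illustration, not from any particular project:

```python
import pandas as pd
from kedro.pipeline import node, pipeline


def impute_mean(df: pd.DataFrame) -> pd.DataFrame:
    # Concrete implementation: fill numeric NaNs with column means.
    return df.fillna(df.mean(numeric_only=True))


def impute_median(df: pd.DataFrame) -> pd.DataFrame:
    # Concrete implementation: fill numeric NaNs with column medians.
    return df.fillna(df.median(numeric_only=True))


def impute_missing(df: pd.DataFrame, method: str) -> pd.DataFrame:
    # The node itself is only a thin dispatcher; the real decision lives in YAML.
    implementations = {"mean": impute_mean, "median": impute_median}
    return implementations[method](df)


# parameters.yml would hold something like: imputation_method: mean
dispatch_pipeline = pipeline(
    [
        node(
            impute_missing,
            inputs=["raw_data", "params:imputation_method"],
            outputs="imputed_data",
            name="impute_missing",
        )
    ]
)
```

The pipeline itself stays the same no matter which implementation runs, so you have to read the YAML to know what the node actually does.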
> Alternatively, would a pipeline factory that produces a pipeline made of concrete nodes be more suitable?
I haven't seen this, but it's an interesting idea.
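
If you explore it, a factory could look roughly like the sketch below. The `create_modelling_pipeline` name, its `imputation` argument, and the helper functions are assumptions for illustration, not an established Kedro pattern:

```python
import pandas as pd
from kedro.pipeline import Pipeline, node, pipeline


def impute_mean(df: pd.DataFrame) -> pd.DataFrame:
    return df.fillna(df.mean(numeric_only=True))


def impute_median(df: pd.DataFrame) -> pd.DataFrame:
    return df.fillna(df.median(numeric_only=True))


def create_modelling_pipeline(imputation: str = "mean") -> Pipeline:
    """Assemble a pipeline of concrete nodes; the choice is made once, here."""
    imputers = {"mean": impute_mean, "median": impute_median}
    return pipeline(
        [
            node(
                imputers[imputation],
                inputs="raw_data",
                outputs="imputed_data",
                name=f"impute_{imputation}",
            ),
            # ... further nodes (class balancing, model training, etc.) would follow.
        ]
    )


# e.g. in pipeline_registry.py: the registered pipeline already contains the
# concrete imputation node, with no run-time dispatch.
model_pipeline = create_modelling_pipeline(imputation="median")
```

The switch still happens via a single argument, but the resulting pipeline is made of plainly named concrete nodes, so what runs is visible from the pipeline itself rather than buried in config.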
m
Yes, this is what I was thinking as well, thanks for the answer! I'll investigate whether any better solutions already exist.