Hi. I am using modular pipelines and I need to surface a list of parameters in my pipeline registry to loop through to create the pipelines I am running modularly. The only solution I can think of is to run a hook that makes a SQL call to get the list of parameters, then updates parameters.yml with the parameters I need. Then I would import parameters.yml into my pipeline registry and pull the parameters I need to create the pipelines. Before I get too deep, is this even possible (pretty sure it is), and is there a simpler way? I am already making the SQL call from a node, but I don't see a way to run nodes prior to creating a second pipeline once the data is in memory. Thanks for your help.
04/11/2023, 8:59 AM
Hi @Tim we can help you work through this, but in general we’re not fans of dynamic pipelines like this, as they really damage the principles of reusability we’re trying to encourage.
In general we encourage users to follow these rules:
• Nodes themselves should be pure Python functions, with no side effects and no concept of I/O; that should be delegated to the catalog. Your SQL call from a node already violates this assumption.
• If you are going to generate a dynamic catalog, hooks are the right way to do that
• Your second pipeline should pick up persisted data generated at the end of your first pipeline
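To make the hooks suggestion concrete, here is a rough, stdlib-only sketch of the shape of a dynamic-catalog hook. It is not Kedro's exact API: in a real project the method would carry the `@hook_impl` decorator from `kedro.framework.hooks`, implement the `after_catalog_created` spec, and receive Kedro's real `DataCatalog` (whose method for this is `catalog.add(name, dataset)`). The SQL query and the `model_input_*` dataset names here are made-up placeholders.

```python
class DynamicCatalogHooks:
    """Fetch the parameter list once, then add one catalog entry per value."""

    # In real Kedro: decorate with @hook_impl and register the class in
    # settings.py via HOOKS = (DynamicCatalogHooks(),).
    def after_catalog_created(self, catalog):
        for value in self._fetch_param_values():
            # In Kedro this would add a real dataset (e.g. a MemoryDataset
            # or a templated SQL dataset); object() is just a placeholder.
            catalog.add(f"model_input_{value}", object())

    def _fetch_param_values(self):
        # Placeholder for the SQL call, e.g. SELECT DISTINCT region FROM t.
        return ["east", "west", "north"]
```

This keeps the side-effectful lookup out of the nodes: the catalog is populated before the run starts, and nodes stay pure functions over whatever datasets the hook registered.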
04/11/2023, 12:44 PM
Thanks @datajoely, my team puzzled over how to get our Postgres credentials using Kedro and had to implement something before we could figure this out. For the time being we instantiate a database connection string class in hooks and add it to the catalog, which we pass around where we need it. Your reply suggests I can do something similar with the DataCatalog to create the pipelines dynamically (I also noticed that a lot of people are asking about dynamic pipelines, and that this is not encouraged). Thanks for the feedback.
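The credentials workaround described above could look roughly like this. Again a stdlib-only stand-in rather than Kedro's API: the actual `after_catalog_created` hookspec (Kedro ~0.18) also receives the parsed credentials config (`conf_creds`) alongside the catalog, and the `"postgres"` credentials key, the `ConnectionString` class, and the `db_connection_string` entry name are all assumptions for illustration.

```python
class ConnectionString:
    """Tiny value object holding a SQLAlchemy-style Postgres URL."""

    def __init__(self, user, password, host, port, dbname):
        self.url = f"postgresql://{user}:{password}@{host}:{port}/{dbname}"


class CredentialHooks:
    # In real Kedro: @hook_impl on after_catalog_created, with conf_creds
    # populated from conf/local/credentials.yml.
    def after_catalog_created(self, catalog, conf_creds):
        creds = conf_creds["postgres"]  # key name is an assumption
        # In Kedro you would wrap this in a MemoryDataset before adding it,
        # so nodes can receive it as an ordinary input.
        catalog.add("db_connection_string", ConnectionString(**creds))
```

Building the connection string once in a hook, then exposing it as a catalog entry, keeps the credential handling out of the nodes themselves, which is consistent with the "nodes are pure functions" advice earlier in the thread.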