Kedro #questions

Piotr Grabowski

12/04/2023, 12:09 PM

Hey everyone! I am trying to load the configuration dict in code (specifically in pipelines.py, to build a dynamic pipeline). I see this suggestion on how to do it: https://docs.kedro.org/en/stable/configuration/parameters.html#how-to-load-parameters-in-code However, this does not seem to load the runtime CLI parameters at all. Looking at OmegaConfigLoader, I can see that it defaults to runtime_params=None, so naturally the snippet from that link ignores the runtime params. Is there a way to keep the runtime_params when instantiating OmegaConfigLoader directly, as in the example? I know that the Kedro context holds the runtime_params passed on the CLI, but it is not available in the pipeline.py module, right? Any suggestions?

marrrcin

12/04/2023, 1:29 PM

What use case are you trying to achieve by employing dynamic pipelines?

Piotr Grabowski

12/04/2023, 1:32 PM

We are using Kedro for a large sandbox where we mix and match subpipelines to run a machine learning study. This means that we would like to have some pipelines parametrised by a set of configs that defines which nodes to use when building the pipeline, e.g.:
pipeline name: load_graphs
usage: kedro run --pipeline=load_graph --params graphs:"g1-g2-g3"
or
usage: kedro run --pipeline=load_graph --params graphs:"g1-g3"
This would run a loading pipeline which creates various combinations of graphs for our runs.
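To make the request above concrete, here is a minimal, library-free sketch of what "parametrised by a set of configs" could mean for a value such as "g1-g3". The helper `select_graphs` and the registry `AVAILABLE_LOADERS` are hypothetical names invented for illustration; they are not part of Kedro, and the sketch says nothing about how the value reaches the code.

```python
# Hypothetical registry mapping graph names to loader identifiers.
# In a real project these would be node functions; strings keep the sketch small.
AVAILABLE_LOADERS = {"g1": "load_g1", "g2": "load_g2", "g3": "load_g3"}


def select_graphs(graphs_param: str, available: dict) -> list:
    """Split a 'g1-g2-g3' style value and look up each requested graph."""
    names = [name for name in graphs_param.split("-") if name]
    unknown = [name for name in names if name not in available]
    if unknown:
        # Fail early so that a typo on the command line is not silently ignored.
        raise KeyError(f"Unknown graph names: {unknown}")
    return [available[name] for name in names]


selected = select_graphs("g1-g3", AVAILABLE_LOADERS)
```

The early failure on unknown names is a design choice of this sketch: a pipeline built from free-form strings otherwise has no guard against misspelled inputs.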

Piotr Grabowski

12/04/2023, 1:34 PM

We have a collection of such load_graphs pipelines, a collection of ML pipelines, etc., and we run various combinations of them to produce sets of results that are later reported in big performance metric tables, as in typical ML papers.

marrrcin

12/04/2023, 1:34 PM

Have you explored using --tags for that?
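As a rough illustration of the tags idea, the sketch below models a static set of tagged nodes and selects a subset by tag, instead of building the pipeline from parameters. It is plain Python, not Kedro code: `TaggedNode` and `filter_by_tags` are hypothetical names, and the any-match rule (a node is kept if it carries at least one requested tag) is an assumption of this sketch. With Kedro itself, the selection would presumably happen on the command line, with something like kedro run --pipeline=load_graph --tags=g1,g3.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedNode:
    """Hypothetical stand-in for a pipeline node that carries tags."""
    name: str
    tags: frozenset = frozenset()


def filter_by_tags(nodes, *tags):
    """Keep the nodes that carry at least one of the requested tags."""
    wanted = set(tags)
    return [n for n in nodes if n.tags & wanted]


# The full pipeline is declared once, statically; only the selection varies per run.
NODES = [
    TaggedNode("load_g1", frozenset({"g1"})),
    TaggedNode("load_g2", frozenset({"g2"})),
    TaggedNode("load_g3", frozenset({"g3"})),
]

selected_names = [n.name for n in filter_by_tags(NODES, "g1", "g3")]
```

The point of the design is that the pipeline definition never changes between runs, so each run is a reproducible slice of the same graph.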

Piotr Grabowski

12/04/2023, 1:35 PM

Hmm not yet, will have a look now!

Piotr Grabowski

12/04/2023, 1:35 PM

thanks

marrrcin

12/05/2023, 8:38 AM

Please let us know if it satisfies your use case, if not, then we’ll think more 😄

Piotr Grabowski

12/05/2023, 8:49 AM

I think it can; I didn't manage to re-implement that pipeline using tags yet for lack of time. However, it would be great if the config loaded "in code" (https://docs.kedro.org/en/stable/configuration/parameters.html#how-to-load-parameters-in-code) could contain all the parameters needed, including the runtime parameters. It's probably not trivial because it would require a way to pass the runtime_params declared from the CLI, which is not possible, I think?

marrrcin

12/05/2023, 9:00 AM

Kedro does not support using config loaders during pipeline creation, and that is by design. Many use cases for dynamic pipelines can be achieved using other features (such as tags). Generating pipelines on the fly via parameters, although tempting and “easy”, increases the error surface and makes the pipelines less (or not) reproducible. You can also check out https://getindata.com/blog/kedro-dynamic-pipelines/ - it’s one of the approaches that’s fully compatible with Kedro and doesn’t do any hacking.

Piotr Grabowski

12/05/2023, 9:00 AM

Thanks 🙂

Piotr Grabowski

12/05/2023, 9:01 AM

What would be the use case of that "in code" params loading that's shown in the documentation then? For debugging?

marrrcin

12/05/2023, 9:01 AM

For Jupyter mostly

marrrcin

12/05/2023, 9:02 AM

Or custom entrypoints / plugins etc.

Piotr Grabowski

12/05/2023, 9:02 AM

ah right, ok 🙂

Piotr Grabowski

12/05/2023, 9:02 AM

I will give it a try with tags in a few hours then, need to evangelize the group to stop pushing for dynamic pipelines so much 🙂
