I have a question about general design patterns with Kedro (Kedro #questions)

Ben Phillips

02/03/2024, 11:54 AM

I have a question about general design patterns with Kedro. My situation is this: I have many "data descriptors" in my project (there could be up to 100 in the future), and these data descriptors are used by most nodes in the pipeline. They describe general information about the data being generated, and also more specific information used by only a certain subclass of nodes, such as extraction parameters, transformation parameters, loading parameters, etc. Typically only one will be used in a single pipeline run; the idea is that you can easily configure which data descriptor you'll use for a particular run, either through some variable or through run-time parameters. These data descriptors will be Python objects because I need to make use of inheritance and mixins (YAML and JSON can't be used). What is the best way to approach these descriptors in Kedro? I.e., should I treat them as inputs and load them through the catalog? Should I treat them as configuration parameters? Etc.
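For illustration, the setup described above could look roughly like the sketch below. All class names, field names, and the `get_descriptor` helper are hypothetical, not taken from the poster's project; the only idea shown is that descriptors stay Python classes (so inheritance and mixins work) while the run selects one by a plain string that could come from a run-time parameter.

```python
class BaseDescriptor:
    """General information shared by most nodes (hypothetical fields)."""

    name = "base"
    description = ""


class ExtractionMixin:
    """Settings only extraction nodes care about."""

    chunk_size = 1000

    def extraction_params(self):
        return {"chunk_size": self.chunk_size}


class LoadingMixin:
    """Settings only loading nodes care about."""

    write_mode = "append"

    def loading_params(self):
        return {"write_mode": self.write_mode}


class SalesDescriptor(ExtractionMixin, LoadingMixin, BaseDescriptor):
    name = "sales"
    chunk_size = 500  # overrides the mixin default


class InventoryDescriptor(ExtractionMixin, BaseDescriptor):
    name = "inventory"


# Registry keyed by a plain string, so the choice can live in configuration.
DESCRIPTORS = {cls.name: cls for cls in (SalesDescriptor, InventoryDescriptor)}


def get_descriptor(name):
    """Build the descriptor selected for this run."""
    try:
        return DESCRIPTORS[name]()
    except KeyError:
        known = sorted(DESCRIPTORS)
        raise ValueError(f"Unknown descriptor {name!r}; expected one of {known}") from None


def extract(raw_rows, descriptor_name):
    """Node-style function: receives only the descriptor name as a parameter."""
    size = get_descriptor(descriptor_name).extraction_params()["chunk_size"]
    return [raw_rows[i:i + size] for i in range(0, len(raw_rows), size)]
```

With this shape, a node would take the descriptor name as an ordinary parameter and look up the object itself, so nothing non-serializable has to pass through configuration files.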

datajoely

02/03/2024, 2:58 PM

So this is a complex use-case. This example by @marrrcin may be of use: https://getindata.com/blog/kedro-dynamic-pipelines/ It sounds like you're going to need to go down the custom OmegaConf resolver route.

Ben Phillips

02/03/2024, 7:38 PM

thanks @datajoely that link is very useful 👍

👍 1

Ben Phillips

02/03/2024, 7:39 PM

I'm just wondering with regards to namespaced modular pipelines; can you still somehow use non-namespaced (i.e., global) parameters within them?

marrrcin

02/03/2024, 9:11 PM

You can, but you have to map the global parameters to the "namespaced" ones in the pipeline constructor. I believe the example in the blog shows this.

👍 1
