I have a question about general design patterns wi...
# questions
b
I have a question about general design patterns with Kedro. My situation is this: I have many "data descriptors" in my project (there could be up to 100 in the future), and these data descriptors are used by most nodes in the pipeline. They describe general information about the data being generated, as well as more specific information used only by a certain subclass of nodes, such as extraction parameters, transformation parameters, loading parameters, etc. Typically only one will be used in a single pipeline run; the idea is that you can easily configure which data descriptor you'll use for a particular run, either through some variable or through runtime parameters. These data descriptors will be Python objects because I need to make use of inheritance and mixins (YAML and JSON can't be used). What is the best way to approach these descriptors in Kedro? I.e., should I treat them as inputs and load them in the catalog? Should I treat them as configuration parameters? Etc.
d
So this is a complex use case. This example by @marrrcin may be of use: https://getindata.com/blog/kedro-dynamic-pipelines/ It sounds like you're going to need to go down the custom OmegaConf resolver route.
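To make that a bit more concrete, here is a rough sketch of one way it could look, not a definitive implementation. The descriptors stay ordinary Python classes (so inheritance and mixins work), a small registry maps a name to a class, and the run only has to carry the name as a parameter. Every class, field, and function name below is hypothetical. In a Kedro project, `get_descriptor` is the kind of function you could try registering under `CONFIG_LOADER_ARGS["custom_resolvers"]` in `settings.py`; whether a resolver can return an arbitrary object for parameters is something to check against your Kedro and OmegaConf versions, so the sketch does the lookup inside the node instead.

```python
# All names here are hypothetical and only illustrate the pattern.


class BaseDescriptor:
    """General information that most nodes need."""

    name = "base"
    source_uri = ""


class ExtractionMixin:
    """Settings that only extraction nodes read."""

    batch_size = 1000


class LoadingMixin:
    """Settings that only loading nodes read."""

    write_mode = "append"


class SalesDescriptor(ExtractionMixin, LoadingMixin, BaseDescriptor):
    name = "sales"
    source_uri = "s3://example-bucket/sales"
    batch_size = 500  # overrides the mixin default


class WeatherDescriptor(ExtractionMixin, BaseDescriptor):
    name = "weather"
    source_uri = "https://example.com/weather"


# Registry keyed by descriptor name; a run selects one entry by name.
DESCRIPTORS = {cls.name: cls for cls in (SalesDescriptor, WeatherDescriptor)}


def get_descriptor(name: str) -> BaseDescriptor:
    """Return a fresh descriptor instance for the given name."""
    try:
        return DESCRIPTORS[name]()
    except KeyError:
        known = sorted(DESCRIPTORS)
        raise ValueError(f"Unknown data descriptor {name!r}; known: {known}") from None


def extract(descriptor_name: str) -> dict:
    """Node-style function that receives only the descriptor name."""
    descriptor = get_descriptor(descriptor_name)
    return {"source": descriptor.source_uri, "batch_size": descriptor.batch_size}
```

With this layout the only thing that changes between runs is a single string, which could come from `parameters.yml` or be overridden with `kedro run --params`, and a node like `extract` would take it as a `params:` input.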
b
thanks @datajoely that link is very useful 👍
๐Ÿ‘ 1
I'm just wondering, with regard to namespaced modular pipelines: can you still somehow use non-namespaced (i.e., global) parameters within them?
m
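You can, but you have to map the global parameters explicitly in the constructor (the `parameters` argument of `pipeline()`); otherwise they get the namespace prefix like everything else. I believe the example in the blog shows this.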
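As a rough illustration of what that mapping does, here is a plain-Python model of the renaming rule. It is not Kedro's own code: `namespace_inputs` and the example names are hypothetical, and the commented-out call shows what I believe the equivalent Kedro usage looks like, which is worth verifying against your version.

```python
# Believed Kedro equivalent (an assumption to verify, not executed here):
#   from kedro.pipeline import pipeline
#   sales = pipeline(base, namespace="sales",
#                    parameters={"params:global_threshold"})


def namespace_inputs(inputs, namespace, parameters=()):
    """Model the renaming: prefix each name unless it is listed in `parameters`."""
    shared = set(parameters)
    renamed = {}
    for name in inputs:
        if name in shared:
            # Explicitly mapped: stays global, no namespace prefix.
            renamed[name] = name
        elif name.startswith("params:"):
            # Other parameters are namespaced after the "params:" prefix.
            renamed[name] = f"params:{namespace}.{name[len('params:'):]}"
        else:
            # Datasets are namespaced directly.
            renamed[name] = f"{namespace}.{name}"
    return renamed


result = namespace_inputs(
    ["raw_data", "params:batch_size", "params:global_threshold"],
    namespace="sales",
    parameters={"params:global_threshold"},
)
```

Here `params:global_threshold` keeps its global name, while `raw_data` and `params:batch_size` pick up the `sales` prefix.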
๐Ÿ‘ 1