# questions
Is there any reason why dataset factories don't work with parameters? I'm trying to avoid having 20x `namespace.company: yyyy` and `namespace.year: yyyy` (company, year and many other parameters are all "inherited" from the runtime parameters, but they still need to be passed on to my nodes). I could also set them as `parameters={company, year}`, but setting them on the non-namespaced pipeline doesn't work for me; I need to write `parameters={company, year}` each time I create a namespaced instance of the pipeline. Any pointers to reduce all of the duplicative code? Thanks
Maybe I can create the pipelines with an f-string so that I only have to write the parameters and the inputs once but reuse them in a loop? (I don't know if that's even possible)
• Creating namespaced pipelines in a Python `for` loop + leveraging f-strings is for sure possible. Then you just `sum` them.
• I'd also look at the following: when you use the `pipeline()` wrapper to create a namespaced `Pipeline`, you can use the `parameters` kwarg to ask it not to namespace parameters that are common among all instances. See here.
Hello Carlos, datasets are processed slightly differently from parameters, so, indeed, you cannot create a parameters factory. If your parameters remain the same across the namespaces, you don't have to namespace them in the pipeline. You can use their original names, but for that you need to mention them explicitly when creating the pipeline; please see this example: https://docs.kedro.org/en/stable/nodes_and_pipelines/namespaces.html#what-is-a-namespace