Hello channel 👋 I have two Kedro pipelines, say TrainA and ... (Kedro #questions)

mattia.paterna

06/11/2024, 3:00 PM

Hello channel 👋 I have two Kedro pipelines, say TrainA and TrainB. The pipelines are quite similar in that:

• most node pairs are identical, i.e. they use the same function and the same input/output configuration
• some nodes use the same function, and their input/output configuration differs only in the parameters that are read in.

Right now, the two pipelines live in two different Python modules, namely `train_a.py` and `train_b.py`. My idea is to separate the process from the product, i.e. to create a general training pipeline that can be used to train both A and B given their respective configuration parameters; these could live in their own `parameters.yaml` so as to avoid overrides and/or conflicts. The two related questions are:

1. Does this harmonise with the Kedro principles?
2. Is this possible in Kedro?

Thank you. 🙏

datajoely

06/11/2024, 3:04 PM

I think this is a perfect use case for our namespaced pipelines construct (also known as modular pipelines), as you can `.replace(parameters={'base_param':'some_other_param'})`

datajoely

06/11/2024, 3:05 PM

https://docs.kedro.org/en/stable/nodes_and_pipelines/modular_pipelines.html#using-a-modular-pipeline-multiple-times

datajoely

06/11/2024, 3:06 PM

if you provide namespaces you also get the nice big boxes on Kedro-Viz (see demo.kedro.org)

mattia.paterna

06/11/2024, 3:06 PM

Thank you! I will try straight away and I will follow up in this thread. 🤩

mattia.paterna

06/13/2024, 6:45 AM

@datajoely I gave it a try and created a namespaced modular pipeline, but there is one thing I don't get. Suppose I have my general `train.py` pipeline instead of `train_a.py` and `train_b.py`. However, I still have two different parameter configuration YAML files, `conf/train_a/parameters.yaml` and `conf/train_b/parameters.yaml`. How can I call the training pipeline twice, each time with one specific configuration? I hope it makes sense.

datajoely

06/13/2024, 9:24 AM

so the idea of this pattern is that the code is static, but it is possible to override the catalog or parameter information, sort of like a subclass

mattia.paterna

06/13/2024, 10:56 AM

yes, this rephrasing is correct: the code is written in such a way that when you call e.g. `params:my_params` inside one node, both configuration YAML files will pass the required parameter.

mattia.paterna

06/19/2024, 12:45 PM

@datajoely I was following up on this thread, as I have not solved it yet. Did you have a chance to think about it? 🙂

datajoely

06/19/2024, 12:51 PM

can you not use the same parameter key twice?

mattia.paterna

06/19/2024, 12:54 PM

I am afraid I don't fully understand, can you elaborate on that?
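One possible reading of the suggestion above: the same key, e.g. `learning_rate`, can appear in both parameter sets as long as each set sits under a top-level key matching its namespace. The sketch below is plain Python, not Kedro's implementation; the `resolve` helper, the key names, and the values are made up for illustration.

```python
# Hypothetical merged parameters, as they might look after loading
# parameter files whose top-level keys match the two namespaces.
parameters = {
    "train_a": {"learning_rate": 0.1},
    "train_b": {"learning_rate": 0.01},
}


def resolve(params: dict, reference: str):
    """Look up a dotted reference such as 'params:train_a.learning_rate'."""
    dotted = reference.removeprefix("params:")  # requires Python 3.9+
    value = params
    for part in dotted.split("."):
        value = value[part]
    return value


# The same key is used twice, once per namespace, without conflict.
lr_a = resolve(parameters, "params:train_a.learning_rate")
lr_b = resolve(parameters, "params:train_b.learning_rate")
print(lr_a, lr_b)
```

Separately, directories under `conf/` are treated by Kedro as configuration environments, so `conf/train_a/` can be selected with `kedro run --env=train_a`.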
