# questions
m
Hello channel 👋 I have two Kedro pipelines, say TrainA and TrainB. The pipelines are quite similar in that
• most node pairs are identical, i.e. they use the same function and the same input/output configuration
• some nodes use the same function, and their input/output configuration differs only in the parameters that are read in.
Right now, the two pipelines live in two different Python modules, namely `train_a.py` and
train_b.py
. My idea is to separate the process from the product, i.e. to create a general training pipeline that can be used to train both A and B given their respective configuration parameters. These could each live in their own
parameters.yaml
so as to avoid overrides and/or conflicts. My two related questions are:
1. Does this harmonise with the Kedro principles?
2. Is this possible in Kedro?
Thank you. 🙏
d
I think this is a perfect use-case for our namespaced pipelines construct (also known as modular pipelines) as you can
.replace(parameters={'base_param':'some_other_param'})
if you provide namespaces you also get the nice big boxes on Kedro-Viz (see demo.kedro.org)
m
Thank you! I will try straight away and I will follow up in this thread. 🤩
@datajoely I gave it a try, created a namespaced modular pipeline, but there is one thing I don't get. Suppose I have my general
train.py
pipeline instead of
train_a.py
and
train_b.py
. However, I still have two different parameter configuration YAML files that live respectively inside
conf/train_a/parameters.yaml
and
conf/train_b/parameters.yaml
. How can I call the training pipeline twice, each time with one specific configuration? I hope it makes sense.
d
so the idea of this pattern is that the code is static, but it is possible to override the catalog or parameter information, sort of like a subclass
m
yes, this rephrasing is correct—the code is made in such a way that when you call e.g.
params:my_params
inside one node, both configuration YAML files will supply the required parameter.
@datajoely I was following up on this thread, as I have not solved it yet. Did you have a chance to think about it? 🙂
d
can you not use the same parameter key twice?
m
I am afraid I don't fully understand, can you elaborate on that?
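One possible reading of the "same parameter key twice" suggestion is that both configurations keep identical inner keys and differ only in the top-level block they sit under. The plain-Python sketch below is not Kedro code; it only illustrates how a dotted reference such as `params:train_a.learning_rate` would pick a value out of nested parameters. The `lookup` helper and all values are hypothetical.

```python
# Hypothetical parameters, shaped like a parameters.yaml that has one
# top-level block per training variant and the same keys inside each.
parameters = {
    "train_a": {"learning_rate": 0.1, "epochs": 10},
    "train_b": {"learning_rate": 0.01, "epochs": 50},
}


def lookup(reference: str, params: dict):
    """Resolve a 'params:a.b' style reference against nested parameters."""
    prefix = "params:"
    if not reference.startswith(prefix):
        raise ValueError(f"not a parameter reference: {reference!r}")
    value = params
    # Walk down one dictionary level per dotted segment.
    for key in reference[len(prefix):].split("."):
        value = value[key]
    return value


# The same inner key is used twice, once per top-level block.
for namespace in ("train_a", "train_b"):
    print(namespace, lookup(f"params:{namespace}.learning_rate", parameters))
```

Under this reading, the shared training code keeps asking for `learning_rate`, and the block name in front of it decides which configuration supplies the value.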