Hello kedro team! I have a kedro issue, let's see if you can h... | Kedro #questions

Toni

09/09/2022, 1:21 PM

Hello kedro team! I have a kedro issue, let's see if you can help me... We have a kedro pipeline that trains a model and generates a dataframe as output. The problem we now have is that we need to loop that pipeline to generate multiple dataframes (which, at the end, we want to concatenate into a single table). Is it possible, given a parameter such as `set_targets = ['a', 'b', 'c']`, to loop the same pipeline over each value of that list without "copying" the pipeline? That `set_of_targets` may have a different length and different names each time, so we want to avoid manual work... Also, we need the outputs to have "dynamic" names in the catalog in order to save all of them (`score_{{target}}` ... `score_a`, `score_b`, `score_c`)... I think this could be done with `jinja`, but I have no idea where to start... Thank you very much!
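To make the "dynamic names" part of the question concrete, here is a minimal sketch in plain Python of the catalog entries being asked for, one per target. The helper name `build_score_catalog`, the folder path, and the choice of `pandas.CSVDataSet` as the dataset type are assumptions for illustration only; nothing in the thread specifies them.

```python
def build_score_catalog(set_targets, folder="data/07_model_output"):
    """Return one catalog entry per target, keyed as score_<target>.

    The folder and dataset type are hypothetical defaults, not taken
    from the original project.
    """
    return {
        f"score_{target}": {
            "type": "pandas.CSVDataSet",
            "filepath": f"{folder}/score_{target}.csv",
        }
        for target in set_targets
    }


# Example with the list from the question.
catalog_entries = build_score_catalog(["a", "b", "c"])
for name, entry in catalog_entries.items():
    print(name, "->", entry["filepath"])
```

A dictionary of this shape is what the equivalent hand-written `catalog.yml` entries would parse to, so it shows exactly what a template (Jinja or otherwise) would have to generate.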

Shubham Gupta

09/09/2022, 1:44 PM

You can simply reuse the same pipeline and provide aliases for the data from your catalogue. https://kedro.readthedocs.io/en/0.17.6/06_nodes_and_pipelines/03_modular_pipelines.html#how-to-use-a-modular-pipeline-twice

Shubham Gupta

09/09/2022, 1:46 PM

Iterating over a pipeline is not difficult. You can simply iterate as many times as you want. Just keep changing these aliases. Make sure you add these data values to the catalogue.

Toni

09/09/2022, 2:11 PM

Thank you for your help @Shubham Gupta! Although `namespaces` is a tool to keep in mind, I think it would only fit my problem if I knew "beforehand" the set of targets through which I have to iterate and run the pipeline. The problem is that I may not have that set of targets, and that I cannot add the namespace parameters and catalog entries until I have the set (which may be different from one project to another). The PO would like to have a "dynamic template" that, just by changing a `set_of_targets` parameter, would automatically create those `namespaces`.

Nok Lam Chan

09/09/2022, 3:54 PM

I think there are 2 questions being asked here. 1. How to avoid naming repeated datasets? A namespaced (modular) pipeline is the right thing to use here. 2. A dynamic pipeline is not really encouraged here, but it’s not impossible to do.

👍 1

Nok Lam Chan

09/09/2022, 3:55 PM
