# questions

dor zazon

01/05/2023, 8:29 AM
Hey, I am trying to find a solution for running the same pipeline with different parameters. I have a preprocess pipeline that I want to run over 5 different datasets, and if I add more datasets to the catalog in the future, I want the pipeline to process them as well. How can I create a template pipeline and set the preprocess pipeline to run over a list of dataset names from the catalog?
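For illustration, here is a minimal sketch of one way to do this with Kedro's modular pipelines and namespaces. It is not from the thread; the dataset names, the `preprocess` node, and the `create_template` factory are all hypothetical placeholders:

```python
# Sketch: reuse one preprocess pipeline "template" per dataset by wrapping it
# in a namespace. Dataset names and the preprocessing logic are hypothetical.
from kedro.pipeline import Pipeline, node, pipeline


def preprocess(raw_data):
    # Placeholder preprocessing step (assumes a pandas DataFrame).
    return raw_data.dropna()


def create_template() -> Pipeline:
    # Generic input/output names; the namespace below prefixes them per dataset.
    return pipeline([node(preprocess, inputs="raw", outputs="preprocessed")])


# Extend this list (and the catalog) when new datasets arrive.
DATASETS = ["sales", "customers", "orders"]


def create_pipeline(**kwargs) -> Pipeline:
    template = create_template()
    return sum(
        (
            # Catalog entries become e.g. "sales.raw" and "sales.preprocessed".
            pipeline(template, namespace=name)
            for name in DATASETS
        ),
        Pipeline([]),
    )
```

Each namespaced copy then expects catalog entries such as `sales.raw`, so adding a new dataset means extending the list and adding the matching catalog entries.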

marrrcin

01/05/2023, 9:02 AM
Will `PartitionedDataSet` be enough for you?
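For context, a `PartitionedDataSet` is loaded as a dictionary mapping partition IDs to load functions, so a single node can iterate over whatever files are in the folder without a catalog entry per dataset. A minimal sketch, assuming a hypothetical catalog entry with pandas CSV partitions:

```python
# Sketch of consuming a PartitionedDataSet. Hypothetical catalog entry
# (conf/base/catalog.yml), path and names are assumptions:
#
#   raw_datasets:
#     type: PartitionedDataSet
#     path: data/01_raw/datasets
#     dataset: pandas.CSVDataSet


def preprocess_all(partitions: dict) -> dict:
    """Run the same preprocessing over every partition found on disk."""
    processed = {}
    for partition_id, load_func in partitions.items():
        df = load_func()  # lazily load this partition
        processed[partition_id] = df.dropna()  # placeholder preprocessing
    return processed
```

If the node's output is also a `PartitionedDataSet`, returning a dictionary of `partition_id -> data` writes one output file per partition.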

dor zazon

01/05/2023, 9:04 AM
Not exactly. In further steps I will need to run over each dataset and extract features from it according to the config .yml file.
I want to control the whole project from the config file. How can I use dynamic inputs rather than specific variable names?

marrrcin

01/05/2023, 9:12 AM
Yeah, dynamic pipelines are something recurring here in the #questions channel and Kedro does not support them in general. There are some workarounds but they are rather hacky 😉
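As an illustration of the kind of workaround meant here (not an official Kedro feature; the parameters file, project package, and `create_template` factory are assumptions), the dataset list can be read from configuration at registration time, with one namespaced copy of the preprocess pipeline created per entry:

```python
# Hacky workaround sketch: drive pipeline creation from a config file.
#
#   # conf/base/parameters.yml (hypothetical)
#   datasets:
#     - sales
#     - customers
#
# src/<package>/pipeline_registry.py
from pathlib import Path
from typing import Dict

import yaml
from kedro.pipeline import Pipeline, pipeline

# Hypothetical factory returning the un-namespaced preprocess pipeline.
from my_project.pipelines.preprocess import create_template


def _dataset_names() -> list:
    params = yaml.safe_load(Path("conf/base/parameters.yml").read_text())
    return params.get("datasets", [])


def register_pipelines() -> Dict[str, Pipeline]:
    template = create_template()
    preprocess = sum(
        (pipeline(template, namespace=name) for name in _dataset_names()),
        Pipeline([]),
    )
    return {"preprocess": preprocess, "__default__": preprocess}
```

Reading the YAML file directly bypasses Kedro's config loader and environment handling, which is part of why this kind of approach counts as hacky.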

Jordan

01/05/2023, 4:16 PM
This is a question that recurs so often that I think it would be worth implementing this functionality in some future version of Kedro. It seems like a fairly common use case to want to process many different partitioned datasets with the same pipeline without having to manually add all inputs and outputs to the catalog.