#questions

Mathilde Lavacquery

12/12/2022, 2:54 PM
Hi Kedro Team, what would be the best practice for passing the same parameters to both pipeline_registry and the catalog? E.g., I have a pipeline that runs for different countries and different brands; some pipelines / datasets are at country level, some are at country x brand level. All my pipelines use namespacing to deal with the “scope” (i.e. the countries / brands). My pipeline registry looks like this:
# The *_pipeline factories below are the project's own functions (imports omitted).
def register_pipelines():
    # Dimensions the namespaced pipelines are replicated over.
    countries = ["a", "b"]
    brands = ["1", "2", "3"]
    return {
        "preprocess_macro": preprocess_macro_pipeline(countries=countries),
        "preprocess_brand": preprocess_brand_pipeline(countries=countries, brands=brands),
        "train_model": train_model_pipeline(countries=countries, brands=brands),
    }
and my catalog looks like this:
{% for country in ["a", "b"] %}

{{ country }}.pre_master_macro:
    ...

{% for brand in ["1", "2", "3"] %}

{{ country }}.{{ brand }}.master:
    ...

{{ country }}.{{ brand }}.model:
    ...

{% endfor %}
{% endfor %}
Would there be a way to define countries / brands once and pass them to both? The use case is that we are developing a generic pipeline that can be replicated in different regions / for different brands, depending on the client.
In reality we have more dimensions on which the pipelines can run, which are used as namespaces in the following hierarchy:
• countries (up to 2-3)
• brands (up to 10 brands)
• usecases (2 main)
• target (3 different)
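
To make the hierarchy concrete, here is a minimal sketch of how the dotted namespaces could be enumerated from a single definition. The `DIMENSIONS` dict, its placeholder values for usecases and targets, and the `namespaces` helper are all hypothetical names for illustration, not part of Kedro:

```python
from itertools import product

# Hypothetical single source of truth for the hierarchy.
# Order matters: it defines the nesting of the namespaces.
# The usecase and target values are placeholders.
DIMENSIONS = {
    "country": ["a", "b"],
    "brand": ["1", "2", "3"],
    "usecase": ["u1", "u2"],
    "target": ["t1", "t2", "t3"],
}


def namespaces(depth):
    """Return dotted namespaces for the first `depth` levels of the hierarchy."""
    levels = list(DIMENSIONS.values())[:depth]
    return [".".join(combo) for combo in product(*levels)]
```

With this, `namespaces(1)` gives the country-level scopes and `namespaces(2)` the country x brand scopes, so the same lists never need to be typed twice.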

datajoely

12/12/2022, 3:01 PM
For dynamic pipelines this is the least-worst way of doing things without tweaking Kedro.
You can explore a hook that creates the catalog for you, so that you don’t need to replicate things in two places.
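
A rough sketch of that idea, kept free of Kedro imports so it runs standalone: one helper derives the dataset names from the same lists the pipeline registry uses. `COUNTRIES`, `BRANDS` and `build_dataset_names` are hypothetical names. A hook implementing Kedro's `after_catalog_created` could call the helper and add one dataset per name, but that wiring and the choice of dataset types are left out here and would depend on your project:

```python
# Hypothetical shared module, imported by both the pipeline registry
# and the catalog-building hook so the lists live in one place.
COUNTRIES = ["a", "b"]
BRANDS = ["1", "2", "3"]


def build_dataset_names(countries, brands):
    """Return the namespaced dataset names the catalog needs."""
    names = []
    for country in countries:
        # Country-level dataset: one per country.
        names.append(f"{country}.pre_master_macro")
        for brand in brands:
            # Country x brand datasets: one pair per combination.
            names.append(f"{country}.{brand}.master")
            names.append(f"{country}.{brand}.model")
    return names
```

The names mirror the entries of the Jinja-templated catalog above, so the template loops could be dropped once a hook registers these datasets.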

Ian Whalen

12/12/2022, 3:12 PM
I asked a similar question over here. Might be helpful.

Mathilde Lavacquery

12/12/2022, 3:36 PM
Great, thank you for your answers!