# questions
Zubin Roy
Hey everybody, I have a general/best-practice question. I'm loving Kedro and have created a neat pipeline that trains my models the way I want. Now I want to scale this up across x number of markets, as my model predicts results at the market level. So I thought I'd use a modular pipeline to reuse my pipeline code. I like this approach because it means I don't have to rewrite any code; I just copy across my parameter and catalog files. The only problem is that my catalog and params files are now getting quite large, and I'm wondering whether there's a way to dynamically populate catalog or parameter entries, or whether there's a better approach (within Kedro or generally) to avoid large catalog/parameter files. Or perhaps this is just the trade-off of using modular pipelines. Thanks!
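For context, a minimal sketch of the setup being described, with illustrative node, dataset, and market names: one base pipeline, instantiated once per market via a namespace.

```python
from kedro.pipeline import Pipeline, node, pipeline


def train_model(features, model_options):
    """Stand-in for the real training logic."""
    ...


# The base pipeline, written once.
base = Pipeline(
    [node(train_model, inputs=["features", "params:model_options"], outputs="model")]
)

# One namespaced copy per market. Namespacing prefixes every dataset and
# parameter, e.g. "uk.features" and "params:uk.model_options", which is why
# the catalog and parameter files grow with each market added.
markets = ["uk", "de", "fr"]  # hypothetical market list
training = sum((pipeline(base, namespace=m) for m in markets), Pipeline([]))
```

On the "dynamically populate catalog entries" part specifically, recent Kedro versions also support dataset factories, where one patterned entry stands in for all the per-market copies. A hedged sketch, with made-up dataset names and paths:

```yaml
# catalog.yml (illustrative): "{market}" is filled in from the dataset name
# requested by the pipeline at runtime, e.g. "uk.model" or "de.model".
"{market}.model":
  type: pickle.PickleDataset
  filepath: data/06_models/{market}_model.pkl
```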
datajoely
Zubin Roy
That is exactly what I was after! Can't wait to try this out, thank you!
Guillaume Tauzin
Hi @Zubin Roy, FYI, when following this guide, you can directly use YAML inheritance instead of a `merge_dict` OmegaConf resolver if you think that's better. Both will work, so it's really up to you to decide which you prefer. A small example of what it would look like: https://github.com/gtauzin/kedro-dagster-example/blob/main/conf/dev/parameters_model_tuning.yml
❤️ 2
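A hedged sketch of the YAML-inheritance flavour, with made-up parameter names (see the linked file for the real thing). It relies on standard YAML anchors and merge keys, which PyYAML's safe loader, and hence OmegaConf, supports:

```yaml
# parameters.yml (illustrative): define the shared options once under an
# anchor, then merge them into each market's block with the "<<" merge key.
base_model_options: &base_model_options
  test_size: 0.2
  random_state: 42

uk:
  model_options:
    <<: *base_model_options
    target: uk_sales  # market-specific override

de:
  model_options:
    <<: *base_model_options
    target: de_sales
```

And, for comparison, what wiring up a `merge_dict`-style resolver might look like, assuming it is essentially `OmegaConf.merge` under the hood (check the linked repo for the actual definition):

```python
# settings.py (illustrative): Kedro's OmegaConfigLoader accepts custom
# OmegaConf resolvers via CONFIG_LOADER_ARGS.
from kedro.config import OmegaConfigLoader
from omegaconf import OmegaConf

CONFIG_LOADER_CLASS = OmegaConfigLoader
CONFIG_LOADER_ARGS = {
    "custom_resolvers": {
        # Usable in YAML as: ${merge_dict:${base_block},${override_block}}
        "merge_dict": lambda *blocks: OmegaConf.merge(*blocks),
    }
}
```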
Zubin Roy
Great, thank you!
👍 1
m
What we do is create a Kedro env per market to scale our pipeline. This way, you can reuse all your pipeline code and adjust the params/catalog etc. per market.
👍 1
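To make the per-market environment idea concrete, a sketch of the layout it implies (market names hypothetical). Config in the chosen environment is layered on top of `base`, so each market folder only needs the entries that differ:

```
conf/
├── base/              # shared catalog, parameters, etc.
├── uk/
│   ├── catalog.yml    # uk-specific overrides only
│   └── parameters.yml
└── de/
    ├── catalog.yml
    └── parameters.yml
```

The market is then selected at run time with the built-in flag:

```
kedro run --env=uk
```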
Zubin Roy
Thanks @datajoely and @Guillaume Tauzin, that's worked a treat. Really appreciate the advice!
💛 1