# questions
Zubin Roy
Hey everybody, I have a general/best-practice question. I'm loving Kedro and have created a neat pipeline that trains my models the way I want. Now I want to scale this up across x number of markets, as my model predicts results at the market level. So I thought I'd use a modular pipeline to reuse my pipeline code. I like this approach because it means I don't have to rewrite any code; I just copy across my parameter and catalog files. The only problem is that my catalog and params files are now getting quite large, and I'm wondering whether there's a way to dynamically populate catalog or parameter entries, or whether there's a better approach (within Kedro or generally) to avoid large catalog/parameter files. Or perhaps this is just the trade-off of using modular pipelines. Thanks!
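For context, a minimal sketch of the setup being described, with illustrative node, dataset, and market names: one base pipeline, instantiated once per market via a namespace.

```python
from kedro.pipeline import Pipeline, node, pipeline


def train_model(features, model_options):
    """Stand-in for the real training logic."""
    ...


# The base pipeline, written once.
base = Pipeline(
    [node(train_model, inputs=["features", "params:model_options"], outputs="model")]
)

# One namespaced copy per market. Namespacing prefixes every dataset and
# parameter, e.g. "uk.features" and "params:uk.model_options", which is why
# the catalog and parameter files grow with each market added.
markets = ["uk", "de", "fr"]  # hypothetical market list
training = sum((pipeline(base, namespace=m) for m in markets), Pipeline([]))
```

On the "dynamically populate catalog entries" part specifically, recent Kedro versions also support dataset factories, where one patterned entry stands in for all the per-market copies. A hedged sketch, with made-up dataset names and paths:

```yaml
# catalog.yml (illustrative): "{market}" is filled in from the dataset name
# requested by the pipeline at runtime, e.g. "uk.model" or "de.model".
"{market}.model":
  type: pickle.PickleDataset
  filepath: data/06_models/{market}_model.pkl
```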
datajoely
Zubin Roy
That is exactly what I was after! Can't wait to try this out, thank you!
Guillaume Tauzin
Hi @Zubin Roy, FYI, when following this guide, you can directly use YAML inheritance instead of a `merge_dict` OmegaConf resolver if you think that's better. Both will work, so it's really up to you to decide which you prefer. A small example of what it would look like: https://github.com/gtauzin/kedro-dagster-example/blob/main/conf/dev/parameters_model_tuning.yml
❤️ 2
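A hedged sketch of the YAML-inheritance flavour, with made-up parameter names (see the linked file for the real thing). It relies on standard YAML anchors and merge keys, which PyYAML's safe loader, and hence OmegaConf, supports:

```yaml
# parameters.yml (illustrative): define the shared options once under an
# anchor, then merge them into each market's block with the "<<" merge key.
base_model_options: &base_model_options
  test_size: 0.2
  random_state: 42

uk:
  model_options:
    <<: *base_model_options
    target: uk_sales  # market-specific override

de:
  model_options:
    <<: *base_model_options
    target: de_sales
```

And, for comparison, what wiring up a `merge_dict`-style resolver might look like, assuming it is essentially `OmegaConf.merge` under the hood (check the linked repo for the actual definition):

```python
# settings.py (illustrative): Kedro's OmegaConfigLoader accepts custom
# OmegaConf resolvers via CONFIG_LOADER_ARGS.
from kedro.config import OmegaConfigLoader
from omegaconf import OmegaConf

CONFIG_LOADER_CLASS = OmegaConfigLoader
CONFIG_LOADER_ARGS = {
    "custom_resolvers": {
        # Usable in YAML as: ${merge_dict:${base_block},${override_block}}
        "merge_dict": lambda *blocks: OmegaConf.merge(*blocks),
    }
}
```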
Zubin Roy
Great, thank you!
👍 1
m
What we do is create a Kedro env per market to scale our pipeline. This way, you can reuse all your pipeline code and adjust the params/catalog etc. per market.
👍 1
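To make the per-market environment idea concrete, a sketch of the layout it implies (market names hypothetical). Config in the chosen environment is layered on top of `base`, so each market folder only needs the entries that differ:

```
conf/
├── base/              # shared catalog, parameters, etc.
├── uk/
│   ├── catalog.yml    # uk-specific overrides only
│   └── parameters.yml
└── de/
    ├── catalog.yml
    └── parameters.yml
```

The market is then selected at run time with the built-in flag:

```
kedro run --env=uk
```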
Zubin Roy
Thanks @datajoely and @Guillaume Tauzin, that's worked a treat. Really appreciate the advice!
💛 1