#questions

Camilo Piñón

01/04/2024, 11:35 AM
Hello! I'm currently exploring various modeling approaches and have a question about best practices. Specifically, I'm wondering whether it's better to create separate pipelines (pipeline folders) for each model or to include all the different models within the same pipeline. Keep in mind that they will all be utilizing the same dataset and producing predictions in the same format. Additionally, if there are any other design options I might not have considered, I'd love to hear your thoughts. I understand that the choice may vary depending on the specific use case, but I'm interested in your opinions. Thank you!

Nok Lam Chan

01/04/2024, 11:52 AM
They consume the same input dataset, but should they still have their own intermediate datasets? Are you aware of namespaced pipelines?
Something like the demo: https://demo.kedro.org/ Click into "Train Evaluation" and you should see two parallel pipeline structures with different models.
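To make the namespace idea concrete, here is a minimal plain-Python sketch of the renaming it implies. It is an illustration only and does not use Kedro's API: `Step`, `with_namespace`, and the dataset names (`model_input`, `scaled`, `model`, `predictions`) are all hypothetical. One template is written once, then reused per model, so every intermediate dataset gets a model-specific prefix while the shared input keeps its name.

```python
from typing import NamedTuple


class Step(NamedTuple):
    """A stand-in for a pipeline node (hypothetical, not a Kedro class)."""
    name: str
    inputs: tuple[str, ...]
    outputs: tuple[str, ...]


def with_namespace(steps, namespace, shared):
    """Prefix step and dataset names with `namespace`, leaving `shared` datasets untouched."""
    def rename(dataset):
        return dataset if dataset in shared else f"{namespace}.{dataset}"

    return [
        Step(
            name=f"{namespace}.{s.name}",
            inputs=tuple(rename(d) for d in s.inputs),
            outputs=tuple(rename(d) for d in s.outputs),
        )
        for s in steps
    ]


# One template, defined once (all names are assumptions for illustration).
template = [
    Step("scale", ("model_input",), ("scaled",)),
    Step("train", ("scaled",), ("model",)),
    Step("predict", ("model", "scaled"), ("predictions",)),
]

# Each model gets its own copy; only the input dataset is shared.
linear = with_namespace(template, "linear", shared={"model_input"})
forest = with_namespace(template, "forest", shared={"model_input"})
combined = linear + forest

for step in combined:
    print(step.name, step.inputs, "->", step.outputs)
```

Both copies read `model_input`, but write to separate datasets such as `linear.scaled` and `forest.scaled`, so each model can apply its own scaling without name clashes while the predictions keep the same format.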

Camilo Piñón

01/04/2024, 11:59 AM
Yes, my idea is that they share the same input but can have different intermediate datasets (some models will need different scaling/normalization techniques).
And no, I am not aware of namespaced pipelines, but I will definitely check them out.
Is there any place where the specific code for the demo is available?
Thank you so much for the insights Nok!

Nok Lam Chan

01/04/2024, 1:20 PM