Hello! I'm currently exploring various modeling approaches... Kedro #questions

Camilo Piñón

01/04/2024, 11:35 AM

Hello! I'm currently exploring various modeling approaches and have a question about best practices. Specifically, I'm wondering whether it's better to create separate pipelines (pipeline folders) for each model or to include all the different models within the same pipeline. Keep in mind that they will all be utilizing the same dataset and producing predictions in the same format. Additionally, if there are any other design options I might not have considered, I'd love to hear your thoughts. I understand that the choice may vary depending on the specific use case, but I'm interested in your opinions. Thank you!

Nok Lam Chan

01/04/2024, 11:52 AM

They consume the same input dataset, but should they still have their own intermediate datasets? Are you aware of namespaced pipelines?

Nok Lam Chan

01/04/2024, 11:53 AM

Something like the demo: https://demo.kedro.org/ Click into "Train Evaluation" and you should see two parallel pipeline structures with different models.

Camilo Piñón

01/04/2024, 11:59 AM

Yes, my idea is that they share the same input but may have different intermediate datasets (some models will need different scaling/normalization techniques).

Camilo Piñón

01/04/2024, 11:59 AM

And no, I wasn't aware of namespace pipelines, but I'll definitely check them out.

Camilo Piñón

01/04/2024, 12:00 PM

Is there any place where the specific code for the demo is available?

Camilo Piñón

01/04/2024, 12:00 PM

Thank you so much for the insights Nok!

Nok Lam Chan

01/04/2024, 1:20 PM

https://github.com/kedro-org/kedro-viz/tree/main/demo-project You can find it here
