Best practice when performing inference in Kedro (Kedro #questions)

Rob McInerney

08/16/2024, 3:56 PM

Hi all, what is best practice when performing inference in kedro, when inference input data requires the same pre-processing pipeline steps as a training pipeline? I want to reuse the same pre-processing steps for both training and inference, however I cannot find any documentation on how to do this. Ultimately I’d like to package the entire model, including pre-processing and inference. Any guidance would be helpful

Deepyaman Datta

08/16/2024, 6:15 PM

Something like a scikit-learn pipeline is probably more appropriate for this. You can then call the `train` and `predict` from within nodes. Unfortunately, while it's been explored a number of times, I don't know if there's a great way to create an equivalency between Kedro pipelines + nodes and scikit-learn pipelines + transformers, the way things currently stand. Removing my answer in favor of @Yolan Honoré-Rougé’s :)
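A minimal sketch of that idea, in plain Python with no Kedro or scikit-learn dependency, so every name here (`PackagedModel`, `train_node`, `predict_node`) is hypothetical. It only illustrates the pattern: one object holds both the fitted pre-processing parameters and the model weights, so the same pre-processing runs at training and at inference, and the two plain functions are the kind of thing one might wrap in Kedro nodes.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class PackagedModel:
    """Hypothetical bundle of pre-processing parameters and model weights."""

    center: float = 0.0
    scale: float = 1.0
    slope: float = 0.0
    intercept: float = 0.0

    def _preprocess(self, xs):
        # The single pre-processing step shared by training and inference.
        return [(x - self.center) / self.scale for x in xs]

    def fit(self, xs, ys):
        # Fit the pre-processing parameters on the training data only.
        self.center = mean(xs)
        self.scale = pstdev(xs) or 1.0  # avoid dividing by zero
        zs = self._preprocess(xs)
        # Ordinary least squares with a single feature.
        z_bar, y_bar = mean(zs), mean(ys)
        var = sum((z - z_bar) ** 2 for z in zs)
        cov = sum((z - z_bar) * (y - y_bar) for z, y in zip(zs, ys))
        self.slope = cov / var if var else 0.0
        self.intercept = y_bar - self.slope * z_bar
        return self

    def predict(self, xs):
        # Reuses the parameters fitted in `fit`; nothing is re-fitted here.
        return [self.intercept + self.slope * z for z in self._preprocess(xs)]


def train_node(xs, ys):
    """Candidate function for a training node: returns the fitted bundle."""
    return PackagedModel().fit(xs, ys)


def predict_node(model, xs):
    """Candidate function for an inference node: raw inputs in, predictions out."""
    return model.predict(xs)
```

Because the fitted bundle is a single object, it can be saved as one dataset and loaded by an inference pipeline, which is roughly what "packaging the entire model, including pre-processing" would amount to in this toy setting.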

Yolan Honoré-Rougé

08/16/2024, 6:24 PM

This tries to answer this very question: https://github.com/Galileo-Galilei/kedro-mlflow-tutorial. There is a focus on mlflow, but you can ignore it and adapt to another tool if you want.

Deepyaman Datta

08/17/2024, 2:42 AM

This `pipeline_ml_factory` concept is very interesting! I never knew of it, thanks. I will try it out soon.

Yolan Honoré-Rougé

08/17/2024, 4:06 PM

I don't know how to make it more discoverable. The goal is exactly to create a scikit-learn-like pipeline by "binding" two Kedro pipelines, to have extra flexibility. Many people who discovered this tutorial enjoy it a lot, but they often discover it through me. I'd love to find a way to make it more "googlable".

Deepyaman Datta

08/17/2024, 6:46 PM

I've been giving this canned answer for a while now: https://kedro-org.slack.com/archives/C03QF15L1K9/p1720307969353049?thread_ts=1720246714.100739&cid=C03QF15L1K9 I'll make sure to update. A blog article, or even inclusion in the Kedro FAQ, could be great; I really think this comes up often.

👍 1

Yolan Honoré-Rougé

08/18/2024, 11:21 AM

The tutorial is already mentioned in the above thread ^^' Maybe I should rebrand it as "scikit-learn-like pipelines" to make the concept easier to grasp.

😅 1

Deepyaman Datta

08/18/2024, 4:10 PM

Oh, it was three weeks after. Even though I link to it, I never reread the thread! 😂

😂 1

Rob McInerney

08/19/2024, 9:32 AM

Thanks for the answers @Yolan Honoré-Rougé and @Deepyaman Datta. I actually found the kedro-mlflow tutorial after I posted my original question and have been working through it. I’m really surprised this question doesn’t come up more often though. I would have thought that training and inference pipelines would be a really really common use case with Kedro. Am I missing something?
