# plugins-integrations
about @Hugo Evers' question on inference: reading about the Hopsworks FTI architecture these days, I was thinking: if the output of a Kedro training pipeline is a model that can be serialized (ONNX or your format of choice), and given that latency might be sensitive in some cases, is it worth doing the inference from Kedro? This might sound like I don't want you to use Kedro for inference pipelines (while of course I want you to use Kedro for everything), but I'm seeking to understand how others do it.
Thanks for chiming in! What I'm landing on right now is saving the trained inference pipeline using kedro-mlflow and converting that to the standard bento format. Basically, everything that requires a catalog entry to be loaded should be converted to a runner, so that it gets stored in the bento storage and packaged with the model. The only issue I'm left with is the module structure of the bento and the import of Kedro nodes, which is quite poorly documented. That said, the result of the bento build is quite nice, standardised and very small. I do a poetry export to get a requirements file; I don't know whether building the deps with micromamba is possible without changing the default bento Dockerfile, but it could be worth a shot!
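For the module-structure part, a sketch of the kind of bentofile.yaml this could look like (all names are placeholders: `my_project` stands in for the Kedro package, `service.py` for the Bento service that loads the kedro-mlflow model, and `requirements.txt` is the file produced by the poetry export):

```yaml
# bentofile.yaml -- hypothetical layout, adjust names to your project
service: "service:svc"            # service.py next to this file defines `svc`
include:
  - "service.py"
  - "src/my_project/**/*.py"      # vendor the Kedro node code so imports resolve inside the bento
python:
  requirements_txt: "./requirements.txt"  # e.g. from: poetry export -f requirements.txt -o requirements.txt --without-hashes
docker:
  python_version: "3.10"
```

Including the Kedro source tree under `include` is one way to make the node imports work once the bento is built, since the files then ship alongside the service module.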
Fortunately the tests for BentoML are quite good; they allow you to call some of the CLI methods directly from Python. So I'm thinking we could test whether an inference pipeline is ready for production.
Do you know whether it's possible to use an MLflow hook such that when I tag an MLflow model, it launches some CI/CD in GitHub? Because that would be quite convenient, actually.
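As far as I know, open-source MLflow doesn't fire hooks on tag changes (the Databricks-hosted registry has webhooks for that), so one workaround is a small poller that watches for the tag and triggers a GitHub Actions `repository_dispatch` event. A sketch, where the repo coordinates, the `deploy` tag convention, and the event type are all placeholders:

```python
import json
import urllib.request


def github_dispatch_request(owner: str, repo: str, token: str,
                            event_type: str = "mlflow-model-tagged") -> urllib.request.Request:
    """Build the repository_dispatch POST that a workflow with `on: repository_dispatch` listens for."""
    return urllib.request.Request(
        url=f"https://api.github.com/repos/{owner}/{repo}/dispatches",
        data=json.dumps({"event_type": event_type}).encode(),
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
        method="POST",
    )


def poll_and_dispatch(model_name: str, owner: str, repo: str, token: str) -> None:
    """Check the registry for versions carrying the tag, then fire the dispatch."""
    from mlflow.tracking import MlflowClient  # lazy import; assumes mlflow is installed where this runs

    client = MlflowClient()
    for mv in client.search_model_versions(f"name='{model_name}'"):
        if mv.tags.get("deploy") == "true":  # hypothetical tag convention
            urllib.request.urlopen(github_dispatch_request(owner, repo, token))
            break
```

You'd run the poller on a schedule (cron, or a scheduled GitHub Action) and have the deployment workflow trigger on the dispatch event; not as elegant as a real hook, but it works with a plain MLflow tracking server.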