# questions
Hi everyone! In my team we started to use Kedro recently for data science projects; we have found many advantages and are very happy with it. Now we are facing some challenges implementing the models on Google Cloud and Vertex AI, and I would really appreciate your opinion on these points:

1. We want to apply the data transformation steps (e.g. one-hot encoding, standardization, missing-value imputation) to new data when the model is used for prediction. We know that scikit-learn pipelines can do that, but there are many disadvantages, which were discussed in this thread. There, some of you recommended the plugin to achieve what we want. Here are the questions: once you have the mlflow artifact, can we still use the kedro-docker plugin to create the image, or do we have to create the Docker image from scratch? On the other hand, can we still use the other plugins to export the pipeline to Airflow or Vertex Pipelines?

2. On that basis, we have started to question whether it is better to use mlflow for tracking and the model registry, taking advantage of the Kedro plugins, rather than the Vertex AI APIs. I would like to know your opinion on this, or any recommendations on how to combine both worlds.

Thanks in advance. #questions #plugins-integrations
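To make point 1 concrete, here is a minimal sketch of the underlying idea: bundling the preprocessing steps with the estimator so the exact same transforms are applied at prediction time. The column names and data are hypothetical, and this uses plain scikit-learn rather than any specific Kedro plugin:

```python
# Sketch: fit preprocessing + model as one object so prediction-time
# data gets the same one-hot encoding, scaling, and imputation.
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

num_cols = ["age", "income"]  # hypothetical numeric features
cat_cols = ["city"]           # hypothetical categorical feature

preprocess = ColumnTransformer([
    ("num", Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ]), num_cols),
    ("cat", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]), cat_cols),
])

model = Pipeline([("preprocess", preprocess), ("clf", LogisticRegression())])

# Tiny made-up training set with missing values.
train = pd.DataFrame({
    "age": [25, 32, np.nan, 51],
    "income": [30_000, 52_000, 40_000, np.nan],
    "city": ["Lima", "Cusco", np.nan, "Lima"],
    "target": [0, 1, 0, 1],
})
model.fit(train.drop(columns="target"), train["target"])

# At prediction time, raw new data goes straight in; the pipeline
# applies imputation, scaling, and encoding internally.
new_data = pd.DataFrame({"age": [29], "income": [45_000], "city": ["Arequipa"]})
preds = model.predict(new_data)
print(preds)
```

The same principle applies whichever tool holds the artifact: whatever is persisted for serving should contain the fitted transformers, not just the estimator.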
So the plugins you mention come from different sources, but I can explain how they've come about:
• The first is maintained by the core team, but it's really designed to give people who don't know where to start with Docker something to get going with.
• The second is a community-driven plugin which our users really love, but I'm not sure if anything changes when containerised.
• The third is another community-driven plugin which apparently has great integration with the plugin above.

All in all, how you package or orchestrate your Kedro pipelines isn't something we're super opinionated on. In terms of philosophy, I like the view presented here
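For what it's worth, the kedro-docker workflow is just a few CLI steps, so it sits alongside whatever else is in the project; the image tag below is hypothetical:

```shell
# Generate a default Dockerfile and .dockerignore in the project root
kedro docker init

# Build the image for the current project
kedro docker build --image my-kedro-project:latest

# Run the pipeline inside the container
kedro docker run --image my-kedro-project:latest
```
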

