Hugo Evers 09/19/2023, 1:50 PM
This lets you use any catalog entry (so you can store your model as a pickle on S3, or in MLflow). However, integration with KedroPipeline seems very complicated, as Bento needs to be aware of the model framework. In addition, CI/CD now needs access to the Kedro context, while I'd prefer to simply link the MLflow storage to BentoML and maybe use some adapters for pre- and post-processing.
Additionally, Bento uses its own storage location, which as far as I know I can't move to the cloud (which is quite catastrophic when using Llama-70b, since it will instantly fill up your local storage). Have you found any ways around this? Do you make storing bentos part of your Kedro pipelines? In the code below, I look in the Bento storage and, if the model is not found there, in the Kedro catalog. But this obviously only works when you actively manage it (otherwise you'd be pulling old models). Any thoughts or preferences?
Also, do you use pip, conda or poetry? I'm looking to use different dependency groups to separate model deps from dev/training deps, partly because there have been quite a few breaking changes lately when upgrading packages. Are there any special tricks you employ wrt staggered updating of deps combined with tests? Do you link the poetry deps with MLflow, or do you use Bento's inver_packages?
Also, what are your opinions when it comes to deploying the packaged models to k8s? Do you simply deploy Docker containers directly, or use something like Seldon or KServe? Or even Bento's Yatai? I'm curious!
from pathlib import Path

import bentoml


def _find_kedro_project(current_dir):  # pragma: no cover
    # Walk up from current_dir until a Kedro project root is found.
    from kedro.framework.startup import _is_project

    while current_dir != current_dir.parent:
        if _is_project(current_dir):
            return current_dir
        current_dir = current_dir.parent
    return None


def retrieve_kedro_context(env="local"):
    from kedro.framework.session import KedroSession
    from kedro.framework.startup import bootstrap_project

    project_path = _find_kedro_project(Path.cwd())
    if project_path is None:
        # Fail early instead of passing None to bootstrap_project.
        raise RuntimeError("No Kedro project found above the current directory")
    metadata = bootstrap_project(project_path)
    with KedroSession.create(
        package_name=metadata.package_name,
        project_path=metadata.project_path,
        env=env,
    ) as kedro_session:
        return kedro_session.load_context()


def download_model(name: str) -> bentoml.Model:
    # Prefer the local Bento model store; fall back to the Kedro catalog.
    try:
        return bentoml.transformers.get(name)
    except bentoml.exceptions.NotFound:
        catalog = retrieve_kedro_context().catalog
        pipeline = catalog.load(name)
        return bentoml.transformers.save_model(name, pipeline)


def get_runner(name: str, init_local: bool = False):
    runner = download_model(name).to_runner()
    if init_local:
        # Run the runner in-process (for local debugging and tests).
        runner.init_local(quiet=True)
    return runner
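One way to address the stale-model concern above (the local store keeps serving an old model unless it is actively managed) is to compare a cheap version identifier before trusting the cached copy. The following is only a minimal sketch under stated assumptions: `load_fresh`, `remote_version`, and `load_from_catalog` are hypothetical names, the dict stands in for the Bento model store, and the version string could be something like an MLflow run ID or an S3 ETag. It is not BentoML's or Kedro's API.

```python
from typing import Any, Callable, Dict, Tuple

# Hypothetical stand-in for the local model store: name -> (version, model).
LocalStore = Dict[str, Tuple[str, Any]]


def load_fresh(
    name: str,
    store: LocalStore,
    remote_version: Callable[[str], str],
    load_from_catalog: Callable[[str], Any],
) -> Any:
    """Return the cached model only if its version matches the remote one."""
    wanted = remote_version(name)  # cheap lookup, no model download
    cached = store.get(name)
    if cached is not None and cached[0] == wanted:
        return cached[1]  # cache hit and still current
    # Missing or stale: do the expensive load once and refresh the cache.
    model = load_from_catalog(name)
    store[name] = (wanted, model)
    return model
```

The expensive catalog load only happens when the cached version differs from the remote one, so a large model is not re-downloaded on every start.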
Florian d 09/19/2023, 2:53 PM
datajoely 09/19/2023, 4:52 PM
marrrcin 09/20/2023, 7:09 AM
tionts as Kedro nodes had. Worked OK.
Hugo Evers 09/20/2023, 11:04 AM
marrrcin 09/20/2023, 11:40 AM
class, but I’m not sure how BentoML looks right now; I think there have been some breaking changes since the last time I used it for this case.
Hugo Evers 09/20/2023, 11:42 AM
Juan Luis 09/21/2023, 10:34 AM