#questions

Ed Henry

08/09/2023, 5:43 PM
Anyone dealt with defining datasets for blobs like model archives for TorchServe? Pretty straightforward use case, but was wondering if anyone else had tried anything 🙂

datajoely

08/09/2023, 6:02 PM
Do the different pickle options available not work?

Ed Henry

08/09/2023, 6:05 PM
Unfortunately not - TorchServe expects a model archive that contains various model artifacts and a custom handler (e.g. a custom_handler.py) that handles loading the model, sticking it behind an API, and routing. https://github.com/pytorch/serve/blob/master/model-archiver/README.md I'll probably just create a simple Dataset that uploads / downloads the model archive to the local FS and go from there. Just wanted to see what others might have come up with
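A minimal sketch of that local-FS idea (class and method names here are hypothetical; a real Kedro implementation would subclass kedro.io.AbstractDataset and implement _load / _save / _describe rather than stand alone):

```python
from pathlib import Path
import shutil


class TorchServeArchiveDataset:
    """Sketch of a dataset that stages a TorchServe .mar archive on the
    local filesystem. In a real Kedro project this would subclass
    kedro.io.AbstractDataset; here it is stdlib-only for illustration."""

    def __init__(self, filepath: str):
        self._filepath = Path(filepath)

    def _load(self) -> Path:
        # Return the archive path so downstream nodes can hand it to
        # torchserve / torch-model-archiver tooling.
        if not self._filepath.exists():
            raise FileNotFoundError(f"No model archive at {self._filepath}")
        return self._filepath

    def _save(self, source: Path) -> None:
        # Copy a freshly built .mar file into the catalog-managed location.
        self._filepath.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy(source, self._filepath)
```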

datajoely

08/09/2023, 6:06 PM
I guess you could also subclass / work off `APIDataSet` to do the saving to the endpoint that way
šŸ‘ 1

Ed Henry

08/09/2023, 6:10 PM
That's a great idea! We deploy TorchServe instances per model given the types of models and pipelines we have but this might be useful in the future!

datajoely

08/09/2023, 6:10 PM
Out of interest (this is something tangential I'm working on) would you like it if your Kedro pipeline was a service that also lived as an endpoint?

Ed Henry

08/09/2023, 6:11 PM
It would certainly make deployment of models more seamless
Right now I have a node that calls TorchServe through `os.system`, which is hacky but works. Something more native would be really cool
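One incremental step away from `os.system`, sketched here with stdlib subprocess (function names and paths are illustrative): build the torchserve launch command as an argv list, so there are no shell-quoting issues and the node can actually fail the pipeline on a bad return code.

```python
import subprocess


def torchserve_start_command(mar_path: str, model_store: str) -> list[str]:
    # TorchServe's CLI starts a server and loads an archive via
    # --model-store / --models; an argv list avoids the shell entirely.
    return [
        "torchserve", "--start",
        "--model-store", model_store,
        "--models", mar_path,
    ]


def serve_node(mar_path: str, model_store: str) -> None:
    # Hypothetical Kedro node body: raise if the launch command fails,
    # instead of silently ignoring the exit status as os.system allows.
    result = subprocess.run(torchserve_start_command(mar_path, model_store))
    result.check_returncode()
```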

datajoely

08/09/2023, 6:14 PM
super interesting thank you
šŸ‘ 1