Hi team I need advice on using `ONNX` files and uploading th Kedro #questions


Georgi Iliev

06/16/2023, 7:56 AM

Hi team! I need advice on using `ONNX` files and uploading them to S3 automatically using "only" the catalog definition. Broadly speaking, the main flow of what we're trying to build is the following:

1. There is a process that trains and creates some files (PCA, scaler, some K-Means models, etc.) and saves them as `Pickle` to use them between different nodes.
2. Once the main `pipeline` is done, we're ready to distribute the model to our services.
3. We're using `ONNX` because our services are not built in Python and the ONNX libraries we use are a bit faster.
4. So taking this into account, we now have a `publish` pipeline that takes these `Pickle` files, converts them to `ONNX` using `convert_sklearn`, and then uploads them to S3.

So, my main question here is: Is there a way to implement this so the transformation and the S3 upload are done automatically?

• I know that we can specify an S3 path in the catalog, but I didn't see how to set the `.onnx` file type.
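For reference, the manual `publish` step described above might look roughly like the sketch below. It assumes `skl2onnx` for `convert_sklearn` and `fsspec` (with `s3fs` installed) for the upload; the function names, the `"input"` tensor name, and the paths are hypothetical and not taken from the actual project.

```python
import pickle
from pathlib import PurePosixPath


def onnx_target_path(pickle_path: str, s3_prefix: str) -> str:
    """Map e.g. 'data/06_models/pca.pkl' to '<s3_prefix>/pca.onnx'."""
    stem = PurePosixPath(pickle_path).stem
    return f"{s3_prefix.rstrip('/')}/{stem}.onnx"


def pickle_to_onnx_bytes(pickle_path: str, n_features: int) -> bytes:
    """Load a pickled scikit-learn estimator and serialize it as ONNX."""
    # Imported lazily so this module can be loaded without skl2onnx installed.
    from skl2onnx import convert_sklearn
    from skl2onnx.common.data_types import FloatTensorType

    with open(pickle_path, "rb") as f:
        model = pickle.load(f)

    # Assumption: a single float input of shape (batch, n_features).
    # "input" is an arbitrary tensor name chosen for this sketch.
    initial_types = [("input", FloatTensorType([None, n_features]))]
    onnx_model = convert_sklearn(model, initial_types=initial_types)
    return onnx_model.SerializeToString()


def publish(pickle_path: str, n_features: int, s3_prefix: str) -> str:
    """Convert one pickle and upload it; returns the destination URL."""
    import fsspec  # s3:// URLs additionally require the s3fs package

    target = onnx_target_path(pickle_path, s3_prefix)
    data = pickle_to_onnx_bytes(pickle_path, n_features)
    with fsspec.open(target, "wb") as f:
        f.write(data)
    return target
```

Something like `publish("data/06_models/pca.pkl", n_features=10, s3_prefix="s3://data/06_models")` would then write `pca.onnx` under that prefix. The question is whether a catalog entry can replace this hand-written conversion and upload.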


Juan Luis

06/16/2023, 8:03 AM

Hi @Georgi Iliev! There's a kedro-onnx community plugin, https://github.com/nickolasrm/kedro-onnx, created by @Nickolas da Rocha Machado. @Melle van der Linde tried it a couple of months ago and reported that it still works nicely: https://kedro-org.slack.com/archives/C03RKPCLYGY/p1682073587640849?thread_ts=1682070059.869009&cid=C03RKPCLYGY The catalog entry would supposedly look like this:


# conf/base/catalog.yml
regressor:
  type: kedro_onnx.io.OnnxDataSet
  filepath: s3://data/06_models/reg.onnx
  backend: sklearn

(adapted from https://kedro-onnx.readthedocs.io/en/latest/usage.html) let me know if that works for you!

Georgi Iliev

06/16/2023, 8:11 AM

TYVM! I'll test it and let you know!

