#plugins-integrations

William Caicedo

11/21/2023, 11:11 AM
Would it be possible to do something like
with KedroSession.create(...) as session:
  session.run(SageMakerPipelinesRunner())
and run a pipeline with
kedro-sagemaker
from inside a Jupyter notebook?

datajoely

11/21/2023, 11:14 AM
is this a sagemaker notebook or a local jupyter?

William Caicedo

11/21/2023, 11:14 AM
Local jupyter

datajoely

11/21/2023, 11:15 AM
so not really - I think you could write a custom runner to maybe do this
👍 1

William Caicedo

11/21/2023, 11:16 AM
Yeah I’m using
kedro sagemaker run
from the CLI
It’s just that I have a script that launches multiple Kedro sessions locally using the
with …
approach, and I wanted to check whether the script could launch the pipelines in SageMaker instead, using the plugin runner rather than the default runner

datajoely

11/21/2023, 11:20 AM
@marrrcin any ideas here?

marrrcin

11/21/2023, 11:29 AM
If you’re using Jupyter you could just do:
for i in range(10):
    python_variable = f"pipeline_to_run_{i}"
    !kedro sagemaker run --pipeline={python_variable}
as a hack
with KedroSession.create(...) as session:
  session.run(SageMakerPipelinesRunner())
this approach will not work, since SageMaker needs a Docker image to run your code. The idea behind the plugin is to stick to local execution (just using
kedro run
) as long as possible and then, once the pipeline is implemented, run it on SageMaker. SageMaker itself is not the greatest service IMHO.
👍 1

William Caicedo

11/21/2023, 11:37 AM
Agree that SageMaker pipelines don’t offer a great user experience. All development occurs on my local machine or in a dev container living in an EC2 instance, and I’m exploring SageMaker pipelines to run multiple copies of the same pipeline at scale. The Docker image with my Kedro project already lives in ECR and I’m able to run it with no issues. What I’m trying to establish now is whether I can use the same approach I use locally to run multiple pipelines programmatically, to trigger the runs in SageMaker (i.e. not building and pushing a Docker image to ECR but just doing a
kedro sagemaker run
). If this is not possible, I guess I can launch a `subprocess.run(["kedro","sagemaker","run"])` and somehow pass different parameters to each run? Or maybe there’s a whole different approach that works best in this case.

marrrcin

11/21/2023, 1:34 PM
1. When you change the code, you have to rebuild and push the Docker image to ECR in order for SageMaker to have your latest code (that’s why we’ve added the
--auto-build
flag to
kedro sagemaker run
). 2. If you want to run the same pipeline multiple times with different parameters in SageMaker, you can use
kedro sagemaker run --params='<params JSON>'
. Note that this
--params
is different from
kedro run --params
, because we DO support dicts/lists etc., so you can override any parameter.
👍 1
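The `subprocess` idea and the `--params` flag discussed above can be combined. Below is a minimal sketch, assuming the script runs from the Kedro project root, that `kedro-sagemaker` is installed, and that the Docker image is already in ECR. Only the `--pipeline` and `--params` flags come from this thread; the helper names, pipeline names, and parameter keys are hypothetical, and whether both flags can be passed together exactly like this has not been verified against the plugin.

```python
import json
import subprocess
from typing import Any, Dict, List, Optional, Sequence, Tuple


def build_sagemaker_command(
    pipeline: str, params: Optional[Dict[str, Any]] = None
) -> List[str]:
    """Build the argument list for one `kedro sagemaker run` call (hypothetical helper)."""
    cmd = ["kedro", "sagemaker", "run", f"--pipeline={pipeline}"]
    if params:
        # The thread says the plugin's --params takes a JSON string,
        # so nested dicts and lists are serialized as-is.
        cmd.append(f"--params={json.dumps(params)}")
    return cmd


def launch_runs(
    runs: Sequence[Tuple[str, Optional[Dict[str, Any]]]], dry_run: bool = True
) -> List[List[str]]:
    """Build one command per (pipeline, params) pair; execute them unless dry_run is set."""
    commands = [build_sagemaker_command(name, params) for name, params in runs]
    for cmd in commands:
        if not dry_run:
            # check=True raises CalledProcessError if the CLI exits non-zero.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # Hypothetical pipeline name and parameter keys, for illustration only.
    planned = launch_runs(
        [
            ("training", {"model": {"learning_rate": 0.01}}),
            ("training", {"model": {"learning_rate": 0.1}}),
        ],
        dry_run=True,
    )
    for cmd in planned:
        print(" ".join(cmd))
```

Passing the arguments as a list avoids shell quoting problems with the JSON string. Keeping `dry_run=True` lets you inspect the commands before anything is submitted to SageMaker.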

Nok Lam Chan

11/21/2023, 4:55 PM
If you want to run the same pipeline multiple times with different parameters in SageMaker, you can use
kedro sagemaker run --params='<params JSON>'
. Note that this
--params
is different from
kedro run --params
, because we DO support dicts/lists etc., so you can override any parameter.
Is it the same syntax that
kedro run
uses? Is it possible to add this back to
kedro
?

William Caicedo

11/21/2023, 5:59 PM
Thank you @marrrcin - very helpful. I can’t use
auto-build
because I use Apple silicon and need to specify a target architecture for the image. I might try my hand at contributing such an option to the plugin. Also, regarding #2, is it possible to specify tags/pipeline as well?

marrrcin

11/21/2023, 9:22 PM
@William Caicedo pipeline - yes; tags - no (as of now)
👍 1