Hi Everyone, I am running a Kedro Pipeline on AWS ...
# questions
j
Hi everyone, I am running a Kedro pipeline on AWS Step Functions with Lambda. I use S3 for the data. Everything works fine. However, whenever I add torch
torch==2.0.1+cpu -f <https://download.pytorch.org/whl/torch_stable.html>
torchvision==0.15.2+cpu -f <https://download.pytorch.org/whl/torch_stable.html>
the Lambda is not able to access S3 and fails with
Install s3fs to access S3
If I install everything locally on my Linux machine and run
kedro run
it runs fine. Has anyone come across this problem, or does anyone have an idea how to fix it?
j
torch is famous for requiring lots of RAM to install (see https://github.com/pypa/pip/issues/9678 ). I'm not sure if that's the problem, but it's worth looking into.
j
Thanks for your answer 🙂 I am deploying the pipeline as a Docker image, so the libraries are already installed. Also, my Lambda uses at most 150 of its 1024 MB of memory, so I guess this is not the problem.
j
okay, after re-reading your question I think I understand better. it's weird that installing torch has any effect. can you verify that https://pypi.org/project/s3fs/ is listed in your requirements and installed?
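One way to run that check inside the deployed image is sketched below. It separates "the package is installed" from "the module actually imports", because as far as I know the "Install s3fs to access S3" message is raised by fsspec whenever importing s3fs fails, which can also happen when s3fs is present but one of its dependencies is broken. The helper name check_package and the list of packages to inspect are my own choices for illustration, not something from the thread.

```python
import importlib
import importlib.metadata


def check_package(dist_name, module_name=None):
    """Report the installed version of a distribution and whether its module imports.

    Hypothetical helper for illustration; not part of Kedro, fsspec or s3fs.
    """
    module_name = module_name or dist_name
    report = {"installed_version": None, "import_error": None}

    # Step 1: is the distribution present in this environment at all?
    try:
        report["installed_version"] = importlib.metadata.version(dist_name)
    except importlib.metadata.PackageNotFoundError:
        pass

    # Step 2: does the import itself succeed? Keep the real error text,
    # since a generic "install X" hint can hide the underlying failure.
    try:
        importlib.import_module(module_name)
    except Exception as exc:
        report["import_error"] = f"{type(exc).__name__}: {exc}"

    return report


if __name__ == "__main__":
    # Assumed list of packages worth inspecting; adjust to your requirements.
    for name in ("s3fs", "fsspec", "aiobotocore", "botocore"):
        print(name, check_package(name))
```

Running this in the same container image that the Lambda uses, once with and once without the torch lines in the requirements, should show whether s3fs disappears, changes version, or stops importing.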
j
yes, it is installed. I even tried installing it as a last build step in my Docker image
j
does it work if you leave Kedro out of the equation, e.g. by trying to access an S3 object directly? it would be important to narrow down the issue
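A minimal sketch of such a Kedro-free check, assuming fsspec is the layer that resolves s3:// paths (which is my understanding of how Kedro datasets reach S3). The function name probe and the command-line usage are hypothetical; the S3 URL must be supplied by you. It also prints the chained cause of the exception, which may reveal the real import failure behind the generic message.

```python
import sys


def probe(url, n_bytes=16):
    """Try to read the first bytes of `url` through fsspec, without Kedro.

    Hypothetical helper for illustration. Returns ("ok", data) or
    ("error", description) instead of raising, so it is easy to log.
    """
    try:
        import fsspec  # imported lazily so a missing fsspec is reported too

        with fsspec.open(url, "rb") as f:
            return ("ok", f.read(n_bytes))
    except Exception as exc:
        message = f"{type(exc).__name__}: {exc}"
        # If the error was raised "from" another one, include the original.
        cause = exc.__cause__
        if cause is not None:
            message += f" (caused by {type(cause).__name__}: {cause})"
        return ("error", message)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python probe_s3.py s3://<your-bucket>/<your-key>
    print(probe(sys.argv[1]))
```

If this fails in the Lambda image with the same message, the problem lies in the Python environment rather than in Kedro; if it succeeds, the Kedro configuration is the next place to look.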