# questions
a
Hello everyone, I'm facing a bit of a challenge while using Kedro, and I don't know whether there's an obvious solution I'm missing. I want to save a file to an S3 bucket, and that file is then used by the next node. But I don't want the next node to read from the S3 bucket; it should use the output of the previous node directly. Can I achieve this with the data catalog?
d
You essentially need to create two outputs: one that writes to a dataset pointing to an S3 bucket, and another that goes to a `MemoryDataset`. The `MemoryDataset` is then consumed by the next node. There are also more sophisticated ways to do this, e.g. https://github.com/deepyaman/kedro-accelerator (for inspiration; it's not currently maintained and not compatible out of the box with Kedro 0.18.x, AFAIK).
a
Thank you for the response. I did exactly that, but the challenge is that if I want to run the pipeline starting from the next node, I can't... or can I?
l
Is this not what `CachedDataset` does? 🤔 I haven't used it myself, but this seems like the right use case: https://docs.kedro.org/en/stable/kedro.io.CachedDataset.html
🏆 3
d
Ah, nice, I kinda forgot how it worked. Yes, `CachedDataset` is the way to go for this specifically. Thanks @Lodewic van Twillert!!
👍 2
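For reference, a minimal sketch of how `CachedDataset` could be wired up programmatically (the entry name and bucket path are hypothetical; in a project you would normally declare this in `catalog.yml` instead):

```python
from kedro.io import CachedDataset, DataCatalog
from kedro_datasets.pandas import CSVDataset

catalog = DataCatalog(
    {
        # save() writes through to the wrapped S3 dataset AND keeps the data
        # in memory, so a later load() in the same run is served from the
        # in-memory copy instead of round-tripping through S3.
        "clean_df": CachedDataset(
            dataset=CSVDataset(filepath="s3://my-bucket/clean.csv"),  # hypothetical path
        ),
    }
)
```

This also addresses the follow-up above: when the pipeline is started from the next node, the cache is empty, so `CachedDataset` falls back to loading the data from S3.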