João Dias
02/18/2025, 12:34 PM
I'm trying to use `tensorflow.TensorFlowModelDataset` with an S3 bucket. The model saves fine locally, but when I configure it to save/load directly from S3, it doesn't work.
Some key points:
• Credentials are fine – I can load other datasets (preprocessing outputs and split data) from S3 without issues.
• Uploading manually works – if I explicitly upload the model file using `boto3` or another script, I can access it in S3 just fine (see the sketch after this list).
• Had issues with `.h5` models – initially I could retrieve `.h5` files from S3, but loading did not work properly, so I switched to the `.keras` format, which works fine when handling files manually.
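For context, the manual upload that works is essentially just this (bucket name and object key are placeholders, not the real ones):
```
import boto3

# Plain boto3 upload of the locally saved model - this succeeds with the same
# AWS credentials, which is why I believe they are configured correctly.
s3 = boto3.client("s3")
s3.upload_file(
    Filename="data/06_models/lstm_model.keras",  # local file produced by the pipeline
    Bucket="my-bucket",                          # placeholder bucket name
    Key="data/06_models/lstm_model.keras",       # placeholder object key
)
```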
Has anyone successfully used `tensorflow.TensorFlowModelDataset` with S3? Is there a recommended workaround or configuration to get it working? Any insights would be much appreciated!
To make it clearer: I am only having problems when the node output - the model - is pointed to S3. I am getting Access Denied even after checking credentials, IAM policies, and testing with a manual script.
Hall
02/18/2025, 12:34 PM
Elena Khaustova
02/18/2025, 2:25 PM
> I am only having problems when the node output - the model - is pointed to S3.
João Dias
02/18/2025, 2:33 PM
Elena Khaustova
02/18/2025, 3:34 PM
`TensorFlowModelDataset` first writes the model to a local temporary directory.
Elena Khaustova
02/18/2025, 3:34 PM
```
def save(self, data: tf.keras.Model) -> None:
    save_path = get_filepath_str(self._get_save_path(), self._protocol)
    with tempfile.TemporaryDirectory(prefix=self._tmp_prefix) as tempdir:
        if self._is_h5:
            path = str(PurePath(tempdir) / TEMPORARY_H5_FILE)  # noqa: PLW2901
        else:
            # We assume .keras
            path = str(PurePath(tempdir) / TEMPORARY_KERAS_FILE)  # noqa: PLW2901
        tf.keras.models.save_model(data, path, **self._save_args)
        # Use fsspec to take from local tempfile directory/file and
        # put in ArbitraryFileSystem
        self._fs.put(path, save_path)
```
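So the last step on save is an fsspec `put` to S3. One way to isolate whether that step is what's being denied is to reproduce it outside Kedro with the same credentials - a minimal sketch, where key/secret and the bucket path are placeholders:
```
import tempfile
import s3fs

# Build the same kind of filesystem object Kedro creates from the dev_s3 credentials.
fs = s3fs.S3FileSystem(key="YOUR_ACCESS_KEY_ID", secret="YOUR_SECRET_ACCESS_KEY")

# Write a dummy local file and try the same put the dataset performs.
with tempfile.NamedTemporaryFile(suffix=".keras", delete=False) as tmp:
    tmp.write(b"dummy bytes")
    local_path = tmp.name

# If this also raises Access Denied, the problem is in the s3fs layer
# (credentials or bucket policy), not in TensorFlowModelDataset itself.
fs.put(local_path, "s3://my-bucket/data/06_models/lstm_model.keras")
```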
Elena Khaustova
02/18/2025, 3:35 PM
Could you check that `save_path` is what you expect it to be and that the local copy of the model is created on save?
João Dias
02/18/2025, 4:24 PM
I tried saving the `.keras` model directly to S3, but I'm getting an "Access Denied" error.
• Saving both `.h5` and `.keras` locally as node outputs works fine, and the `file` command confirms they are correctly saved.
• I was able to manually upload both files to S3, download them, and load them successfully in TensorFlow.
• I can read and write other datasets to the same S3 bucket without issues, including a `.pkl` scaler in the same directory where I want to store the models.
• However, when modifying `catalog.yml` to save `.keras` directly to S3, Kedro throws "Access Denied", even though `.h5` files uploaded fine through a script and other datasets were saved to the same bucket during preprocessing.
• IAM permissions should not be the issue since other S3 writes work.
I assume this setup is also correct:
```
lstm_model:
  type: tensorflow.TensorFlowModelDataset
  filepath: s3://my-bucket/data/06_models/lstm_model.keras
  credentials: dev_s3
```
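For what it's worth, the same save can be reproduced without the pipeline by instantiating the dataset directly with the arguments from this entry - a rough sketch, assuming kedro-datasets is installed and with placeholder credentials in place of the `dev_s3` alias:
```
import tensorflow as tf
from kedro_datasets.tensorflow import TensorFlowModelDataset

# Same arguments as the catalog entry above, with the dev_s3 alias replaced by
# an inline credentials dict (placeholders - these are forwarded to s3fs).
dataset = TensorFlowModelDataset(
    filepath="s3://my-bucket/data/06_models/lstm_model.keras",
    credentials={"key": "YOUR_ACCESS_KEY_ID", "secret": "YOUR_SECRET_ACCESS_KEY"},
)

# A tiny dummy model is enough to exercise the save path.
model = tf.keras.Sequential([tf.keras.Input(shape=(4,)), tf.keras.layers.Dense(1)])
dataset.save(model)  # should surface the same Access Denied if the S3 write is the problem
```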
Elena Khaustova
02/18/2025, 4:35 PM
João Dias
02/18/2025, 4:36 PM