tom kurian
12/06/2023, 2:07 PM
VersionNotFoundError: Did not find any versions for SparkDataset(file_format=parquet, filepath=<s3://bucket/folder/file_name>,
load_args={'header': True}, save_args={'mode': overwrite}, version=Version(load=None, save='2023-12-06T13.06.41.920Z'))
config file:
_pq: &_pq
  type: spark.SparkDataSet
  file_format: parquet
  versioned: True
  load_args:
    header: True
  save_args:
    mode: overwrite

model_input.narrow_master.narrow_master:
  filepath: ${base}/model_input/master
  <<: *_pq
kedro 0.18.14
kedro-datasets 1.8.0
Kedro Versions,
What am I doing wrong here?
Deepyaman Datta
12/06/2023, 2:11 PM
tom kurian
12/06/2023, 2:13 PM
tom kurian
12/06/2023, 2:13 PM
Deepyaman Datta
12/06/2023, 2:15 PM
"file paths I changed due to privacy issue,"
Yes, but they're not matching. 🙂 If you can sanitize it in a way that's still analogous to the actual structure, that would be helpful, because the issue is quite likely related to the file paths/existing files at that path.
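For context on why existing files at the path matter: as far as I understand Kedro's versioning, a versioned dataset is stored at `<filepath>/<version>/<filename>`, and when no load version is given the latest one is found by searching `<filepath>/*/<filename>`. The sketch below mimics that lookup with the standard library only, on a local temp directory rather than S3. The helper names `versioned_path` and `latest_version` are made up for illustration, and `LookupError` stands in for Kedro's `VersionNotFoundError`.

```python
import tempfile
from pathlib import Path


def versioned_path(filepath: Path, version: str) -> Path:
    # Assumed layout: <filepath>/<version>/<filename>
    return filepath / version / filepath.name


def latest_version(filepath: Path) -> str:
    # Search <filepath>/*/<filename>; timestamp-style versions sort chronologically.
    candidates = sorted(filepath.glob(f"*/{filepath.name}"))
    if not candidates:
        # Stand-in for Kedro's VersionNotFoundError.
        raise LookupError(f"Did not find any versions for {filepath}")
    return candidates[-1].parent.name


with tempfile.TemporaryDirectory() as tmp:
    master = Path(tmp) / "model_input" / "master"

    # Data written directly at the path, with no version folder:
    # the lookup finds nothing to load.
    master.mkdir(parents=True)
    (master / "part-00000.parquet").touch()
    try:
        latest_version(master)
    except LookupError as exc:
        unversioned_error = str(exc)

    # The same dataset saved under a version folder is discoverable.
    versioned_path(master, "2023-12-06T13.06.41.920Z").mkdir(parents=True)
    found = latest_version(master)
```

If the data at the real path was written before `versioned: True` was set, or sits one level above or below where the version folders are expected, a lookup of this kind would come back empty in the same way.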
marrrcin
12/06/2023, 2:42 PM
Deepyaman Datta
12/06/2023, 3:05 PM
SparkDataset instances on projects, back when I used it years ago. Not sure if something's fundamentally changed.
To double-check, I also looked through the issue tracker, and saw tests for versioning on different filesystems in https://github.com/kedro-org/kedro-plugins/blob/main/kedro-datasets/tests/spark/test_spark_dataset.py.
So, I personally do think it should work. 🙂
marrrcin
12/06/2023, 9:07 PM
datajoely
12/06/2023, 10:24 PM