# questions
Hi all, while saving a versioned Spark Parquet dataset I am hitting the following error:
VersionNotFoundError: Did not find any versions for SparkDataset(file_format=parquet, filepath=<s3://bucket/folder/file_name>, 
load_args={'header': True}, save_args={'mode': overwrite}, version=Version(load=None, save='2023-12-06T13.06.41.920Z'))
config file:
_pq: &_pq
  type: spark.SparkDataSet
  file_format: parquet
  versioned: True
  load_args:
    header: True
  save_args:
    mode: overwrite

  filepath: ${base}/model_input/master
  <<: *_pq
kedro                            0.18.14
kedro-datasets                   1.8.0
Those are my Kedro versions. What am I doing wrong here?
Your filepaths in logs and catalog entry aren't matching in your sanitized example, so hard to tell. Also, can you share what files are available under the path? If I had to guess with this incomplete information, I think you saved a non-versioned dataset before, didn't clean it up, and are now trying to save a versioned dataset, and Kedro is getting confused looking at the existing file structure. 🙂
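One way to check that guess is to look at what sits directly under the dataset path. As far as I know, Kedro writes versioned datasets to `<filepath>/<version>/<filename>`, with version folders named like the timestamp in the error (`2023-12-06T13.06.41.920Z`). The sketch below is plain Python, not a Kedro API: `classify_children` and the example keys are hypothetical, and the object listing is assumed to come from whatever S3 tool you already use.

```python
import re
from typing import Dict, List

# Version folder names in the style shown in the error message,
# e.g. 2023-12-06T13.06.41.920Z (assumed to be the default format).
VERSION_RE = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}\.\d{2}\.\d{2}\.\d{3}Z$")


def classify_children(dataset_path: str, object_keys: List[str]) -> Dict[str, List[str]]:
    """Split the first-level children of dataset_path into version folders and everything else."""
    prefix = dataset_path.rstrip("/") + "/"
    versions, others = set(), set()
    for key in object_keys:
        if not key.startswith(prefix):
            continue  # not under this dataset path
        child = key[len(prefix):].split("/", 1)[0]
        if not child:
            continue  # the key is the folder marker itself
        if VERSION_RE.match(child):
            versions.add(child)
        else:
            others.add(child)
    return {"versions": sorted(versions), "others": sorted(others)}


# Hypothetical listing, as an earlier non-versioned Spark save might leave behind.
keys = [
    "bucket/folder/master/_SUCCESS",
    "bucket/folder/master/part-00000.snappy.parquet",
]
result = classify_children("bucket/folder/master", keys)
print(result)
```

If `others` is non-empty and `versions` is empty, the path probably still holds output from a non-versioned save, which would fit the guess above. Whether that is really what triggers the `VersionNotFoundError` here is not confirmed.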
I changed the file paths due to privacy concerns,
let me check the path
Yes, but they're not matching. 🙂 If you can sanitize it in a way that's still analogous to the actual structure, that would be helpful, because the issue is quite likely related to the file paths/existing files at that path.
Same problem 👆🏻
@marrrcin I'm not confident on that. My memory is hazy, but I think we used to version all our
instances on projects, back when I used it years ago. Not sure if something's fundamentally changed. To double-check, I also looked through the issue tracker and saw tests for versioning on different filesystems in https://github.com/kedro-org/kedro-plugins/blob/main/kedro-datasets/tests/spark/test_spark_dataset.py. So, I personally do think it should work. 🙂
@datajoely how is it then?
I’m not entirely sure. I thought it wasn’t compatible