# questions
r
Hi everyone 👋 I'm running into an issue and would appreciate your insights. I'm getting the following error: However, the file mentioned doesn't exist by design — my code is supposed to create it later only if there's data to write. In some cases, the process can return an empty DataFrame, so there's simply nothing to save. Have you encountered this kind of situation before? Do you have any suggestions on how to properly handle this case, either by checking upstream or conditionally skipping the writing step? Thanks a lot in advance! 🙏
e
Hi @Rachid Cherqaoui, it looks like the error you're seeing is caused by overwriting a versioned dataset. Try clearing the outputs and running your pipeline again. As for the conditional output: we do not recommend this approach in general and do not have built-in functionality for it, but you can subclass an existing dataset and override `save()` to skip writing if the input is empty or `None`.
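A minimal sketch of that subclassing pattern, using only the standard library. `CsvFileDataset` is a hypothetical stand-in for whichever dataset class you already use, not Kedro's actual API, and it takes a list of dicts rather than a DataFrame; only the shape of the `save()` override is the point here.

```python
import csv
from pathlib import Path


class CsvFileDataset:
    """Hypothetical stand-in for an existing dataset class (not Kedro's API)."""

    def __init__(self, filepath):
        self.filepath = Path(filepath)

    def save(self, rows):
        # rows: a non-empty list of dicts that share the same keys
        with self.filepath.open("w", newline="") as f:
            writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
            writer.writeheader()
            writer.writerows(rows)


class SkipEmptyCsvDataset(CsvFileDataset):
    """Same dataset, but the write is skipped when there is nothing to save."""

    def save(self, rows):
        if rows is None or len(rows) == 0:
            return  # nothing to write, so no file is created
        super().save(rows)
```

With a real DataFrame the guard would more likely be something like `data is None or data.empty`; the exact method to override depends on the dataset class and the Kedro version in use.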
r
Thanks for your reply! I made sure there's no overwrite happening: the file doesn't exist to begin with, the directory is empty before running Kedro, and I haven't executed this pipeline before. Still, I'm getting this error during the run, even though the file isn't there yet. Could you explain a bit more, or share an example of how I could work around this? Ideally, I’d like to persist even an empty DataFrame, with just the column names (like in the screenshot), because it helps keep a trace of which files were generated, even if no data was found. Thanks again!
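For reference, the "header only" trace file described here can be sketched with the standard library as below. `write_csv_with_header` and its parameters are hypothetical names chosen for illustration; with pandas, calling `to_csv` on an empty DataFrame that still has its columns produces the same kind of header-only file.

```python
import csv
from pathlib import Path


def write_csv_with_header(filepath, columns, rows):
    """Write rows under columns; an empty rows list yields a header-only file."""
    with Path(filepath).open("w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(columns)  # the header is always written
        writer.writerows(rows)    # writes nothing when rows is empty
```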
e
Could you please share a minimal example of your code, so I could debug on my side?
r
I just found the issue. Even though the error message didn’t point to the real problem, I tried renaming my file with a shorter name — and it worked perfectly. So the actual problem was due to the file name being too long and hitting the system's file name length limit. Thanks ^^
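In case it helps anyone hitting the same thing, here is a rough pre-flight check for over-long names. The 255-byte limit is an assumption (a common per-name cap on many filesystems, but not universal), and `overlong_components` is a hypothetical helper, not part of Kedro.

```python
from pathlib import PurePath

# Assumption: many filesystems cap a single file or directory name at 255 bytes.
NAME_LIMIT_BYTES = 255


def overlong_components(path, limit=NAME_LIMIT_BYTES):
    """Return the path components whose UTF-8 encoding exceeds `limit` bytes."""
    return [
        part
        for part in PurePath(path).parts
        if len(part.encode("utf-8")) > limit
    ]
```

An empty result means every component fits under the assumed limit; anything returned is a candidate for renaming.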