Rachid Cherqaoui
06/30/2025, 9:11 AM
I have a .txt file generated by a Kedro pipeline that I created, and I'd like to send it to a folder on a remote server via SFTP.
After several attempts, I found it quite tricky to handle this cleanly within Kedro, especially while keeping things consistent with its data catalog and hooks system.
Would anyone be able to help or share best practices on how to achieve this with Kedro?
Thanks in advance for your support!
Rachid Cherqaoui
06/30/2025, 9:41 AM
Jitendra Gundaniya
06/30/2025, 9:49 AM
Rachid Cherqaoui
06/30/2025, 9:52 AM
Jitendra Gundaniya
06/30/2025, 9:59 AM
You can use the after_dataset_saved hook to upload files through SFTP.
Please check out the hook docs.
Rachid Cherqaoui
06/30/2025, 12:49 PM
I tried the after_dataset_saved hook, but I haven't managed to get it working properly, especially since my dataset is a plain .txt file (a TextDataset). I'm a bit unsure how to extract the right file path (with versioning), and how to connect that to the SFTP upload.
Would it be possible for you to share a concrete example of how to use after_dataset_saved to upload a versioned .txt file to an SFTP server? That would help a lot.
Thanks in advance!
Sajid Alam
06/30/2025, 2:22 PM
from pathlib import Path

from kedro.framework.hooks import hook_impl
from kedro.io import DataCatalog


class SFTPUploadHooks:
    @hook_impl
    def after_catalog_created(self, catalog: DataCatalog) -> None:
        # after_dataset_saved does not receive the catalog, so keep a reference here
        self._catalog = catalog

    @hook_impl
    def after_dataset_saved(self, dataset_name: str) -> None:
        datasets_to_upload = ["your_text_output"]  # Replace with your own
        if dataset_name not in datasets_to_upload:  # Some logic to decide which to upload
            return
        # Get the dataset from the catalog (private attribute, may differ between Kedro versions)
        dataset = self._catalog._datasets[dataset_name]
        base_path = Path(dataset._filepath)
        if getattr(dataset, "_version", None):
            # Versioned datasets are stored as <filepath>/<version>/<filename>
            load_version = dataset.resolve_load_version()
            local_file_path = base_path / load_version / base_path.name
        else:
            # For non-versioned datasets
            local_file_path = base_path
        # Upload to SFTP Example
        self._upload_to_sftp(local_file_path, dataset_name)
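For the _upload_to_sftp part, here is a minimal sketch of what the upload itself could look like, assuming the third-party paramiko library is installed and that password authentication is acceptable. The function names (remote_path_for, upload_file, upload_to_sftp) and the host, username, password and remote_dir parameters are placeholders I made up for illustration, not anything Kedro provides.

```python
from pathlib import Path, PurePosixPath


def remote_path_for(local_file_path: Path, remote_dir: str) -> str:
    """Build the destination path on the server (SFTP paths use forward slashes)."""
    return str(PurePosixPath(remote_dir) / Path(local_file_path).name)


def upload_file(sftp, local_file_path: Path, remote_dir: str) -> str:
    """Upload one file with an already-open SFTP client and return the remote path."""
    remote_path = remote_path_for(local_file_path, remote_dir)
    sftp.put(str(local_file_path), remote_path)
    return remote_path


def upload_to_sftp(
    local_file_path: Path,
    host: str,
    username: str,
    password: str,
    remote_dir: str,
    port: int = 22,
) -> str:
    """Open an SSH connection with paramiko, upload the file, then clean up."""
    import paramiko  # third-party; imported lazily so the helpers above work without it

    client = paramiko.SSHClient()
    client.load_system_host_keys()  # only hosts already in known_hosts are trusted
    try:
        client.connect(host, port=port, username=username, password=password)
        sftp = client.open_sftp()
        try:
            return upload_file(sftp, local_file_path, remote_dir)
        finally:
            sftp.close()
    finally:
        client.close()
```

Inside the hook class, _upload_to_sftp could then simply call upload_to_sftp with the resolved local_file_path. The host and credentials are probably best read from your Kedro credentials configuration rather than hard-coded in the hook.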