Patrick Deutschmann
01/27/2023, 1:19 PM
FlorianGD
01/27/2023, 1:20 PM
In catalog.yml:

```yaml
data_local:
  type: pandas.CSVDataSet
  filepath: data/03_primary/data.csv

data_remote:
  type: pandas.CSVDataSet
  filepath: s3://my-bucket/data/03_primary/data.csv
  credentials: creds
```
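The creds key referenced above would live in a credentials file such as conf/local/credentials.yml. A minimal sketch with placeholder values (the key/secret field names follow fsspec's s3fs conventions, which Kedro forwards the credentials dict to; your storage may need different fields):

```yaml
# conf/local/credentials.yml -- placeholder values, never commit real keys
creds:
  key: YOUR_AWS_ACCESS_KEY_ID
  secret: YOUR_AWS_SECRET_ACCESS_KEY
```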
and add a node like so in pipeline.py:

```python
from kedro.pipeline import node, pipeline

def create_pipeline():
    return pipeline([
        ...,  # your pipeline start
        node(lambda df: (df, df), inputs="data_before_save",
             outputs=["data_local", "data_remote"]),
    ])
```
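The lambda simply returns its input twice, so Kedro saves the same object under both catalog entries. One practical note: a named function keeps the node picklable (lambdas cannot be pickled, which matters e.g. under ParallelRunner). A minimal framework-free sketch of the fan-out, with `duplicate` a hypothetical name:

```python
def duplicate(df):
    """Identity fan-out: return the input twice so a pipeline framework
    can persist it to two outputs (e.g. data_local and data_remote)."""
    return df, df

# Equivalent node: node(duplicate, inputs="data_before_save",
#                       outputs=["data_local", "data_remote"])
local_out, remote_out = duplicate([1, 2, 3])
assert local_out is remote_out  # same object, saved to two destinations
```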
Deepyaman Datta
01/27/2023, 1:49 PM
You can save data_local and data_remote in parallel by running the pipeline with --async (reads inputs or writes outputs for a single node in parallel, using threads).
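Conceptually, --async does something like the following for a node's outputs; this is a simplified standard-library stand-in (the real logic lives inside Kedro's runner), with `save_csv` a hypothetical stand-in for a dataset's save method:

```python
import csv
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

def save_csv(path, rows):
    """Stand-in for a dataset save(); I/O-bound, so threads overlap well."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return path

rows = [["a", "b"], [1, 2]]
tmp = Path(tempfile.mkdtemp())
targets = [tmp / "local.csv", tmp / "remote.csv"]  # imagine one is on S3

# Write both outputs concurrently instead of one after the other.
with ThreadPoolExecutor() as pool:
    saved = list(pool.map(lambda p: save_csv(p, rows), targets))

assert all(p.exists() for p in saved)
```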
If you do this a lot, your DAG starts looking a bit ugly, so it's possible to do this using hooks. The idea here would be that you don't care from a logic perspective that it's getting written to local and cloud storage, and that should be handled on the backend for specified nodes without making your DAG look different.
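The hook idea could hang off Kedro's after_dataset_saved hook spec, which receives at least the dataset name and the just-saved data (the exact signature varies across Kedro versions); in a real project the method would be decorated with @hook_impl from kedro.framework.hooks and the class registered in settings.py. A framework-free sketch, with `mirror_paths` a hypothetical mapping:

```python
import csv
import tempfile
from pathlib import Path

class MirrorOnSaveHook:
    """After selected datasets are saved, write the same data to a second
    location -- without the extra output ever appearing in the DAG."""

    def __init__(self, mirror_paths):
        # dataset name -> extra path to write to (hypothetical mapping)
        self.mirror_paths = mirror_paths

    def after_dataset_saved(self, dataset_name, data):
        """Mirrors the shape of Kedro's after_dataset_saved hook spec."""
        target = self.mirror_paths.get(dataset_name)
        if target is None:
            return  # dataset not selected for mirroring: no-op
        target = Path(target)
        target.parent.mkdir(parents=True, exist_ok=True)
        with open(target, "w", newline="") as f:
            csv.writer(f).writerows(data)

# Demo: only "data_local" is configured for mirroring.
mirror = Path(tempfile.mkdtemp()) / "mirror" / "data.csv"
hook = MirrorOnSaveHook({"data_local": mirror})
hook.after_dataset_saved("data_local", [["a", "b"], [1, 2]])
hook.after_dataset_saved("other_ds", [["ignored"]])  # not in the map
assert mirror.exists()
```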
Finally, if your requirement was a bit different (e.g. write everything to Azure blob storage bucket and replicate in another bucket), it's usually more efficient to do this outside of Kedro (i.e. run a process to copy the data after the pipeline runs).
Patrick Deutschmann
01/27/2023, 1:52 PM
Thanks, I'll look into --async and hooks as well.
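Deepyaman's last suggestion (replicate outside of Kedro once the run is done) can be as simple as a copy step after the pipeline finishes. A local stand-in sketch; with real cloud storage you would typically reach for the provider's tooling (e.g. aws s3 sync or azcopy) instead of shutil:

```python
import shutil
import tempfile
from pathlib import Path

def replicate(src_dir, dst_dir):
    """Copy every pipeline output from src_dir into dst_dir after the run."""
    shutil.copytree(src_dir, dst_dir, dirs_exist_ok=True)

# Demo with temp directories standing in for the two storage locations.
src = Path(tempfile.mkdtemp())
(src / "data.csv").write_text("a,b\n1,2\n")
dst = Path(tempfile.mkdtemp()) / "replica"
replicate(src, dst)
assert (dst / "data.csv").read_text() == "a,b\n1,2\n"
```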