#questions

Toni

10/25/2022, 10:35 AM
Hi community! I was wondering if I can save the output of a single node in two different formats. For instance:
node(
    func=some_function,
    inputs="some_input",
    outputs="the_output",
    name="node",
),

the_output:
  type: pandas.CSVDataSet
  filepath: data/output_csv.csv

the_output:
  type: pandas.ParquetDataSet
  filepath: data/output_parquet.parquet

Toni

10/25/2022, 10:39 AM
Thanks @Hamza. As I understand it, I have to explicitly indicate the format of the output, so it will not be saved in both formats by default. Am I correct?
node(
    func=some_function,
    inputs="some_input",
    # pseudocode: one name or the other, not both
    outputs="the_output@csv" or "the_output@parquet",
    name="node",
),
I would like to save the same pandas dataframe into an SQL database (using SQLTableDataSet) and into an S3 bucket (using CSVDataSet), without having to create additional nodes.

FlorianGD

10/25/2022, 11:57 AM
You can change your node to output a tuple containing the same dataframe twice, and define two catalog entries under two different names, one per output.
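A minimal sketch of this suggestion, assuming a hypothetical helper named emit_twice (not part of Kedro) that wraps the node function so it returns its result twice. The output names the_output_csv and the_output_parquet are illustrative, and each would need its own catalog entry. The Kedro wiring is left in comments so the snippet runs without Kedro installed.

```python
from functools import wraps


def emit_twice(func):
    """Wrap func so it returns its result twice, as a 2-tuple (hypothetical helper)."""

    @wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        # The same object is returned under two output names; each name can
        # then be saved through its own catalog entry.
        return result, result

    return wrapper


# Hypothetical wiring in the pipeline definition (requires kedro; names are illustrative):
# node(
#     func=emit_twice(some_function),
#     inputs="some_input",
#     outputs=["the_output_csv", "the_output_parquet"],
#     name="node",
# )
```

With this wiring, the_output_csv could point to a CSV dataset entry and the_output_parquet to a Parquet (or SQL table) entry, without adding a second node.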

Toni

10/25/2022, 12:00 PM
Thank you @FlorianGD. This was the workaround I had in mind. The other option would be to create a node that simply outputs a copy of the input dataframe and whose output points to another entry of the catalog (for instance, using output and output@parquet).
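A rough sketch of the copy-node alternative, assuming a hypothetical function copy_output and illustrative dataset names (the_output, the_output_parquet). It uses copy.copy so it works on any copyable object, not only dataframes. The Kedro wiring is shown in comments only.

```python
import copy


def copy_output(data):
    """Return a copy of the input so a second catalog entry can save it (hypothetical helper)."""
    return copy.copy(data)


# Hypothetical wiring in the pipeline definition (requires kedro; names are illustrative):
# node(
#     func=copy_output,
#     inputs="the_output",
#     outputs="the_output_parquet",
#     name="copy_to_parquet",
# )
```

Compared with the tuple approach, this costs one extra node per additional format but leaves the original node function untouched.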