Abhishek Bhatia — 06/13/2023, 10:21 AM
In the pipeline below, I have a node that returns a dictionary whose values are pandas DataFrames, so I define a `PartitionedDataSet` catalog entry for it. If I run the pipeline only up to this node, the files do get saved in the correct location, but the node's output is an empty dictionary. If I add an identity node after it, the correct key-value pairs are returned. Is this the desired behaviour?
datajoely — 06/13/2023, 10:36 AM
Step through the `def` with a debugger and it will be more intuitive. But essentially, `df` in this situation is a dictionary of key : lazy-loader pairs, so `actual_key_value_pair_part_ds_output` needs to be a `PartitionedDataSet` too to save like this, or more logic is required to handle it gracefully.

Abhishek Bhatia — 06/13/2023, 10:37 AM

datajoely — 06/13/2023, 10:40 AM
What does `a_node_that_creates_a_part_dataset` look like?
Abhishek Bhatia — 06/13/2023, 12:16 PM
```python
def a_node_that_creates_a_part_dataset(**kwargs):
    return {'key1': df1, 'key2': df2, 'key3': df3}
```
datajoely — 06/13/2023, 12:43 PM
Change the `def` and check with a debugger; you should expect `{'key1': df1.load(), 'key2': df2.load(), 'key3': df3.load()}` to be passed in.
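
The pattern datajoely describes can be simulated without Kedro: a downstream node that consumes a `PartitionedDataSet` receives a dictionary of partition-id to loader-callable pairs, and materializing the data means calling each loader. A minimal sketch of that idea (the function name and the lambda "loaders" here are illustrative stand-ins, not actual Kedro code):

```python
def materialize_partitions(partitions):
    """Resolve a dict of partition-id -> loader callables into real objects.

    In real Kedro, each callable would load e.g. a pandas DataFrame from disk;
    here plain lambdas stand in for those lazy loaders.
    """
    return {key: load() for key, load in partitions.items()}


# Simulated lazy loaders, standing in for what Kedro passes to a node:
fake_partitions = {
    "key1": lambda: "df1",
    "key2": lambda: "df2",
}

print(materialize_partitions(fake_partitions))
# {'key1': 'df1', 'key2': 'df2'}
```

This is essentially what the asker's "identity node" achieves: by routing the saved partitions back through the catalog, the lazy loaders are resolved into real key-value pairs.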