# questions
a
Hi Team, I have a basic doubt about using `PartitionedDataSet`. In the below pipeline, I have a node which returns a dictionary with values as pandas dataframes, so I define a `PartitionedDataSet` catalog entry for it. If I run the pipeline only up to this node, the files do get saved in the correct location, but the output is an empty dictionary. If I add an identity node, then the correct key-value pairs are returned. Is this the desired behaviour?
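For context, here is a minimal sketch of the setup being described, reusing the node and dataset names that appear later in the thread; the catalog path, underlying dataset type, and DataFrame contents are assumptions for illustration:
```python
import pandas as pd
from kedro.pipeline import Pipeline, node


def a_node_that_creates_a_part_dataset(**kwargs):
    # Placeholder DataFrames; in the real pipeline these come from upstream processing.
    df1 = pd.DataFrame({"x": [1, 2]})
    df2 = pd.DataFrame({"x": [3, 4]})
    df3 = pd.DataFrame({"x": [5, 6]})
    # A plain dict of DataFrames: Kedro writes one partition file per key.
    return {"key1": df1, "key2": df2, "key3": df3}


# Hypothetical catalog.yml entry backing the node output
# (path and dataset type are placeholders):
#
#   actual_key_value_pair_part_ds_output:
#     type: PartitionedDataSet
#     path: data/07_model_output/partitions
#     dataset: pandas.CSVDataSet
#     filename_suffix: ".csv"

pipeline = Pipeline(
    [
        node(
            a_node_that_creates_a_part_dataset,
            inputs=None,
            outputs="actual_key_value_pair_part_ds_output",
        ),
    ]
)
```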
d
Doing a lambda here is a bit confusing; if you do a `def` with a debugger it will be more intuitive. But essentially `df` in this situation is a dictionary of key: lazy-loader pairs, and `actual_key_value_pair_part_ds_output` needs to be a `PartitionedDataSet` too to save like this, or more logic is required to handle it in a graceful way.
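To make the "more logic" option concrete, a downstream node can materialise the partitions itself. This is a sketch that assumes, as in the Kedro documentation, that each value of the incoming dictionary is a no-argument callable which loads one partition; the function name is made up:
```python
from typing import Callable, Dict

import pandas as pd


def combine_partitions(partitioned_input: Dict[str, Callable[[], pd.DataFrame]]) -> pd.DataFrame:
    """Materialise every partition of a PartitionedDataSet input and concatenate them."""
    loaded = {}
    for partition_id, load_partition in partitioned_input.items():
        # Each value is a lazy loader; calling it reads that partition from disk.
        loaded[partition_id] = load_partition()
    return pd.concat(loaded.values(), ignore_index=True)
```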
a
@datajoely I am not actually using the 2nd node. It is just that the first node, although it writes the partitioned dataset correctly, itself returns an empty dictionary.
Just need to confirm whether this is the expected behaviour, i.e. that the first node does not return a dictionary of lazy loaders either.
d
Can I see the function `a_node_that_creates_a_part_dataset`?
a
Basically looks like this
def a_node_that_creates_a_part_dataset(**kwargs):
    return {'key1': df1, 'key2': df2, 'key3': df3}
the outputs can have any number of keys
d
Yes, this doesn't look right. I would change the second function to a `def` and check with a debugger. I would expect `{'key1': df1.load(), 'key2': df2.load(), 'key3': df3.load()}` to be passed in.
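As a sketch of that suggestion, replacing the lambda with a `def` makes it easy to pause in a debugger and confirm what Kedro actually passes in for a `PartitionedDataSet` input; the function name is made up, and the exact shape of the values (lazy loaders) is what the breakpoint is there to verify:
```python
def debug_partitioned_input(df):
    # Pause here and inspect `df`: for a PartitionedDataSet input it should be a
    # dictionary keyed by partition id, with lazy loaders as values.
    breakpoint()
    # Return the mapping unchanged once the structure has been confirmed.
    return dict(df)
```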
👍🏼 1