#questions

Serge Louvet

09/27/2023, 10:34 AM
Short question: I have a function producing a dictionary that is expected to be saved as a yaml file. I get the right results when I yaml.dump() the dictionary. This function is now a node function. Is it enough to add the following to the catalog?

micro_segmentation.reporting.tree_node_yaml:
  type: yaml.YAMLDataSet
  filepath: data/demo/08_reporting/micro_segmentation/tree_definition.yaml

This is how the function is defined in the pipeline:

node(
    func=convert_tree_node_table_to_yaml,
    inputs={"df": "micro_segmentation.reporting.tree_node_table"},
    outputs="micro_segmentation.reporting.tree_node_yaml",
    name="micro_segmentation.assessment.convert_tree_node_table_to_yaml",
    tags=["micro_segmentation.assessment"],
)

datajoely

09/27/2023, 10:35 AM
What does the dictionary returned by the function look like? Can you provide a stack trace?

Serge Louvet

09/27/2023, 10:40 AM
Note that when I yaml.dump() the dict returned by the function, I get the expected yaml file. I can run the function but cannot run the pipeline from here. I can share the expected dict used in the test case.
expected_dict.txt

Nok Lam Chan

09/27/2023, 11:23 AM
What’s the error/problem?

datajoely

09/27/2023, 12:04 PM
"interval": "(27.5,inf)", this is the issue: inf is not JSON serialisable

Serge Louvet

09/27/2023, 12:17 PM
No error(s). Context: I have a function generating a dictionary. When I yaml.dump() that dictionary, I get the yaml file I need. The function is now a node function in a pipeline, as follows:

    func=convert_tree_node_table_to_yaml,
    inputs={"df": "micro_segmentation.reporting.tree_node_table"},
    outputs="micro_segmentation.reporting.tree_node_yaml",
    name="micro_segmentation.assessment.convert_tree_node_table_to_yaml",

I've added the following in the catalog:

micro_segmentation.reporting.tree_node_yaml:
  type: yaml.YAMLDataSet
  filepath: data/demo/08_reporting/micro_segmentation/tree_definition.yaml

The question: is this enough to get the dictionary saved as a yaml file?
The value of "interval" is just a string.
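One way to check the point about the string quickly is with the standard library alone. The sketch below uses json only because JSON was the format named above; it is not what the YAML dataset does, and the helper name is_strictly_json_serialisable is made up for illustration. It shows that the text "(27.5,inf)" is an ordinary string, while an actual float infinity is rejected once strict JSON is requested.

```python
import json

# The value from the thread: "inf" appears only inside a string.
record = {"interval": "(27.5,inf)"}
as_text = json.dumps(record)


def is_strictly_json_serialisable(value):
    """Return True if json.dumps accepts the value in strict mode.

    allow_nan=False makes json.dumps reject inf/nan floats instead of
    emitting the non-standard tokens Infinity/NaN.
    """
    try:
        json.dumps(value, allow_nan=False)
        return True
    except (TypeError, ValueError):
        return False


string_ok = is_strictly_json_serialisable(record)
float_ok = is_strictly_json_serialisable({"upper": float("inf")})
```

Here string_ok is True and float_ok is False, so only a real float("inf") would be a concern, and only for a strict JSON writer.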

datajoely

09/27/2023, 12:19 PM
we’re doing the same thing behind the scenes, the only difference is abstracting the filesystem so it works on cloud storage
in a notebook you can import the YAML class directly and try getting it to work
do you get an error?
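A minimal sketch of that notebook check, assuming the kedro-datasets package is installed. The class was spelled YAMLDataSet around the time of this thread and YAMLDataset in later releases, so the import tries both; the helper name check_yaml_roundtrip is hypothetical and not part of Kedro.

```python
def check_yaml_roundtrip(data, filepath):
    """Save a dict with Kedro's YAML dataset and load it back.

    The import is done inside the function so the helper can be defined
    even where kedro-datasets is missing; calling it then raises ImportError.
    """
    try:
        # Spelling used by the catalog entry in this thread.
        from kedro_datasets.yaml import YAMLDataSet as YamlDataset
    except ImportError:
        # Assumption: newer kedro-datasets releases use this spelling.
        from kedro_datasets.yaml import YAMLDataset as YamlDataset

    dataset = YamlDataset(filepath=filepath)
    dataset.save(data)     # what the pipeline does with the node output
    return dataset.load()  # read the file back for comparison
```

Calling check_yaml_roundtrip(my_dict, "tree_definition.yaml") with the dict returned by the node function should give back an equal dict if the catalog entry is going to work; any serialisation error would surface here rather than in the pipeline run.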

Serge Louvet

09/27/2023, 12:21 PM
Ok. Good. So this should work. I cannot run the full pipeline here.

Serge Louvet

09/27/2023, 12:21 PM
Thanks

datajoely

09/27/2023, 12:21 PM
we accept a Python dict and pass it to yaml.dump