# questions
Hi again everyone, I've another quick question. I've a hook that runs before a specific node. This hook will check the data in one of the previous steps and determine what the correct value of a certain param should be for the upcoming node. To begin with, the new params are defaulted as empty in the `conf/base/parameters.yaml` file:
param_1: foo
param_2: bar 
my_nice_new_params:
If I have the following toy code block:
@hook_impl
def before_node_run(..args.., catalog: DataCatalog, ..more_args..):
    # Value held in the catalog before the update
    print(catalog._get_dataset('params:my_nice_new_params'))
    new_param1, new_param2 = run_some_super_cool_logic()
    # Overwrite the parameter entry in the catalog
    catalog.add_feed_dict(
        {'params:my_nice_new_params': [new_param1, new_param2]},
        replace=True,
    )
    # Value held in the catalog after the update
    print(catalog._get_dataset('params:my_nice_new_params'))
Then the printed stdout will be something like
MemoryDataSet(data=<NoneType>)
MemoryDataSet(data=<list>)
which is what I would have hoped for. However, when the node itself is run and accesses `'params:my_nice_new_params'`, the original `None` value remains. Is there a step that I'm missing that saves the most recent instance of the catalog?
The right way to do this would be to not try to change the parameters in the catalog, but instead have the `before_node_run` hook return a dictionary: `{"params:my_nice_new_params": [new_param1, new_param2]}`. Kedro uses the returned dictionary to overwrite the matching node inputs.
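A rough sketch of what that could look like is below. The likely reason the catalog update has no effect is that the node's inputs have already been loaded by the time the hook fires, so only the returned dictionary reaches the node. The node name `my_target_node`, the class name `OverrideParamsHooks` and the body of `run_some_super_cool_logic` are all made up for illustration, and the `try`/`except` around the import is only there so the snippet can be tried without Kedro installed.

```python
from types import SimpleNamespace
from typing import Any, Dict, Optional

try:
    from kedro.framework.hooks import hook_impl
except ImportError:  # fallback so the sketch also runs without Kedro installed
    def hook_impl(func):
        return func

PARAM_KEY = "params:my_nice_new_params"
TARGET_NODE = "my_target_node"  # hypothetical name of the node to patch


def run_some_super_cool_logic(inputs: Dict[str, Any]):
    # Hypothetical stand-in for the real logic from the question
    return "value_1", "value_2"


class OverrideParamsHooks:
    @hook_impl
    def before_node_run(
        self, node: Any, inputs: Dict[str, Any]
    ) -> Optional[Dict[str, Any]]:
        # Leave every other node untouched
        if node.name != TARGET_NODE:
            return None
        new_param1, new_param2 = run_some_super_cool_logic(inputs)
        # Returning a dict asks Kedro to overwrite these node inputs
        return {PARAM_KEY: [new_param1, new_param2]}


# Quick check outside Kedro, using a stand-in object for the node
fake_node = SimpleNamespace(name=TARGET_NODE)
result = OverrideParamsHooks().before_node_run(node=fake_node, inputs={})
other = OverrideParamsHooks().before_node_run(
    node=SimpleNamespace(name="some_other_node"), inputs={}
)
```

Hook implementations only need to declare the arguments they actually use, which is why `catalog` is left out here. The hooks class would still need to be registered in the project's `settings.py` as usual.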
With that said, this feels like a bit of a weird thing to do: if the value of something depends on the output of a previous node, why not make a dataset for that node output and then use it as a node input to the subsequent node? You can use e.g. `JSONDataSet` for this sort of thing if you want to persist it to disk.
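For comparison, the dataset-based alternative might be wired up roughly like this. Everything here is a hypothetical sketch: the function names, the dataset names (`previous_step_output`, `my_nice_new_params`, `final_output`) and the min/max logic are placeholders, not anything from the original pipeline. The Kedro import is done inside `create_pipeline` purely so the two node functions can be tried on their own.

```python
from typing import List


def compute_new_params(data: List[float]) -> List[float]:
    # Hypothetical logic: derive the two values from the previous step's output
    return [min(data), max(data)]


def use_new_params(data: List[float], new_params: List[float]) -> List[float]:
    # Hypothetical downstream node: receives the derived values as a normal input
    lowest, _highest = new_params
    return [x - lowest for x in data]


def create_pipeline(**kwargs):
    # Imported here so the node functions above stay usable without Kedro
    from kedro.pipeline import node, pipeline

    return pipeline(
        [
            node(
                compute_new_params,
                inputs="previous_step_output",
                outputs="my_nice_new_params",
                name="compute_new_params",
            ),
            node(
                use_new_params,
                inputs=["previous_step_output", "my_nice_new_params"],
                outputs="final_output",
                name="use_new_params",
            ),
        ]
    )


# Quick check of the node functions on their own
params = compute_new_params([3.0, 1.0, 2.0])
shifted = use_new_params([3.0, 1.0, 2.0], params)
```

If `my_nice_new_params` has no catalog entry it is just passed between the nodes in memory; adding an entry for it with a JSON dataset type would persist it to disk instead.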