Olivia Lihn
02/13/2023, 1:49 PM
def before_pipeline_run(self, run_params, catalog: DataCatalog) -> None:
    """Change feature inclusion parameters for
    scoring pipeline
    """
    if run_params["pipeline_name"] == "scoring":
        # retrieve feature_list from catalog
        feature_list_df = catalog.load("modeling.feature_selection_report")
        feature_list = list(feature_list_df[feature_list_df.selected == True].feature.unique())
        # get list of feature engineering pipelines
        params = catalog.load("parameters")
        feateng_pipes = [fteng_name for fteng_name in params.keys() if fteng_name.endswith("_fteng")]
        # overwrite parameters
        for pipeline in feateng_pipes:
            catalog.add_all(
                {f"params:{pipeline}.feature_inclusion_params.feature_list": feature_list,
                 f"params:{pipeline}.feature_inclusion_params.enable_regex": True},
                replace=True
            )
I also tried using run_params["params"] without any luck, and tried returning the catalog, but that didn't work either. The hook runs (tested with print statements), so my guess is I'm missing something. Thanks!
marrrcin
02/13/2023, 2:06 PM
catalog.add / catalog.add_all only add AbstractDataSet entries to the catalog, without saving them. My guess is that you should have something like this:
catalog.add_all(
    {f"params:{pipeline}.feature_inclusion_params.feature_list": MemoryDataSet(feature_list),
     f"params:{pipeline}.feature_inclusion_params.enable_regex": MemoryDataSet(True)},
    replace=True
)
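A minimal sketch of why the wrapping matters, using plain-Python stand-ins rather than Kedro itself. `ToyMemoryDataSet` and `ToyCatalog` are hypothetical classes that only mimic the idea that a catalog entry has to expose `load()`; they are not Kedro's implementation, and the key and feature names are made up.

```python
from typing import Any, Dict


class ToyMemoryDataSet:
    """Hypothetical stand-in for an in-memory dataset: holds a value, returns it on load()."""

    def __init__(self, data: Any) -> None:
        self._data = data

    def load(self) -> Any:
        return self._data


class ToyCatalog:
    """Hypothetical stand-in for a catalog: maps names to dataset-like entries."""

    def __init__(self) -> None:
        self._entries: Dict[str, Any] = {}

    def add_all(self, entries: Dict[str, Any], replace: bool = False) -> None:
        for name, entry in entries.items():
            if name in self._entries and not replace:
                raise ValueError(f"'{name}' is already registered")
            # Entries are stored as-is; nothing checks that they can be loaded.
            self._entries[name] = entry

    def load(self, name: str) -> Any:
        # Loading delegates to the entry, so the entry must provide load().
        return self._entries[name].load()


catalog = ToyCatalog()
key = "params:example_fteng.feature_inclusion_params.feature_list"  # illustrative key

# A raw list is accepted at registration time but cannot be loaded later.
catalog.add_all({key: ["age", "income"]}, replace=True)
try:
    catalog.load(key)
    raw_list_loaded = True
except AttributeError:
    raw_list_loaded = False

# Wrapping the value in a dataset-like object makes it loadable.
catalog.add_all({key: ToyMemoryDataSet(["age", "income"])}, replace=True)
wrapped_value = catalog.load(key)
```

In this toy version the raw list only fails when something tries to load it, which would be consistent with a hook that runs without complaint while the override never takes effect.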
Olivia Lihn
02/13/2023, 2:06 PM
marrrcin
02/13/2023, 2:08 PM
catalog.add_feed_dict instead <-- this is what Kedro actually does to add parameters to the catalog
Olivia Lihn
02/13/2023, 2:08 PM
marrrcin
02/13/2023, 2:11 PM
Olivia Lihn
02/13/2023, 2:12 PM
marrrcin
02/13/2023, 2:17 PM
Do your nodes use parameters or your specific keys, e.g. f"params:{pipeline}.feature_inclusion_params.feature_list"?
Overwrite both parameters and the specific keys if you want to use both in Kedro nodes, or just stick to one type.
Kedro adds a catalog entry for the parameters key as well as for every nested object, with a params: prefix. If you overwrite only params:your_long_key in the hook, then only nodes consuming the input in the form of params:your_long_key will get the modified value. If you use parameters in your nodes, then you need to overwrite the whole parameters dict in your hook.
Olivia Lihn
02/13/2023, 2:26 PM
(params:<pipeline_name>) Thanks so much!
Nok Lam Chan
02/13/2023, 2:39 PM
datajoely
02/13/2023, 4:47 PM
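The parameters versus params: behaviour described in this thread can be sketched in plain Python. `build_feed_dict` is a hypothetical helper that mimics how nested parameters end up under both a `parameters` key and `params:`-prefixed keys. It is an illustration under that assumption, not Kedro's code, and the parameter names are made up.

```python
from typing import Any, Dict


def build_feed_dict(params: Dict[str, Any]) -> Dict[str, Any]:
    """Expose params under 'parameters' and under one 'params:' key per nested object."""
    feed_dict: Dict[str, Any] = {"parameters": params}

    def add_param(name: str, value: Any) -> None:
        feed_dict[f"params:{name}"] = value
        if isinstance(value, dict):
            # Recurse so every nested level gets its own dotted key.
            for child_name, child_value in value.items():
                add_param(f"{name}.{child_name}", child_value)

    for name, value in params.items():
        add_param(name, value)
    return feed_dict


params = {
    "example_fteng": {
        "feature_inclusion_params": {"feature_list": [], "enable_regex": False}
    }
}
feed_dict = build_feed_dict(params)

# Overwrite only the most specific key, as the hook in the question does.
leaf_key = "params:example_fteng.feature_inclusion_params.feature_list"
feed_dict[leaf_key] = ["age", "income"]

# Nodes reading the leaf key see the new value ...
leaf_value = feed_dict[leaf_key]
# ... but nodes reading a parent key or the whole 'parameters' dict still see the old one.
parent_value = feed_dict["params:example_fteng"]["feature_inclusion_params"]["feature_list"]
whole_value = feed_dict["parameters"]["example_fteng"]["feature_inclusion_params"]["feature_list"]
```

Under these assumptions, a node that consumes a parent key such as params:example_fteng only picks up the change if that key is overwritten as well.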