Kedro - How to update a dataset in a Kedro pipeline given that a dataset cannot be both input and output of a node (only DAG)?
In a Kedro project, I have a dataset in catalog.yml that I need to extend by appending a few rows each time I run my pipeline.
# catalog.yml
my_main_dataset:
  type: pandas.SQLTableDataSet
  credentials: postgrey_credentials
  save_args:
    if_exists: append
  table_name: my_dataset_name
However, I cannot rely solely on append in my catalog parameters: to avoid duplicates, I need to make sure I do not insert dates that already exist in my dataset.
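As an illustration of the duplicate check described above, a minimal pandas sketch could look like this. The function name `select_new_rows` and the column name `date` are assumptions made for the example, not names from the actual project, and the sketch only covers the filtering step, not how the existing rows are loaded in Kedro.

```python
import pandas as pd


def select_new_rows(
    existing: pd.DataFrame,
    incoming: pd.DataFrame,
    date_col: str = "date",  # hypothetical column name
) -> pd.DataFrame:
    """Return only the rows of `incoming` whose date is not already in `existing`."""
    # Normalise both columns to datetimes so string and datetime values compare equal.
    existing_dates = pd.to_datetime(existing[date_col])
    incoming_dates = pd.to_datetime(incoming[date_col])

    # Keep the incoming rows whose date does not appear in the existing data.
    is_new = ~incoming_dates.isin(existing_dates)
    return incoming.loc[is_new].reset_index(drop=True)
```

Only the rows returned by such a function would then be safe to write with `if_exists: append`.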
I also cannot create a node taking my...