Afiq Johari 10/25/2023, 10:55 AM
Unfortunately, recreating the data transformation of the stored procedure in Python may not be straightforward. That's why I depend on the stored procedure to transform/update my data.
node(func=regenerate, inputs="mydata_sql", outputs="mydata_excel")

def regenerate(mydata):
    # run SQL stored procedure that impacts the table of mydata_sql in the database
    # reload mydata because the stored procedure will have changed mydata
    return mydata  # convert it to excel file
Dmitry Sorokin 10/25/2023, 1:07 PM
Afiq Johari 10/25/2023, 3:06 PM
Nok Lam Chan 10/25/2023, 7:05 PM
but you need to make sure the store_proc gets executed?
and does it take any
? If not - I think
is a good candidate https://docs.kedro.org/en/stable/kedro.framework.hooks.specs.DatasetSpecs.html#kedro.framework.hooks.specs.DatasetSpecs.before_dataset_loaded If yes - it’s a bit tricky because it’s not pure I/O, but it’s also not processing logic: the real compute happens in the database, and your code only triggers the SP and loads the data (which is more the responsibility of a dataset). You will most likely need some “dummy input/output” in that case to make sure the dependency is correct.
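For reference, a minimal sketch of the hook route, assuming the SP takes no inputs. The class name `StoredProcedureHook` and the `run_procedure` callable are hypothetical; only `hook_impl` and the `before_dataset_loaded` hook name come from Kedro. The import fallback is there only so the snippet runs without Kedro installed.

```python
from typing import Callable

try:
    from kedro.framework.hooks import hook_impl
except ImportError:
    # Fallback so this sketch still runs where Kedro is not installed.
    def hook_impl(func):
        return func


class StoredProcedureHook:
    """Run a stored procedure right before one specific dataset is loaded."""

    def __init__(self, dataset_name: str, run_procedure: Callable[[], None]):
        # Both arguments are hypothetical: the catalog entry to watch, and a
        # zero-argument callable that executes the SP in the database.
        self._dataset_name = dataset_name
        self._run_procedure = run_procedure

    @hook_impl
    def before_dataset_loaded(self, dataset_name: str) -> None:
        # Kedro calls this hook for every dataset load; act only on ours.
        if dataset_name == self._dataset_name:
            self._run_procedure()
```

To try it, the hook instance would go into `HOOKS` in the project's `settings.py`, e.g. `HOOKS = (StoredProcedureHook("mydata_sql", run_my_sp),)`, where `run_my_sp` is whatever function actually calls the SP. That way the node itself stays a plain load-and-return function.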
Afiq Johari 10/26/2023, 2:40 AM
are actually for the
(SP) At the moment, there's no plan to migrate the SP to Python, which is why we rely on the SP. The latest code iteration I have reloads the data within the node. Added this to
credentials = config_loader.get("credentials.yml")
catalogs = config_loader.get("catalog.yml")
thedata = DataCatalog.from_config(catalogs, credentials)
def regenerate(mydata):
    # run SP
    # reload sql data after SP execution
    mydata = thedata.load("mydata_sql")
    return mydata
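To illustrate why the reload matters, here is a small self-contained sketch of the same pattern. It uses an in-memory SQLite database as a stand-in for the real one. SQLite has no stored procedures, so a plain `UPDATE` plays the role of the SP. The table layout, the `amount` column and the `run_stored_procedure` helper are all made up for the example.

```python
import sqlite3


def run_stored_procedure(conn: sqlite3.Connection) -> None:
    # Stand-in for the real SP call: it changes the table inside the database.
    conn.execute("UPDATE mydata SET amount = amount * 2")
    conn.commit()


def regenerate(conn: sqlite3.Connection, mydata: list) -> list:
    # `mydata` was loaded before the SP ran, so it is stale by the time we
    # return; it is kept as an input only to preserve the node dependency.
    run_stored_procedure(conn)
    # Reload after the SP so the returned rows reflect its changes.
    return conn.execute("SELECT id, amount FROM mydata ORDER BY id").fetchall()


# Hypothetical table and rows, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mydata (id INTEGER PRIMARY KEY, amount INTEGER)")
conn.executemany("INSERT INTO mydata VALUES (?, ?)", [(1, 10), (2, 20)])
conn.commit()

stale = conn.execute("SELECT id, amount FROM mydata ORDER BY id").fetchall()
fresh = regenerate(conn, stale)
```

Here `stale` still holds the values from before the update, while `fresh` holds the values after it. This is the same reason the node above calls `thedata.load("mydata_sql")` again instead of returning its input.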