# questions
Hi everyone! If I want to use Delta tables for update, delete, or merge, should I do that inside the node? Or is there something I can use for this goal with catalog entries only?
Hey @MarioFeynman, we actually have some docs on this: https://kedro.readthedocs.io/en/stable/tools_integration/pyspark.html#spark-and-delta-lake-interaction The UD of CRUD doesn’t really fit neatly into Kedro’s workflow, but in those docs we provide guidance on how to achieve this in the most kedrific way we could think of 🙂
Yes, I read almost everything there, and took a deep dive into the PR process and discussion hahaha. My problem is that I need to provide an abstraction layer that keeps a pandas Kedro pipeline while still being able to read, write, and upsert to Delta... so I think I will write a custom dataset for that, knowing all the pros/cons... I need to skip the logic-in-node proposal
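For illustration, the core upsert step such a custom dataset would perform can be reduced to a keyed merge at the pandas level — a minimal sketch with a hypothetical `upsert` helper (not part of Kedro or delta-rs; the actual Delta MERGE would be issued by the dataset's save logic):

```python
import pandas as pd

def upsert(existing: pd.DataFrame, updates: pd.DataFrame, key: str) -> pd.DataFrame:
    """Pandas-level upsert: rows in `existing` whose `key` appears in
    `updates` are replaced; unmatched rows from `updates` are appended."""
    kept = existing[~existing[key].isin(updates[key])]
    return pd.concat([kept, updates], ignore_index=True)

old = pd.DataFrame({"id": [1, 2], "value": ["a", "b"]})
new = pd.DataFrame({"id": [2, 3], "value": ["B", "c"]})
merged = upsert(old, new, key="id").sort_values("id").reset_index(drop=True)
# id 1 keeps "a", id 2 is updated to "B", id 3 "c" is inserted
```

In a custom dataset this logic would sit behind `_save()`, so pipeline nodes stay pure pandas and never see the Delta-specific merge semantics.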