# questions
c
Hi team, I'm using the new `ManagedTableDataSet` with Databricks Unity Catalog and I didn't find a way to store tables on an external location (ABFS on Azure). There is a way of storing an external table with pure Spark:
`df.write.mode(mode).option("path", table_path).saveAsTable(f"{catalog_name}.{schema_name}.{table_name}")`
where `table_path` is the path to the external location, like `abfss://container@storage_account.dfs.core.windows.net/raw`. Is there a way to pass this path to the `ManagedTableDataSet` when saving the data? Or should I go and create a `CustomManagedTableDataSet` with this capability?
👍🏻 1
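For reference, a minimal pure-Spark sketch of the external-table write described above; the catalog, schema and table names and the ABFS path are placeholders, not values from the actual project:

```python
# Minimal sketch of the pure-Spark external-table write described above.
# All names and the ABFS location below are placeholder assumptions.
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

catalog_name = "my_catalog"   # assumed Unity Catalog name
schema_name = "my_schema"     # assumed schema
table_name = "my_table"       # assumed table
table_path = "abfss://container@storage_account.dfs.core.windows.net/raw/my_table"

df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])

# Writing with an explicit "path" option registers the table as EXTERNAL,
# so the data lives in the ABFS location instead of managed storage.
(
    df.write.format("delta")
    .mode("overwrite")
    .option("path", table_path)
    .saveAsTable(f"{catalog_name}.{schema_name}.{table_name}")
)
```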
Another way would be to create the tables before the pipeline runs.
I could solve it by creating a `ManagedExternalTableDataSet` that overrides the save function with something like `data.write.format("delta").option("path", self.table_path)`.
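A rough sketch of what such a dataset could look like as a standalone custom Kedro dataset; the class name, constructor parameters and the use of `AbstractDataSet` are assumptions for illustration, not necessarily how `ManagedExternalTableDataSet` is implemented:

```python
# Illustrative sketch only: a standalone Kedro dataset that writes an external
# Delta table. The real ManagedExternalTableDataSet may instead subclass
# ManagedTableDataSet; all names and parameters here are assumptions.
from typing import Any, Dict

from kedro.io import AbstractDataSet
from pyspark.sql import DataFrame, SparkSession


class ExternalDeltaTableDataSet(AbstractDataSet):
    def __init__(self, table: str, table_path: str, write_mode: str = "overwrite"):
        self._table = table            # e.g. "catalog.schema.table"
        self._table_path = table_path  # e.g. "abfss://container@account.dfs.core.windows.net/raw"
        self._write_mode = write_mode

    def _save(self, data: DataFrame) -> None:
        # Passing "path" makes the table external, as in the pure-Spark snippet above.
        (
            data.write.format("delta")
            .mode(self._write_mode)
            .option("path", self._table_path)
            .saveAsTable(self._table)
        )

    def _load(self) -> DataFrame:
        return SparkSession.builder.getOrCreate().table(self._table)

    def _describe(self) -> Dict[str, Any]:
        return {"table": self._table, "table_path": self._table_path}
```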
n
It sounds like this is quite common; do you think you could open a PR to add this to the dataset?
c
Yeah sure, I'll keep you posted!
I created the PR as a draft: https://github.com/kedro-org/kedro-plugins/pull/251 . I still need to test whether it works properly on Databricks.
👍🏻 1