#questions

Camilo López

06/23/2023, 1:56 AM
Hi team, I'm using the new `ManagedTableDataSet` with Databricks Unity Catalog and I haven't found a way to store tables in an external location (ABFS on Azure). There's a way to store an external table with pure Spark: `df.write.mode(mode).option("path", table_path).saveAsTable(f"{catalog_name}.{schema_name}.{table_name}")`, where `table_path` is the path to the external location, like `abfss://container@storage_account.dfs.core.windows.net/raw`. Is there a way to pass this path to the `ManagedTableDataSet` when saving the data? Or should I go and create a `CustomManagedTableDataSet` with this capability?
👍🏼 1
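For readers following along, here is a minimal runnable sketch of the pure-Spark approach Camilo describes. The catalog, schema, table, and storage-account names are hypothetical placeholders, and the `abfss://` path assumes a Unity Catalog external location has already been configured for that container:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Hypothetical names; replace with your own catalog/schema/table.
catalog_name = "my_catalog"
schema_name = "my_schema"
table_name = "my_table"
table_path = "abfss://container@storage_account.dfs.core.windows.net/raw"

df = spark.createDataFrame([(1, "a"), (2, "b")], ["id", "value"])

# Supplying the "path" option makes the table *external*: Unity Catalog
# keeps only the metadata, while the Delta files live at table_path.
(
    df.write.mode("overwrite")
    .format("delta")
    .option("path", table_path)
    .saveAsTable(f"{catalog_name}.{schema_name}.{table_name}")
)
```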
Another way would be to create the tables before the pipeline runs.
I could solve it by creating a `ManagedExternalTableDataSet` that overrides the save function with something like `data.write.format("delta").option("path", self.table_path)`.
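For illustration, here is a minimal sketch of the kind of `ManagedExternalTableDataSet` Camilo describes, built on Kedro's public `AbstractDataSet` interface rather than on `ManagedTableDataSet`'s internals (which may differ); the constructor parameters and attribute names below are assumptions, not the API that eventually landed in the PR:

```python
from kedro.io import AbstractDataSet
from pyspark.sql import DataFrame, SparkSession


class ManagedExternalTableDataSet(AbstractDataSet):
    """Saves a Spark DataFrame as an *external* Delta table: the table is
    registered in Unity Catalog, but the files live at ``table_path``."""

    def __init__(self, catalog: str, schema: str, table: str,
                 table_path: str, write_mode: str = "overwrite"):
        self._full_name = f"{catalog}.{schema}.{table}"
        self._table_path = table_path  # e.g. an abfss:// URI
        self._write_mode = write_mode

    def _load(self) -> DataFrame:
        # Unity Catalog resolves the external location transparently on read.
        spark = SparkSession.builder.getOrCreate()
        return spark.table(self._full_name)

    def _save(self, data: DataFrame) -> None:
        # The "path" option is what makes the table external, not managed.
        (
            data.write.format("delta")
            .mode(self._write_mode)
            .option("path", self._table_path)
            .saveAsTable(self._full_name)
        )

    def _describe(self) -> dict:
        return {"table": self._full_name, "path": self._table_path}
```

Such a dataset could then be declared in `catalog.yml` via its full import path (e.g. `type: my_project.datasets.ManagedExternalTableDataSet`, a hypothetical module), with `table_path` pointing at the external location.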

Nok Lam Chan

06/23/2023, 1:21 PM
It sounds like this is quite common. Do you think you could open a PR to add this to the dataset?

Camilo López

06/23/2023, 4:01 PM
Yeah sure, I'll keep you posted!
I created the PR as a draft: https://github.com/kedro-org/kedro-plugins/pull/251. I still need to test that it works properly on Databricks.
👍🏼 1