#questions

Nikos Kaltsas

02/15/2023, 12:11 AM
Hello, does anyone have a guide / example for running Kedro pipelines on Databricks with dbx?
❤️ 1

datajoely

02/15/2023, 9:56 AM
the docs are currently being revised after a full research round: https://github.com/kedro-org/kedro/issues/2105

Toni - TomTom - Madrid

02/15/2023, 12:19 PM
Very interesting! We are facing the same challenges on Databricks. By the way, have you tackled how to input secrets from Azure Key Vault or DB secrets?

datajoely

02/15/2023, 12:32 PM
environment variables are often the right solution to this
there are ways to do this today, but it will get much easier in the next minor release of Kedro
❤️ 1

Jannic Holzer

02/15/2023, 1:58 PM
Hey Nikos and Toni, we're defining a best-practices way of working with Kedro on Databricks (with a focus on using dbx) under this ticket 😃 we'll have a guide very soon after that
❤️ 1

Nikos Kaltsas

02/15/2023, 3:14 PM
Thanks all - having a look through docs 🙏

Toni - TomTom - Madrid

02/15/2023, 5:02 PM
thanks a lot! sharing this with my team 🙂

Mathilde Lavacquery

02/17/2023, 2:06 PM
Hi @datajoely, I am also interested in the current ways of inputting Azure Key Vault or DB secrets. Would it be possible to share the examples?
❤️ 1

Toni - TomTom - Madrid

02/20/2023, 8:21 AM
Same as @Mathilde Lavacquery, we still don't have a simple way to connect using credentials from the catalog when running on a Databricks cluster. Environment variables do not work if you read from credentials.yml (unless you edit your data type, I guess). Could you clarify how you recommend dealing with credentials in a cluster? This is the weakest point we have found in Kedro so far, and we haven't fixed it yet 😞 (Thanks a lot in advance!)

Jannic Holzer

02/20/2023, 10:23 AM
Hey Toni, thanks very much for bringing this up, this is interesting. I don't think we have official advice for this yet. I'll open a ticket to investigate / create a solution for this, though it would be good to know: Why doesn't setting the environment variables for the cluster through the DB UI (like this) work in this case? My understanding is that the workers would be able to read these env variables
❤️ 1
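
For readers following the environment-variable suggestion, here is a minimal sketch of the general idea in plain Python. It does not use any Kedro API, and the thread does not confirm how the resulting dictionary would be handed to Kedro (for example through a custom config loader or a hook). The `KEDRO_CRED_` prefix, the `NAME__FIELD` naming scheme, and the `credentials_from_env` helper are all hypothetical names chosen for illustration.

```python
import os


def credentials_from_env(prefix="KEDRO_CRED_", environ=None):
    """Group variables named <prefix><NAME>__<FIELD> into {name: {field: value}}.

    The prefix and the double-underscore separator are assumptions made for
    this sketch; they are not a Kedro or Databricks convention.
    """
    environ = os.environ if environ is None else environ
    credentials = {}
    for key, value in environ.items():
        if not key.startswith(prefix):
            continue
        name, sep, field = key[len(prefix):].partition("__")
        if not sep or not name or not field:
            continue  # skip variables that do not follow the naming scheme
        credentials.setdefault(name.lower(), {})[field.lower()] = value
    return credentials


# Example with a fake environment, so nothing real is read or printed.
example_env = {
    "KEDRO_CRED_BLOB_STORAGE__ACCOUNT_NAME": "myaccount",
    "KEDRO_CRED_BLOB_STORAGE__ACCOUNT_KEY": "not-a-real-key",
    "PATH": "/usr/bin",
}
print(credentials_from_env(environ=example_env))
```

The nested result has the same shape as an entry in credentials.yml (a credentials name mapping to key/value pairs), which is why this layout was chosen. On a cluster, calling `credentials_from_env()` with no arguments would read from `os.environ` instead of the fake environment.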

datajoely

02/20/2023, 11:56 AM
so this is the best way to do this via env vars today
it will get better shortly
🥂 1

Toni - TomTom - Madrid

02/21/2023, 11:39 AM
thanks a lot, let me share this with the team 🙂