Can I use an existing Databricks cluster with kedr...
# questions
a
Can I use an existing Databricks cluster with kedro-databricks? By default, kedro-databricks tries to create new resources, which my account doesn’t have permission to do. Is there a way to specify an existing cluster ID instead? I tried editing
conf/dev/databricks.yml
with the following code:
default:
  tasks:
  - existing_cluster_id: 0924-121047-3jcdtqh1
But running
kedro databricks bundle --overwrite
raises an error:
AssertionError: lookup_key task_key not found in updates: [{'existing_cluster_id': '0924-121047-3jcdtqh1'}]
It should work - haven't used this in a while but from docs it's task_keys instead of tasks, maybe you have a typo?
a
Thank you, Nok! Helpful as always. Now I don't get this error. But it still tries to create a cluster and I still get a Permission denied error. My current
databricks.yml
looks like this:
default:
  tasks:
  - task_key: default
    existing_cluster_id: 0924-121047-3jcdtqh1
n
So the kedro databricks bundle --overwrite command works, and you have already run the deploy command?
a
yes
The default yaml looks like this:
default:
  job_clusters:
  - job_cluster_key: default
    new_cluster:
      node_type_id: Standard_DS3_v2
      num_workers: 1
      spark_env_vars:
        KEDRO_LOGGING_CONFIG: /\${workspace.file_path}/conf/logging.yml
      spark_version: 15.4.x-scala2.12
  tasks:
  - job_cluster_key: default
    task_key: default
And it tries to create a cluster like that
n
Check whether some part of the bundle is still creating a cluster instead of using the existing one. It may help to reduce this to a single task to debug first, then basically do a binary search to see which part of the config is problematic.
Check the bundle definition
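One way to do that check is sketched below: a small Python helper that lists every task in a bundle definition that does not set existing_cluster_id. It assumes the usual Databricks Asset Bundle layout (resources -> jobs -> job name -> tasks); the job name my_project, the task keys node_a and node_b, and the function name are hypothetical and only there for illustration.

```python
from typing import Any


def tasks_not_on_existing_cluster(bundle: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (job_name, task_key) pairs for tasks without existing_cluster_id."""
    offenders = []
    jobs = bundle.get("resources", {}).get("jobs", {})
    for job_name, job in jobs.items():
        for task in job.get("tasks", []):
            # A task without existing_cluster_id will run on a job cluster
            # (or another compute setting), which is what needs fixing here.
            if "existing_cluster_id" not in task:
                offenders.append((job_name, task.get("task_key", "<no task_key>")))
    return offenders


# Hypothetical bundle content, shaped like the default config in this thread.
example_bundle = {
    "resources": {
        "jobs": {
            "my_project": {
                "job_clusters": [
                    {"job_cluster_key": "default", "new_cluster": {"num_workers": 1}}
                ],
                "tasks": [
                    {"task_key": "node_a", "job_cluster_key": "default"},
                    {"task_key": "node_b", "existing_cluster_id": "0924-121047-3jcdtqh1"},
                ],
            }
        }
    }
}

print(tasks_not_on_existing_cluster(example_bundle))
```

To run it against a real project, the dict could be loaded from the generated bundle YAML with yaml.safe_load (this assumes PyYAML is installed). An empty result would suggest that every task points at the existing cluster.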
a
At first glance, it seems it isn't using the new bundle definitions but some old ones, and I'm not sure where it gets them from.
n
If you have deployed the Databricks Asset Bundle, I think you can see those job definitions directly in the Databricks UI.
a
Okay, this was my mistake: I didn't deploy them after the changes.
Thank you Nok. You are the best!
j
Sorry I hadn’t seen this. Thank you @Nok Lam Chan for helping out 😊