# questions
e
Hi, I have a weird error. I have a Databricks workflow which clones the repo and uses a notebook. The steps are:
1. identify where the code was cloned
2. install reqs
3. load catalog to show some data
but it fails:
```
MissingConfigException: No files of YAML or JSON format found in /Workspace/Repos/.internal/02476ba86f_commits/880770e941404812856252f77bc24948806a60c2/conf/base or /Workspace/Repos/.internal/02476ba86f_commits/880770e941404812856252f77bc24948806a60c2/conf/databricks_dev matching the glob pattern(s): ['spark*/']
```
which is weird because the spark file exists.
This way of executing worked until Kedro 0.18.12.
a
Try changing the `config_pattern` to `['spark*']`. I think it might be looking for the spark config in a folder starting with `spark`.
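For reference, a minimal sketch of that change in `settings.py`, assuming the pattern is registered through `CONFIG_LOADER_ARGS` (adjust to however your project actually sets it):

```python
# settings.py -- a sketch, not your exact file
from kedro.config import OmegaConfigLoader

CONFIG_LOADER_CLASS = OmegaConfigLoader
CONFIG_LOADER_ARGS = {
    "config_patterns": {
        # ["spark*/"] globs for directories; ["spark*"] matches spark.yml itself
        "spark": ["spark*"],
    }
}
```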
e
I changed it but still get the same error:
```
File /Workspace/Repos/.internal/02476ba86f_commits/8707aed67bbe02e5476a3b8cea916cb72591f1b4/src/ml_minsur_de_lingo/hooks.py:26, in SparkHooks.after_context_created(self, context)
     21 """Initialises a SparkSession using the config
     22 defined in project's conf folder.
     23 """
     25 # Load the spark configuration in spark.yaml using the config loader
---> 26 parameters = context.config_loader.get("spark")
     28 spark_conf = SparkConf().setAll(parameters.items())
     30 # Initialise the spark session

File /usr/lib/python3.10/_collections_abc.py:819, in Mapping.get(self, key, default)
    817 'D.get(k[,d]) -> D[k] if k in D, else d.  d defaults to None.'
    818 try:
--> 819     return self[key]
    820 except KeyError:
    821     return default

File /local_disk0/.ephemeral_nfs/envs/pythonEnv-56c57ddd-cda0-4310-8d72-02a8d8b856d7/lib/python3.10/site-packages/kedro/config/omegaconf_config.py:208, in OmegaConfigLoader.__getitem__(self, key)
    205 config.update(env_config)
    207 if not processed_files and key != "globals":
--> 208     raise MissingConfigException(
    209         f"No files of YAML or JSON format found in {base_path} or {env_path} matching"
    210         f" the glob pattern(s): {[*self.config_patterns[key]]}"
    211     )
    212 return config

MissingConfigException: No files of YAML or JSON format found in /Workspace/Repos/.internal/02476ba86f_commits/8707aed67bbe02e5476a3b8cea916cb72591f1b4/conf/base or /Workspace/Repos/.internal/02476ba86f_commits/8707aed67bbe02e5476a3b8cea916cb72591f1b4/conf/databricks_dev matching the glob pattern(s): ['spark*']
```
Inside `conf/databricks_dev` there is no spark.yml, but the one from base is supposed to be loaded.
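That is the expected setup; roughly this layout (a sketch; the catalog.yml in the environment folder is a hypothetical override, not from the thread):

```
conf/
├── base/
│   └── spark.yml        # loaded for every environment
└── databricks_dev/      # run environment; only needs files that override base
    └── catalog.yml
```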
a
Ooh, try doing `parameters = config_loader["spark"]` in your `SparkHook`.
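For context, the full hook would then look roughly like this (a sketch reconstructed from the traceback above, following the standard Kedro PySpark hook; only the access to the loader changes):

```python
# hooks.py -- sketch based on the SparkHooks shown in the traceback
from kedro.framework.hooks import hook_impl
from pyspark import SparkConf
from pyspark.sql import SparkSession


class SparkHooks:
    @hook_impl
    def after_context_created(self, context) -> None:
        """Initialises a SparkSession using the config
        defined in project's conf folder.
        """
        # Subscript access, as suggested above, instead of .get()
        parameters = context.config_loader["spark"]
        spark_conf = SparkConf().setAll(parameters.items())

        # Initialise the spark session
        spark_session_conf = (
            SparkSession.builder.appName(context.project_path.name)
            .enableHiveSupport()
            .config(conf=spark_conf)
        )
        _spark_session = spark_session_conf.getOrCreate()
```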
e
That did not work. This is really weird; on my local computer it works fine, even with the flag `--env databricks_dev`. I will try to debug more.
@Ankita Katiyar I was able to reproduce the error:
1. With this path it fails:
`/Users/Erwin_Paillacan/Projects/Workspace/Repos/.internal/02476ba86f_commits/e3754f960754e93d5dadcd17a74901cf7fb65fda/some_project/conf/base`
2. With this one it works fine:
`/Users/Erwin_Paillacan/Projects/Workspace/Repos/internal/02476ba86f_commits/e3754f960754e93d5dadcd17a74901cf7fb65fda/some_project/conf/base`
The only difference between the two paths is `.internal` vs `internal`; when a dot is in the path, the config loader gets confused and fails.
```
No files of YAML or JSON format found in /Workspace/Repos/.internal/02476ba86f_commits/20adcb96c8bec99ac2fad8b78b25158e7d968fa4/conf/base or /Workspace/Repos/.internal/02476ba86f_commits/20adcb96c8bec99ac2fad8b78b25158e7d968fa4/conf/databricks_dev matching the glob pattern(s): ['catalog*', 'catalog*/**', '**/catalog*']
```
In Kedro 0.18.12 with the OmegaConfigLoader, the dot was not a problem.
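A minimal sketch of what seems to be happening (assumption: the loader decides a file is "hidden" by checking every component of its path, so a `.`-prefixed directory anywhere above `conf/` trips the check; `looks_hidden` is a hypothetical stand-in, not Kedro's actual function):

```python
from pathlib import Path

def looks_hidden(path: str) -> bool:
    # Flags the path if ANY component starts with "." -- including
    # parent directories like Databricks' .internal checkout folder
    return any(part.startswith(".") for part in Path(path).parts)

print(looks_hidden("/Workspace/Repos/internal/conf/base/spark.yml"))   # False: file is loaded
print(looks_hidden("/Workspace/Repos/.internal/conf/base/spark.yml"))  # True: file is skipped
```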
a
Ahh, I see, this was a bug we fixed recently: https://github.com/kedro-org/kedro/pull/2977
But I suppose it shouldn't be excluding hidden folders outside of `conf/`. Thanks for opening an issue for it!