# questions
j
Hi all, I am running Kedro as a Job in Databricks and I am getting the error in the attached screenshot. It tries to find the configuration in databricks/driver/conf/base, even though the config path is passed (correctly, I hope) in the sys args, as shown in the second attachment. I'm using Databricks Asset Bundles to run it (it's like dbx, but officially supported by Databricks). Thanks in advance for your help! FYI, I have been asking related questions here as well: https://kedro-org.slack.com/archives/C03RKP2LW64/p1702289803851329
🧱 1
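For context, the wheel entry point in the linked deployment guide is what passes --conf-source through to the Kedro session, roughly like this (a sketch of the guide's pattern, not the poster's exact code; the databricks_run name comes from that guide):

```python
# databricks_run.py -- sketch of the wheel entry point pattern from
# the Kedro 0.18.x Databricks deployment guide (approximate)
import argparse

from kedro.framework.project import configure_project
from kedro.framework.session import KedroSession


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument("--env", dest="env", type=str, default="base")
    parser.add_argument("--conf-source", dest="conf_source", type=str, default=None)
    parser.add_argument("--package-name", dest="package_name", type=str, required=True)
    args = parser.parse_args()

    configure_project(args.package_name)
    # conf_source tells the ConfigLoader where conf/ lives; if it is
    # never applied (e.g. a hook builds its own loader), Kedro falls
    # back to <project_path>/conf, which on a Databricks job driver
    # resolves to /databricks/driver/conf.
    with KedroSession.create(env=args.env, conf_source=args.conf_source) as session:
        session.run()
```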
d
So DABs are very new, but the very top of your screenshot says that the
filepath does not exist or not accessible
so we just need to work out what the job can actually see at that working directory.
Perhaps a little script that prints out the cwd, or even the file tree, will help diagnose what the filepath needs to be.
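A minimal sketch of such a script, run as a one-off job or notebook task:

```python
# Diagnostic sketch: print the working directory and a shallow file
# tree so you can see what the Databricks driver can actually access.
import os

cwd = os.getcwd()
print("cwd:", cwd)
for root, dirs, files in os.walk(cwd):
    depth = root[len(cwd):].count(os.sep)
    if depth > 2:  # keep output manageable
        dirs[:] = []
        continue
    indent = "  " * depth
    print(f"{indent}{os.path.basename(root) or root}/")
    for name in files:
        print(f"{indent}  {name}")
```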
j
This is probably where it is trying to look (see attachment). But I wonder what is happening in the background that makes it look for the config there, instead of the location I pointed to. I have tried to follow the steps here as closely as possible, so I would hope it doesn't fail: https://docs.kedro.org/en/0.18.14/deployment/databricks/databricks_deployment_workflow.html On the other hand, I believe DABs work pretty similarly to dbx in the background, so I would hope this is not the issue.
m
You're using an older runtime (13.0 or below); you may need to add "experimental: python_wheel_wrapper: true" to the top of your databricks.yml file
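That flag goes at the top level of the bundle config, something like this (a sketch; the bundle name is illustrative):

```yaml
# databricks.yml (top level) -- bundle name is illustrative
bundle:
  name: wine_model_kedro

# Needed on runtimes below 13.1, which lack native Python wheel support
experimental:
  python_wheel_wrapper: true
```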
j
Thanks @Michał Madej, I tried the "experimental: python_wheel_wrapper: true" proposed in the link, but it didn't work. I upgraded to
spark_version: 13.3.x-cpu-ml-scala2.12
and I get the (new) error shown in the attachment; I made screenshots of the full trace (even more cryptic). Any ideas? Thanks for your quick help so far, guys!
d
have you packaged your
spark.yaml
?
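For context, in a Kedro Spark setup the conf folder usually carries a spark.yml that the hooks read at startup; a minimal sketch (these keys are illustrative, not the poster's config):

```yaml
# conf/base/spark.yml -- minimal sketch; keys are illustrative
spark.driver.maxResultSize: 3g
spark.sql.execution.arrow.pyspark.enabled: true
```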
j
not really... and I wasn't aware from the instructions that I had to. I believe you mean
databricks.yaml
from the DABs? If so, I think that one needs to stay in the root of the repo.
d
value cannot be null for spark.app.name
so the Spark app name in Kedro is usually defined in the SparkHooks
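The usual pattern from the Kedro docs looks roughly like this (a sketch; the poster's hook may differ, and it assumes a "spark" config pattern is registered in settings.py):

```python
# hooks.py -- typical Kedro SparkHooks, per the Kedro docs (sketch)
from kedro.framework.hooks import hook_impl
from pyspark import SparkConf
from pyspark.sql import SparkSession


class SparkHooks:
    @hook_impl
    def after_context_created(self, context) -> None:
        """Initialise a SparkSession from conf/base/spark.yml."""
        # If spark.yml is missing from the packaged conf, these
        # parameters come back empty and Spark settings such as the
        # app name never get populated.
        parameters = context.config_loader["spark"]
        spark_conf = SparkConf().setAll(parameters.items())

        spark_session_conf = (
            SparkSession.builder.appName(context.project_path.name)
            .enableHiveSupport()
            .config(conf=spark_conf)
        )
        _spark_session = spark_session_conf.getOrCreate()
        _spark_session.sparkContext.setLogLevel("WARN")
```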
m
I don't know about spark, but DAB uploads your configuration directory to this location
/Workspace/Users/${workspace.current_user.userName}/.bundle/${bundle.target}/${bundle.name}
so try using that path in
parameters: ["--conf-source", "here", ...]
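In the bundle config that would look something like the sketch below; the job name, task key, entry point, and the files/conf suffix are assumptions about a standard bundle layout, not confirmed by this thread:

```yaml
# databricks.yml -- job task sketch; names and the files/conf suffix
# are assumptions about a standard bundle layout
resources:
  jobs:
    wine_model_job:
      tasks:
        - task_key: run_pipeline
          python_wheel_task:
            package_name: wine_model_kedro
            entry_point: databricks_run
            parameters:
              - "--conf-source"
              - "/Workspace/Users/${workspace.current_user.userName}/.bundle/${bundle.target}/${bundle.name}/files/conf"
              - "--package-name"
              - "wine_model_kedro"
```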
j
Hi all, I just wanted to share the solution that I found in the end. The issue was that the ConfigLoader was trying to read the config folder from the default location at the project root; this happened in an mlflow hook in
hooks.py
. Because the project path resolves to the
databricks/driver
folder in Databricks jobs, it was failing there. The solution was to point the ConfigLoader to the config folder in DBFS, where I had copied it in order to run the pipeline. After that it works! I hope that, as Kedro develops and grows, deployment as Databricks jobs gets better and smoother!
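In code, the fix amounts to something like this (a sketch, not the poster's exact hook; the DBFS path is the one from this thread, the class name is hypothetical):

```python
# hooks.py -- sketch of the fix: build the loader against the conf
# copy in DBFS instead of the default <project_path>/conf
from kedro.config import OmegaConfigLoader
from kedro.framework.hooks import hook_impl


class MlflowHooks:
    @hook_impl
    def after_context_created(self, context) -> None:
        # By default the loader resolves conf relative to the project
        # path, which on a Databricks job driver is /databricks/driver,
        # hence the /databricks/driver/conf/base error.
        config_loader = OmegaConfigLoader(
            conf_source="/dbfs/FileStore/wine_model_kedro/conf",
            # "mlflow" is not a default config pattern, so register it
            config_patterns={"mlflow": ["mlflow*", "mlflow*/**"]},
        )
        mlflow_config = config_loader["mlflow"]
        # ... use mlflow_config to set the tracking URI, experiment, etc.
```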
d
That’s super helpful, thanks for the update
I’d really like to think about how we could provide a better error message
j
I think I just had to look at the error message properly to see that it was jumping into the hook and failing there. The different layers made it a bit distracting to catch. In the end this was true the whole time:
ValueError: Given configuration path either does not exist or is not a valid directory: /databricks/driver/conf/base
It's just that seeing it fail in that directory "felt" complex, but in the trace you could see that it was going through the hooks. It's a bit of extra Databricks complexity that didn't help.
d
and for reference what was the correct directory?
j
I used this location in DBFS, copying the config folder there so it could be read:
/dbfs/FileStore/wine_model_kedro/conf/
👍 1
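For anyone following along, the copy itself can be done from a Databricks notebook with dbutils, along these lines (the source path is illustrative; the destination is the one above):

```python
# Run in a Databricks notebook (dbutils is a notebook built-in):
# copy the project's conf folder into DBFS so the job can read it.
dbutils.fs.cp(
    "file:/Workspace/Repos/<user>/wine_model_kedro/conf",  # illustrative source
    "dbfs:/FileStore/wine_model_kedro/conf",
    recurse=True,
)
```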