Luiz Henrique Aguiar12/19/2022, 1:15 PM
Error: py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext. : org.apache.spark.SparkException: In Databricks, developers should utilize the shared SparkContext instead of creating one using the constructor. In Scala and Python notebooks, the shared context can be accessed as sc. When running a job, you can access the shared context by calling SparkContext.getOrCreate(). The other SparkContext was created at: CallSite(SparkContext at DatabricksILoop.scala:353, org.apache.spark.SparkContext.<init>(SparkContext.scala:114))
I've tried deleting the hook and making the Spark settings directly on the cluster, without success. I've also tried configuring it directly in the SparkSession, again without success. I also followed the documentation's instructions for using a repository inside Databricks, but since the base project doesn't use this hook, it didn't hit the error. Has anyone seen a similar error? I thought I could run it by turning the project into a wheel, but I can't use "kedro package" since the project can't run inside Databricks. I would be grateful for any ideas, thank you!
sc = SparkContext(conf=spark_conf, appName="Kedro")
_spark_session = (
    SparkSession.builder
    .appName(context._package_name)
    .enableHiveSupport()
    .master("local[*,4]")
    .getOrCreate()
)
_spark_session.sparkContext.setLogLevel("WARN")
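A minimal sketch of the change the error message asks for: drop the `SparkContext(...)` constructor call and the hard-coded `.master("local[*,4]")` (which conflicts with the cluster master Databricks already set), and attach to the shared context via `SparkSession.builder.getOrCreate()` instead. The `spark_conf` name is assumed from the snippet above; on Databricks this is environment-dependent and won't run outside a Spark-capable environment:

```python
# Sketch of a Kedro hook for Databricks: do NOT call SparkContext(...) --
# getOrCreate() attaches to the SparkContext the platform already created.
from pyspark import SparkConf
from pyspark.sql import SparkSession

spark_conf = SparkConf()  # assumed: populated from your project's spark config

_spark_session = (
    SparkSession.builder
    .config(conf=spark_conf)
    .appName("Kedro")
    .enableHiveSupport()
    .getOrCreate()  # reuses the shared Databricks SparkContext
)
_spark_session.sparkContext.setLogLevel("WARN")
```

Note that no `.master(...)` is set at all here: on Databricks the master is managed by the cluster, and overriding it locally is what triggers a second-context creation attempt.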
Pavel Jenikovsky12/19/2022, 10:18 PM
That would be the way to go. You might rather want to create a Kedro session as described in https://kedro.readthedocs.io/en/stable/deployment/databricks.html
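For reference, the approach in that page boils down to running the project from a notebook cell through a KedroSession rather than the CLI. A rough sketch, assuming Kedro 0.18-style APIs; the project path and name are placeholders, not from the original thread:

```python
# Hypothetical Databricks notebook cell: run a Kedro project in-process,
# so the pipeline picks up the notebook's shared Spark context instead of
# trying to create its own.
from kedro.framework.session import KedroSession
from kedro.framework.startup import bootstrap_project

project_root = "/dbfs/FileStore/my-kedro-project"  # placeholder location

bootstrap_project(project_root)          # register the project's settings
with KedroSession.create(project_path=project_root) as session:
    session.run()                        # executes the default pipeline
```

This sidesteps "kedro package" entirely: the project is checked out (e.g. via Databricks Repos) and executed where the shared `sc` already exists.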
Luiz Henrique Aguiar12/20/2022, 4:01 PM