# questions
Hi, everyone! I have a question about how to use Kedro inside Databricks: whenever I try a "kedro run" in the repository, a Spark-related error occurs. Apparently, Databricks' native Spark is conflicting with the Spark set up inside the project, in a Hook (in the code below, you can see the Hook definition and the error).
sc = SparkContext(conf=spark_conf, appName="Kedro")
_spark_session = SparkSession(sc)
Error:

py4j.protocol.Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: org.apache.spark.SparkException: In Databricks, developers should utilize the shared SparkContext instead of creating one using the constructor. In Scala and Python notebooks, the shared context can be accessed as sc. When running a job, you can access the shared context by calling SparkContext.getOrCreate(). The other SparkContext was created at:
CallSite(SparkContext at DatabricksILoop.scala:353, org.apache.spark.SparkContext.<init>(SparkContext.scala:114))

I've tried deleting the Hook and making the Spark settings directly in the cluster, without success. I've also tried configuring it directly in the Spark session, again without success. I followed the instructions in the documentation for using a repository within Databricks, but since the base project does not use this Hook, it did not raise the error. Has anyone had a similar error? I thought I could run it if I turned the project into a wheel, but I can't use "kedro package" since the project can't run inside Databricks. I would be grateful for any ideas, thank you!
Hi Luiz, Databricks already has a Spark session, so you might want to add a condition to your hooks that skips creating a new session if one already exists or when you are executing on Databricks. I do that (also for other reasons) by checking for the existence of the env variable DATABRICKS_RUNTIME_VERSION (that one looked like the most telling). That way we can still run our Kedro pipeline outside of Databricks without any ad-hoc adjustments. By the way, I don't think that
kedro run
will be the way to go. You might rather want to create a Kedro session as described in https://kedro.readthedocs.io/en/stable/deployment/databricks.html
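A minimal sketch of the detection described above, assuming the `DATABRICKS_RUNTIME_VERSION` check; the hook class, method, and variable names are illustrative, not the asker's actual code, and the pyspark/Kedro parts are shown as comments since they need a real cluster and project to run:

```python
import os


def running_on_databricks() -> bool:
    """Detect a Databricks runtime via the env variable it sets."""
    return "DATABRICKS_RUNTIME_VERSION" in os.environ


# Inside a Kedro hook, the branch might look like this (hypothetical sketch):
#
# class SparkHooks:
#     @hook_impl
#     def after_context_created(self, context):
#         if running_on_databricks():
#             # Reuse the shared session Databricks already provides.
#             spark = SparkSession.builder.getOrCreate()
#         else:
#             # Only create a fresh context/session off Databricks.
#             sc = SparkContext(conf=spark_conf, appName="Kedro")
#             spark = SparkSession(sc)
```

The same helper lets the pipeline run unchanged both locally and on a cluster, which is the "no ad-hoc adjustments" point made above.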
Actually, you can pass an env variable in and check it in the context in a hook to find out whether you're on Databricks: https://kedro.readthedocs.io/en/stable/kedro_project_setup/configuration.html#additional-configuration-environments
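That alternative would mean running with a dedicated configuration environment (e.g. `kedro run --env=databricks`) and branching on the active environment name inside the hook. A minimal sketch, where the `databricks` env name and the helper are assumptions:

```python
from typing import Optional


def use_existing_spark_session(kedro_env: Optional[str]) -> bool:
    """True when the run targets a hypothetical 'databricks' config
    environment, where the shared SparkSession should be reused
    instead of a new one being created."""
    return kedro_env == "databricks"


# In a hook, the env name would come from the context, e.g. (commented,
# as it needs a real Kedro project):
#
#     @hook_impl
#     def after_context_created(self, context):
#         if use_existing_spark_session(context.env):
#             ...  # reuse the shared session
```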
Thanks! I think the problem was the "kedro run" invocation. Now it's working without Spark errors 🙂