# questions
**d:**
Friends, I'm starting to use `kedro` with `uv`. If I start the project with PySpark, I get an error. Here are the steps to reproduce. Start by running:
```shell
uvx kedro new
```
When prompted, I choose the option to install all tools (this includes PySpark). The project is created. I go into the directory and run:
```shell
uv run ipython
```
Inside ipython, if I try `%load_ext kedro.ipython`, I get the error:
```
The operation couldn't be completed. Unable to locate a Java Runtime.
Please visit http://www.java.com for information on installing Java.

/Users/davi/test/.venv/lib/python3.11/site-packages/pyspark/bin/spark-class: line 97: CMD: bad array subscript
head: illegal line count -- -1

Traceback (most recent call last):
  in <module>:1

  /Users/davi/test/.venv/lib/python3.11/site-packages/IPython/core/interactiveshell.py:2482 in run_line_magic
    2479   if getattr(fn, "needs_local_scope", False):
    2480       kwargs['local_ns'] = self.get_local_scope(stack_depth)
    2481   with self.builtin_trap:
  ❱ 2482       result = fn(*args, **kwargs)
...
PySparkRuntimeError: [JAVA_GATEWAY_EXITED] Java gateway process exited before sending its port number.
```
Any idea on what might be happening? BTW, I'm on a Mac.
**r:**
The error seems to be due to an unavailable JRE. Is this the first time you are using PySpark, or did this only start happening now? Could you please check your `conf/base/spark.yml`?
If you have Java configured on your machine and you also ran `pip install -r requirements.txt` for the project, then with the default `spark.yml` below there should not be any issues. Please check that a JRE is available (`java -version`):
```yaml
spark.driver.maxResultSize: 3g
spark.hadoop.fs.s3a.impl: org.apache.hadoop.fs.s3a.S3AFileSystem
spark.sql.execution.arrow.pyspark.enabled: true

# https://docs.kedro.org/en/stable/integrations/pyspark_integration.html#tips-for-maximising-concurrency-using-threadrunner
spark.scheduler.mode: FAIR
```
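As a quick sanity check before loading the extension, you can verify from Python that a `java` executable is actually reachable on `PATH` — a minimal sketch (the function name `java_available` is just illustrative, not part of Kedro or PySpark):

```python
import shutil
import subprocess

def java_available() -> bool:
    """Return True if a `java` executable is on PATH and runs successfully."""
    java = shutil.which("java")
    if java is None:
        return False
    try:
        # `java -version` prints its banner to stderr; capture both streams
        subprocess.run([java, "-version"], capture_output=True, check=True)
        return True
    except (subprocess.CalledProcessError, OSError):
        return False

if java_available():
    print("JRE found; PySpark should be able to start its Java gateway.")
else:
    print("No working JRE on PATH; install OpenJDK first.")
```

If this prints that no JRE was found, PySpark's `[JAVA_GATEWAY_EXITED]` error is expected.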
**d:**
Thanks, @Ravi Kumar Pilla. I switched to a Mac recently and had not installed OpenJDK.
**r:**
Installing it should fix the issue. If you still face issues, please let us know. Thank you!
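For reference, a hedged sketch of the usual macOS install steps via Homebrew — the version (17) and the Apple Silicon paths are illustrative, so it prints the commands for review rather than running them (they need Homebrew and sudo):

```shell
# Print the typical Homebrew OpenJDK setup commands (illustrative, not executed).
install_hint() {
  echo 'brew install openjdk@17'
  # Homebrew's openjdk is keg-only; symlink it so macOS tools can find it:
  echo 'sudo ln -sfn /opt/homebrew/opt/openjdk@17/libexec/openjdk.jdk /Library/Java/JavaVirtualMachines/openjdk-17.jdk'
  echo 'java -version'
}

install_hint
```

After installing, restart the shell so `java` is picked up, then re-run `uv run ipython` and `%load_ext kedro.ipython`.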
**d:**
It worked like a charm. Sorry for the inconvenience!