# questions
m
Quick question, why is kedro-datasets no longer compatible with Spark 3.4?
d
looking into this
m
Just FYI, have no problems with pyspark==3.4.1 and SparkDataSet from kedro.extras: https://github.com/kedro-org/kedro/blob/0293dc15812b27330bba31a01c7b332b3165af2a/kedro/extras/datasets/spark/spark_dataset.py (Kedro 0.18.12), haven’t tested with kedro-datasets though.
d
but trying to work out why
current hypothesis is that maybe Python 3.11 support was the issue
🤔 1
equally the dates are confusing here, Python 3.11 support was added to PySpark in 3.4.0
m
Yeah, the dates look confusing because Spark provides patch updates for several versions in parallel
👍 2
My guess is that it has something to do with the exception handling in the SparkDataset. There is a deprecation comment in that file. But that should be no reason to remove support for Spark 3.4. In fact, this is something that needs to be fixed in SparkDataset…
👍🏼 1
d
Yes agreed
the person who made this change is at the 🦷 dentist, will get an answer to you shortly
🙃 1
🤪 1
😂 1
😅 1
s
Hi, I believe this was an oversight as we moved between pyproject.toml and setup.py midway through the ticket.
❤️ 1
d
thanks @Sajid Alam
m
So it should be ok with 3.4?
s
Yes!
d
if you want to force it, it will work
m
But it will be fixed with a 1.6.1 release?
s
Yes, there is a PR out for it now: https://github.com/kedro-org/kedro-plugins/pull/323
👍 1
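
For anyone hitting the same thing before the fix is released, here is a minimal sketch of how to check whether an installed pyspark falls inside a dependency pin. The thread never states the actual version bounds, so `HYPOTHETICAL_OLD_PIN` and `HYPOTHETICAL_RELAXED_PIN` below are illustrative assumptions, not the real constraints from kedro-datasets. The comparison is a simplified numeric one and does not implement full PEP 440 rules.

```python
from __future__ import annotations

from importlib import metadata

# ASSUMPTION: these bounds are made up for illustration. The thread does not
# say what the kedro-datasets pin on pyspark actually was or will be.
HYPOTHETICAL_OLD_PIN = ("2.2", "3.4")      # lower inclusive, upper exclusive
HYPOTHETICAL_RELAXED_PIN = ("2.2", "4.0")


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '3.4.1' into (3, 4, 1), dropping suffixes such as 'rc1'."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def within_pin(version: str, lower: str, upper_exclusive: str) -> bool:
    """True if lower <= version < upper_exclusive (numeric parts only)."""
    return parse_version(lower) <= parse_version(version) < parse_version(upper_exclusive)


def installed_version(package: str) -> str | None:
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("pyspark")
    if found is None:
        print("pyspark is not installed in this environment")
    else:
        print("pyspark", found)
        print("inside hypothetical old pin:", within_pin(found, *HYPOTHETICAL_OLD_PIN))
        print("inside hypothetical relaxed pin:", within_pin(found, *HYPOTHETICAL_RELAXED_PIN))
```

With an upper bound written as `<3.4`, every 3.4.x release is excluded, which is the kind of pin that would block pyspark==3.4.1 even though the dataset code itself works. "Forcing it" presumably means installing pyspark separately from the resolver's choice, for example by installing kedro-datasets with pip's `--no-deps` flag; that reading is a guess, the thread does not spell out the method.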