# questions
c
Hey team, I'm getting the following exception:
An exception occurred when parsing config for dataset 'raw_web@delta':
Object 'DeltaTableDataSet' cannot be loaded from 'kedro.extras.datasets.spark'. Please see the documentation on how to install relevant dependencies for kedro.extras.datasets.spark.DeltaTableDataSet:
<https://kedro.readthedocs.io/en/stable/kedro_project_setup/dependencies.html>
If I go to the link, it doesn't tell me what I should be installing. I already have the following in the requirements.txt:
kedro-datasets[spark,delta]
d
There are two delta datasets; you need to declare them explicitly:
pandas.DeltaTableDataSet
spark.DeltaTableDataSet
j
hey @Cory Maklin, not sure if it's related but we're having some issues with the additional dependencies in kedro-datasets these days.
j
can you try
$ pip install "kedro-datasets[spark-base,hdfs-base,s3fs-base]" "delta-spark>=1.0, <3.0"
and try again?
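If it still fails after that, a quick check like the one below can show which underlying import is actually missing, since the Kedro error message hides it. This is only a sketch: the import names (`pyspark`, `delta`, `hdfs`, `s3fs`) are my assumption of what those extras and `delta-spark` install, and `find_missing` is a made-up helper, not part of Kedro.

```python
import importlib


def find_missing(modules):
    """Return {module_name: error message} for every module that fails to import."""
    missing = {}
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError as exc:
            missing[name] = str(exc)
    return missing


if __name__ == "__main__":
    # Assumed import names: pyspark (spark-base), delta (delta-spark),
    # hdfs (hdfs-base), s3fs (s3fs-base).
    for name, error in find_missing(["pyspark", "delta", "hdfs", "s3fs"]).items():
        print(f"{name}: {error}")
```

Anything it prints is a module that could not be imported in the current environment; no output means all four imported fine.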
c
Thanks
I'll give it a try
j
also @Cory Maklin where did you see the `pip install kedro-datasets[spark,delta]` instruction? we have a `[spark]` extra but I don't see we ever had a `[delta]` one, I could be wrong though
c
I just added `[delta]` since it wasn't working
j
fair enough, thanks 👍🏼
we're about to fix the `[spark]` issue, the workaround I gave you should suffice in the meantime. otherwise let us know
c
pip install "kedro-datasets[spark-base,hdfs-base,s3fs-base]"
This fixed it
`kedro-datasets[spark]` isn't enough apparently
j
yeah it's broken atm. kedro-datasets next release (today or Friday) will fix it
hey @Cory Maklin, `kedro-datasets` 1.5.3 fixed this issue, so hopefully next time `pip install kedro-datasets[spark]` will suffice, let us know otherwise
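To confirm an environment already has the fixed release, something like this should work. It is a rough sketch: the helper names are hypothetical, and the version comparison assumes plain numeric versions such as `1.5.3` (no pre-release suffixes).

```python
from importlib.metadata import PackageNotFoundError, version


def version_tuple(text):
    """Turn a plain numeric version such as '1.5.3' into (1, 5, 3)."""
    # Assumption: no suffixes like 'rc1'; those would raise ValueError here.
    return tuple(int(part) for part in text.split("."))


def is_at_least(installed, minimum):
    """Compare two plain numeric version strings."""
    return version_tuple(installed) >= version_tuple(minimum)


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("kedro-datasets")
    if found is None:
        print("kedro-datasets is not installed")
    elif is_at_least(found, "1.5.3"):
        print(f"kedro-datasets {found} includes the fix")
    else:
        print(f"kedro-datasets {found} predates 1.5.3, consider upgrading")
```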