Cory Maklin
08/17/2023, 3:18 PM
An exception occurred when parsing config for dataset 'raw_web@delta':
Object 'DeltaTableDataSet' cannot be loaded from 'kedro.extras.datasets.spark'. Please see the documentation on how to install relevant dependencies for kedro.extras.datasets.spark.DeltaTableDataSet:
<https://kedro.readthedocs.io/en/stable/kedro_project_setup/dependencies.html>
If I go to the link, it doesn't tell me what I should be installing.
I already have the following in the requirements.txt: kedro-datasets[spark,delta]
datajoely
08/17/2023, 3:19 PM
pandas.DeltaTableDataSet
spark.DeltaTableDataSet
Juan Luis
08/17/2023, 3:19 PM
datajoely
08/17/2023, 3:19 PM
Juan Luis
08/17/2023, 3:20 PM
$ pip install "kedro-datasets[spark-base,hdfs-base,s3fs-base]" "delta-spark>=1.0, <3.0"
and try again?
Cory Maklin
08/17/2023, 3:21 PM
Juan Luis
08/17/2023, 4:40 PM
pip install kedro-datasets[spark,delta] instruction? we have a [spark] extra but I don't see we ever had a [delta] one, I could be wrong though
Cory Maklin
08/17/2023, 4:54 PM
[delta] since it wasn't working
Juan Luis
08/17/2023, 4:56 PM
[spark] issue, the workaround I gave you should suffice in the meantime. Otherwise let us know
Cory Maklin
08/17/2023, 5:03 PM
kedro-datasets[spark] isn't enough apparently
Juan Luis
08/18/2023, 6:23 AM
kedro-datasets 1.5.3 fixed this issue, so hopefully next time pip install kedro-datasets[spark] will suffice, let us know otherwise
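
For anyone hitting the same "cannot be loaded" error: a quick way to see which dependency is actually missing is to check whether the relevant modules are importable. The sketch below is only an illustration, not something from the Kedro docs. The helper name `missing_modules` is hypothetical, and the list of modules is an assumption (`delta` is the import name of the `delta-spark` package, `kedro_datasets` that of `kedro-datasets`).

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumption: these are the modules spark.DeltaTableDataSet needs to import.
    required = ["kedro_datasets", "pyspark", "delta"]
    missing = missing_modules(required)
    print("missing:", missing if missing else "none")
```

If `delta` shows up as missing, installing `delta-spark` as in the workaround above is probably the fix; if nothing is missing, the problem likely lies elsewhere.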