Elena Mironova
08/01/2023, 1:24 PM
With kedro-datasets==1.5.0, our CI started failing during system tests which do a kedro run for a pipeline with Spark (see the screenshot). As far as I can see, SparkDataSet is still defined with the same name as before. When we used kedro-datasets==1.4.2, the same tests ran smoothly. I also couldn't find anything specific in the release notes. Do we have to update our code (maybe some import statements, or how it is specified in the requirements)?

Deepyaman Datta
08/01/2023, 1:47 PM
kedro_datasets in the apparent broken import

Elena Mironova
08/01/2023, 3:38 PM
_csv: &csv
  type: spark.SparkDataSet
  file_format: csv
  load_args:
    sep: ","
    header: True
    inferSchema: True
  save_args:
    header: True
    mode: overwrite

prm_observation_time_frame:
  <<: *csv
  filepath: data/03_primary/prm_observation_time_frame.csv
  layer: primary

what confused me the most was that in 1.4.2 it worked

Deepyaman Datta
08/01/2023, 4:18 PM
kedro_datasets.

Erwin
08/01/2023, 4:23 PM
kedro-datasets[spark-sparkdataset]~=1.5
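
The thread uses two spellings of the extra, `spark.SparkDataSet` and `spark-sparkdataset`. As a hedged aside, and only an assumption about how the two relate: PEP 685 defines a normalization for extra names (lowercase, with runs of `-`, `_` and `.` collapsed to a single `-`), under which the first spelling maps to the second. The helper name `normalize_extra` below is hypothetical and is not part of pip or kedro-datasets. Whether a given pip or build backend applies this normalization is not confirmed anywhere in this thread.

```python
import re


def normalize_extra(name: str) -> str:
    """Normalize an extra name as described in PEP 685.

    Runs of '-', '_' and '.' become a single '-', and the result is lowercased.
    This is an illustrative helper, not a pip or kedro-datasets API.
    """
    return re.sub(r"[-_.]+", "-", name).lower()


if __name__ == "__main__":
    # The two spellings that appear in this thread.
    print(normalize_extra("spark.SparkDataSet"))
    print(normalize_extra("spark-sparkdataset"))
```

If both spellings normalize to the same string, a tool that normalizes extras would treat them as the same extra, while a tool that compares the raw strings would not.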
Deepyaman Datta
08/01/2023, 4:25 PM
spark-sparkdataset extra I wonder?
__all__ is getting populated in the discovery here (I thought I did check it when implementing, but not sure if something isn't working as expected); otherwise, nothing seems like it shouldn't work in my cursory pass through...

Erwin
08/01/2023, 4:28 PM

Deepyaman Datta
08/01/2023, 4:29 PM
spark.SparkDataSet extra on kedro-datasets will do nothing.

Nok Lam Chan
08/01/2023, 4:45 PM
kedro-datasets[spark-sparkdataset]~=1.5 is what we intended, could be a temporary fix.

Elena Mironova
08/01/2023, 4:57 PM
setup.cfg of the starter, exactly the same as it was before, so I'd assume that the correct extras are installed (however, I can't confirm 100%, because our CI commands only list full packages through pip freeze)

Nok Lam Chan
08/01/2023, 5:00 PM

Elena Mironova
08/02/2023, 7:13 AM

Nok Lam Chan
08/15/2023, 12:02 PM
kedro-datasets[spark-sparkdataset]
This will not be supported since it was added unintentionally.

Elena Mironova
08/15/2023, 12:24 PM
kedro-datasets in requirements, without optional extras?

Nok Lam Chan
08/15/2023, 1:46 PM
pip install kedro-datasets[spark.SparkDataSet]
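
Earlier in the thread it was noted that `pip freeze` only lists full packages, so it cannot confirm which extras were requested. As a rough sketch, the extras that an installed distribution declares can be read from its metadata with the standard-library `importlib.metadata`. The helper name `declared_extras` is hypothetical. This shows what the installed package advertises as `Provides-Extra`; it does not show which extras were actually selected at install time.

```python
from importlib.metadata import PackageNotFoundError, metadata


def declared_extras(dist_name):
    """Return the sorted extras declared by an installed distribution.

    Returns None if the distribution is not installed.
    Illustrative helper only, not part of pip or kedro-datasets.
    """
    try:
        meta = metadata(dist_name)
    except PackageNotFoundError:
        return None
    # "Provides-Extra" may appear zero or more times in the core metadata.
    return sorted(meta.get_all("Provides-Extra") or [])


if __name__ == "__main__":
    # Prints a list of extra names, or None when kedro-datasets is absent.
    print(declared_extras("kedro-datasets"))
```

Running this in the CI environment for both 1.4.2 and 1.5.0 would show how the declared extra names are spelled in each release.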