Elena Mironova08/01/2023, 1:24 PM
, our CI started failing during system tests which do a
for a pipeline with Spark (see the screenshot). As far as I can see,
is still defined with the same name as before. When we used
the same tests were running smoothly. I also couldn't find anything specific in the release notes. Do we have to update our code (maybe some import statements, or how it is specified in the requirements)?
Deepyaman Datta08/01/2023, 1:47 PM
in the apparent broken import
Elena Mironova08/01/2023, 3:38 PM
what confused me the most was that in 1.4.2 it worked
_csv: &csv
  type: spark.SparkDataSet
  file_format: csv
  load_args:
    sep: ","
    header: True
    inferSchema: True
  save_args:
    header: True
    mode: overwrite

prm_observation_time_frame:
  <<: *csv
  filepath: data/03_primary/prm_observation_time_frame.csv
  layer: primary
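For readers less familiar with YAML anchors: the `<<: *csv` merge key in the catalog entry above copies the keys of the `_csv` anchor into the dataset entry, and keys set on the entry itself take precedence. A rough Python sketch of the resulting dictionary, assuming the catalog values quoted in the message (`merge_entry` is a hypothetical helper for illustration, not a Kedro function):

```python
# Shared defaults, standing in for the `_csv: &csv` anchor in the catalog.
csv_defaults = {
    "type": "spark.SparkDataSet",
    "file_format": "csv",
    "load_args": {"sep": ",", "header": True, "inferSchema": True},
    "save_args": {"header": True, "mode": "overwrite"},
}


def merge_entry(defaults, overrides):
    """Mimic a YAML merge key: keys set on the entry win over the anchor's keys.

    Hypothetical helper; the merge is shallow, as with `<<:` in YAML.
    """
    return {**defaults, **overrides}


# Equivalent of the `prm_observation_time_frame` entry after the merge.
prm_observation_time_frame = merge_entry(
    csv_defaults,
    {
        "filepath": "data/03_primary/prm_observation_time_frame.csv",
        "layer": "primary",
    },
)
```

So the dataset type that has to be resolved for this entry is still the string `spark.SparkDataSet`, regardless of the anchor.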
Deepyaman Datta08/01/2023, 4:18 PM
Erwin08/01/2023, 4:23 PM
Deepyaman Datta08/01/2023, 4:25 PM
extra I wonder?
is getting populated in the discovery here (I thought I did check it when implementing, but not sure if something isn't working as expected); otherwise, nothing seems like it shouldn't work in my cursory pass through...
Erwin08/01/2023, 4:28 PM
Deepyaman Datta08/01/2023, 4:29 PM
will do nothing. 🙂
Nok Lam Chan08/01/2023, 4:45 PM
is what we intended, could be a temporary fix.
Elena Mironova08/01/2023, 4:57 PM
of the starter, exactly the same as it was before, so i'd assume that correct extras are installed (however, can't confirm 100%, cause our CI commands only list full packages through
Nok Lam Chan08/01/2023, 5:00 PM
Elena Mironova08/02/2023, 7:13 AM
Nok Lam Chan08/15/2023, 12:02 PM
This will not be supported since it was added unintentionally.
Elena Mironova08/15/2023, 12:24 PM
in requirements, without optional extras?
Nok Lam Chan08/15/2023, 1:46 PM
pip install "kedro-datasets[spark.SparkDataSet]"
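Since the CI output in this thread could not confirm which extras were actually installed, a small check like the one below might help. It is only a sketch: it tests whether given top-level modules are importable in the current environment, and the assumption that `pyspark` is among the modules the `spark.SparkDataSet` extra pulls in is mine, not something stated in the thread. `missing_modules` is a hypothetical helper name.

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the names from module_names that cannot be found for import."""
    missing = []
    for name in module_names:
        try:
            found = find_spec(name) is not None
        except (ImportError, ValueError):
            # A missing parent package or a module with a broken spec
            # is treated as not importable.
            found = False
        if not found:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # Assumption: pyspark is required by the Spark dataset.
    print(missing_modules(["pyspark"]))
```

Running this as a step in CI right after the install would print an empty list if the module is available, and the missing names otherwise.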