Hey guys! Today I'm using kedro 0.19.6 with iris-...
# questions
e
Hey guys! Today I'm using kedro 0.19.6 with the iris-databricks starter (https://github.com/kedro-org/kedro-starters/blob/db79aec64c4a0f062321bd8c74ee78275[…]-iris/%7B%7B%20cookiecutter.repo_name%20%7D%7D/requirements.txt), which has this in its requirements.txt:
kedro-datasets[spark.SparkDataset, pandas.ParquetDataset]>=1.0
I got the following warnings today:
WARNING: kedro-datasets 3.0.1 does not provide the extra 'pandas.parquetdataset'
WARNING: kedro-datasets 3.0.1 does not provide the extra 'spark.sparkdataset'
d
Can you try kedro-datasets[pandas-parquetdataset]?
j
it's a bug in the starter indeed!
d
I don't fully understand to be honest: why does pandas.ParquetDataset not get normalized to pandas-parquetdataset, such that users can keep writing with the dot? @Juan Luis feel like you know 🙂
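For context, the normalization in question is the one described in PEP 685, which reuses the PEP 503 name rule: runs of ".", "-" and "_" collapse to a single "-" and the result is lowercased. A minimal Python sketch of that rule (the helper name normalize_extra is made up for illustration, it is not a pip or kedro API):

```python
import re


def normalize_extra(name: str) -> str:
    """Normalize an extra name following the rule described in PEP 685.

    Hypothetical helper for illustration only: collapse runs of '-', '_'
    and '.' into a single '-', then lowercase the result.
    """
    return re.sub(r"[-_.]+", "-", name).lower()


# The two spellings from the starter's requirements.txt
print(normalize_extra("pandas.ParquetDataset"))  # pandas-parquetdataset
print(normalize_extra("spark.SparkDataset"))     # spark-sparkdataset
```

Whether the installer applies this rule for you depends on the tool and its version, so this only shows what the two spellings are expected to collapse to.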
e
Should I change from
kedro-datasets[spark.SparkDataset, pandas.ParquetDataset]>=1.0
to
kedro-datasets[spark-sparkdataset, pandas-parquetdataset]>=1.0
? Or is it expected to get normalized?
👍 1
d
Try changing it and see if it works 🙂
n
I think it depends on your pip version.
n
Using the normalised form is preferred imo. If you bump your pip version, I expect it to get normalised automatically.
j
agreed with @Nok Lam Chan, it depends on the pip version
e
Great, however this is a Databricks runtime. I'm not sure if I can, or should, modify the pip version, given that Databricks runtimes are supposed to be "well tested" environments.
🤮 1
n
The change was introduced in 23.x IIRC; that was the motivation for us to switch to the normalised form.
d
Using the normalised form is preferred imo
I feel like the whole point of normalization is that you don't have to do the "preferred" thing. Since
mypackage-somereallyreallylongandunreadabledataset
is harder to read than
mypackage.SomeReallyReallyLongAndUnreadableDataset
🙂 But point taken on it being introduced recently to pip
(and that other managers may not fully conform to normalization rules)
j
specifically, https://github.com/pypa/pip/issues/11715 which shipped PEP 685
d
So not even released (non-beta) maybe?
j
n
But point taken on it being introduced recently to pip
I understand the argument. Like you said, there are many package managers and we have no control over the version the user is using. pip is usually considered a build requirement, so we cannot pin pip>=23.3 as part of the requirements. Another thing that I don't understand is that pip install kedro[notexist] works fine. It just ignores the extra, so with an older pip version it's very hard to know whether you actually installed the correct version or not.
^ this caused a lot of problems before, as CI / pip install all looks fine until the pipeline fails
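One way to catch a silently ignored extra in CI is to compare the extras you asked for against the ones the distribution declares. A rough sketch, assuming the declared names come from the Provides-Extra metadata fields; missing_extras and declared_extras are hypothetical helper names, and the sample lists below are illustrative rather than real kedro-datasets metadata:

```python
import re
from importlib import metadata


def normalize_extra(name):
    # Same rule PEP 685 describes: collapse '-', '_', '.' runs and lowercase.
    return re.sub(r"[-_.]+", "-", name).lower()


def missing_extras(requested, provided):
    """Return the requested extras that are not declared, after normalization."""
    declared = {normalize_extra(p) for p in provided}
    return [r for r in requested if normalize_extra(r) not in declared]


def declared_extras(dist_name):
    """Read the Provides-Extra entries of an installed distribution."""
    return metadata.metadata(dist_name).get_all("Provides-Extra") or []


# Illustrative data only; in CI you would pass declared_extras("kedro-datasets")
# as the second argument instead of a hard-coded list.
sample_declared = ["pandas-parquetdataset", "spark-sparkdataset"]
print(missing_extras(["pandas.ParquetDataset", "notexist"], sample_declared))
# ['notexist']
```

This only flags names the package does not declare at all (like notexist); it doesn't prove that an older pip actually installed the dependencies behind an extra.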