Jorit Studer (11/10/2023, 7:02 AM):
kedro[azure]

.marrrcin (11/10/2023, 7:05 AM):

Jorit Studer (11/10/2023, 7:07 AM):
botocore, which seems to be AWS specific.

Nok Lam Chan (11/10/2023, 7:41 AM):
botocore
https://github.com/kedro-org/kedro-plugins/blob/main/kedro-datasets/setup.py
botocore most likely comes with s3fs; you can look at the specific dataset that you need and add its dependencies to your project.
This means that you cannot do pip install kedro-datasets[spark.SparkDataSet] without also pulling in s3fs (and, through it, botocore).

Jorit Studer (11/10/2023, 7:49 AM):

Nok Lam Chan (11/10/2023, 7:52 AM):
spark_require = {
    "spark.SparkDataSet": [SPARK, HDFS, S3FS],
    ...
}
It’s not really just Spark but also HDFS and S3FS (who still uses HDFS these days?).
We could potentially separate out the storage. In the past, most of our users were using s3; I guess that’s why it’s bundled. From the dependencies point of view it is better to separate it, but it does make the installation a bit longer and is a breaking change: pip install kedro-datasets[spark.SparkDataSet] may become pip install kedro-datasets[spark.SparkDataSet, s3].
Cc @Juan Luis: azure extra dependencies?

Jorit Studer (11/10/2023, 12:46 PM):