# questions
Hi Team... we are trying to run a PySpark job on a Dataproc cluster. The following steps were taken (please refer to the screenshot):
1. A wheel file was generated for the project.
2. The wheel file and the conf and logs folders/files were pushed onto the Dataproc cluster.
3. pip install the wheel.
4. Run kedro.
When running kedro, it throws the error below. Can you please help with what we are missing here?
ERROR org.apache.spark.SparkContext: Error initializing SparkContext. org.apache.hadoop.ipc.RemoteException(java.io.IOException): File /user/root/.sparkStaging/application_1677266242748_0002/pyspark.zip could only be written to 0 of the 1 minReplication nodes. There are 0 datanode(s) running and 0 node(s) are excluded in this operation.
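The "could only be written to 0 of the 1 minReplication nodes... 0 datanode(s) running" message comes from HDFS while YARN stages pyspark.zip, which points at the cluster's DataNodes rather than at the Kedro project itself. A minimal sketch to isolate that, assuming pyspark is installed on the Dataproc master node and the job runs on YARN (the app name is hypothetical):

```python
# Minimal sketch: create a SparkSession directly, outside Kedro, to check
# whether SparkContext initialization fails the same way on this cluster.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("staging-smoke-test")  # hypothetical name, anything works
    .master("yarn")                 # same mode a kedro run on Dataproc would use
    .getOrCreate()
)

# If this also fails with the .sparkStaging / minReplication error,
# HDFS has no live DataNodes and the problem is cluster-side, not Kedro.
print(spark.range(10).count())
spark.stop()
```

If the bare SparkSession comes up fine, the next place to look is the project's Spark configuration (for example conf/base/spark.yml in a typical Kedro PySpark setup); if it fails the same way, checking DataNode health on the cluster (for example with hdfs dfsadmin -report on the master node) would be the next step.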
@Deepyaman Datta Hi Deepyaman... any pointers on this, please?
Doesn't look like a Kedro issue to me, sorry.