# questions
a
Hi everyone, I am facing the following issue when trying to read a CSV with Spark. With pandas it works fine, but with Spark it seems I need some extra configuration. Could you please point me in the right direction? Thank you in advance!
j
welcome @Andreas_Kokolantonakis! I think the file was deleted and cannot be seen
a
yep sorry
Screenshot 2023-05-09 at 17.57.33.png
j
you should add s3a:// to the filepath
sorry! Hi! hehehe
a
tried all the recommended approaches, still getting the same error
I used context.py to create a new session with the S3 configuration, but no luck. Is anyone able to help me further, please? Thanks
j
could you paste the traceback that corresponds to the s3a:// protocol @Andreas_Kokolantonakis?
a
@Juan Luis I used the pyspark starter and added S3AFileSystem as a configuration in the spark.yml file
but still no luck
j
You have a different error now: you are missing the AWS connector in your Spark classpath
j
the error is slightly different now indeed:
Class org.apache.hadoop.fs.s3a.S3AFileSystem not found
(leaving it written for easier copy-paste)
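for reference, a minimal sketch of the settings that usually put that class on the classpath. The helper names `s3a_spark_conf` and `build_session` are made up for illustration, and the version `3.3.4` is only an assumption: hadoop-aws has to match the Hadoop version your Spark build ships with.

```python
from typing import Dict


def s3a_spark_conf(hadoop_aws_version: str) -> Dict[str, str]:
    """Return Spark settings that request the S3A connector (hypothetical helper)."""
    return {
        # Maven coordinates; Spark resolves and downloads the jar when the session starts.
        "spark.jars.packages": f"org.apache.hadoop:hadoop-aws:{hadoop_aws_version}",
        # Map the s3a:// scheme to the class named in the error message.
        "spark.hadoop.fs.s3a.impl": "org.apache.hadoop.fs.s3a.S3AFileSystem",
    }


def build_session(conf: Dict[str, str]):
    """Create a SparkSession with the given settings (requires pyspark)."""
    # Imported lazily so the helper above can be used without pyspark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder.getOrCreate()


# Assumed version, adjust it to your Hadoop build.
conf = s3a_spark_conf("3.3.4")
```

the same two keys could also go in spark.yml, as long as they are applied before the session is first created.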
a
yep, but I am adding this config in spark.yml. How can I make sure that kedro will use ProjectContext(KedroContext) to initialise the Spark session?
j
searching the channel for that error I got: https://kedro-org.slack.com/archives/C03RKP2LW64/p1681471022814169
a
will try to follow this!
hi everyone, how can I make sure that kedro uses spark.yml and context.py when initializing a Spark session? Does it work out of the box, or do I need to point at it somehow? Thanks!
n
@Andreas_Kokolantonakis What version of kedro are you using? If you are using 0.18.x you should be able to do this with https://docs.kedro.org/en/stable/integrations/pyspark_integration.html#initialise-a-sparksession-using-a-hook
j
I think it's the other Andreas @Andreas_Kokolantonakis 😄
a
@Haris Michailidis
n
😅sorry I need to work on my skills at tagging people…