Luca Disse
04/19/2023, 2:32 PM
Deepyaman Datta
04/19/2023, 4:07 PM
/dbfs, you are using Databricks dbutils in order to handle the reading. Are you able to access the file using pandas (e.g. pd.read_parquet("/dbfs/whatever/kvi_metrics/item/merged_metrics/merged_metrics.parquet"))? I'm guessing not, so then you need to sort out your ability to access that path without using Databricks.
Also, I need some more info: are you running this from your local machine using some sort of remote execution? Pandas code runs locally and will not execute on the cluster in that case (on the off chance you're doing that).
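A rough way to run the check suggested above, as a sketch rather than a definitive diagnosis: ask the Python process that will run pandas whether it can see the path at all, and only then try to read it. The helper names describe_path and try_read_with_pandas are made up for illustration, the path is the placeholder from the message above, and pd.read_parquet is assumed to have a parquet engine such as pyarrow installed.

```python
import os


def describe_path(path):
    """Report how this Python process sees `path`: 'directory', 'file', or 'missing'."""
    if os.path.isdir(path):
        return "directory"
    if os.path.isfile(path):
        return "file"
    return "missing"


def try_read_with_pandas(path):
    """Try a plain pandas read (hypothetical helper; returns a DataFrame or None)."""
    import pandas as pd  # imported lazily so the path check works without pandas

    kind = describe_path(path)
    if kind == "missing":
        # Typical when the code runs on a laptop: /dbfs is only mounted on the cluster.
        print(f"{path} is not visible from this process")
        return None
    print(f"{path} is visible as a {kind}")
    # Assumes a parquet engine (e.g. pyarrow) is installed alongside pandas.
    return pd.read_parquet(path)


if __name__ == "__main__":
    # Placeholder path copied from the message above; substitute the real one.
    try_read_with_pandas(
        "/dbfs/whatever/kvi_metrics/item/merged_metrics/merged_metrics.parquet"
    )
```

If this prints "not visible" when run from a local machine but works in a notebook attached to the cluster, the code is most likely executing locally, where the /dbfs mount does not exist.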
(Not really related, but are you writing a single partition? The path looks awkward, given I'd expect Spark to create a folder structure under .../merged_metrics/merged_metrics.parquet/, but I could be wrong, since I haven't used Spark in forever.)
Luca Disse
04/20/2023, 1:13 PM