Mark Druffel
03/11/2024, 9:13 PM
spark.DeltaTableDataset on a newer version by installing kedro-datasets w/o dependencies, as mentioned here. I'm trying to load my kedro session in Databricks w/ line magic:
%load_ext kedro.ipython
%reload_kedro
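(For context, a no-dependencies install like the one mentioned above would look roughly like this; the exact package spec is illustrative:)

```shell
# Install kedro-datasets without letting pip touch its dependencies,
# so Databricks' pre-installed packages stay as they are
pip install --no-deps kedro-datasets
```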
I got an error saying No module named 's3fs'. I'm using Databricks on Azure w/ adlfs. Do I still need s3fs for credentials.yml to work? This is the catalog.yml entry causing the error:
# Just testing on vanilla spark right now to get to a version that I know works
raw.media_meas_campaign_info:
  type: spark.SparkDataset
  filepath: abfss://~/media_meas_campaign_info/
  file_format: parquet
  credentials: azure_group
  load_args:
    header: True
    inferSchema: True
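(For reference, the azure_group credentials entry referenced above would live in credentials.yml and look something like this; the key names are assumptions based on typical adlfs storage options, not the actual config from this thread:)

```yaml
# conf/local/credentials.yml -- illustrative sketch; the keys you need
# depend on how you authenticate to ADLS (account key, SAS token, etc.)
azure_group:
  account_name: mystorageaccount   # hypothetical storage account name
  account_key: <storage-account-key>
```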
I installed s3fs and everything works fine. I was just surprised, because I assumed I would only need it if I was on AWS, and I got a bunch of errors when trying to install s3fs>=2021.4,<2024.1 because Databricks already had some underlying dependencies on a newer version that it wouldn't let pip uninstall.
Deepyaman Datta
03/11/2024, 9:15 PM