#plugins-integrations

Mark Druffel

03/11/2024, 9:13 PM
I'm trying to use the Delta dataset from kedro-datasets, spark.DeltaTableDataset, on a newer version by installing kedro-datasets w/o dependencies as mentioned here. I'm trying to load my Kedro session in Databricks w/ line magic:
%load_ext kedro.ipython
%reload_kedro
I got an error saying No module named 's3fs'. I'm using Databricks on Azure w/ adlfs. Do I still need s3fs for credentials.yml to work? This is the catalog.yml entry causing the error:
# Just testing on vanilla spark right now to get to a version that I know works
raw.media_meas_campaign_info:
  type: spark.SparkDataset
  filepath: abfss://~/media_meas_campaign_info/
  file_format: parquet
  credentials: azure_group
  load_args:
    header: True
    inferSchema: True
I installed s3fs and everything works fine. I was just surprised because I assumed I would only need that if I was in AWS, and I got a bunch of errors when trying to install s3fs>=2021.4,<2024.1 because Databricks already had some underlying dependencies on a newer version that it wouldn't let pip uninstall.
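For anyone hitting the same thing after installing without dependencies, here is a minimal sketch for checking which optional filesystem packages are importable before running %reload_kedro. The helper name missing_modules is hypothetical, and the package list is only the two mentioned in this thread; it is not taken from kedro-datasets itself.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be found in this environment."""
    # find_spec returns None for a top-level module that is not installed
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumption: these are just the packages discussed in this thread.
print(missing_modules(["s3fs", "adlfs"]))
```

An empty list means both packages resolve; anything listed would raise the same "No module named ..." error on import.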

Deepyaman Datta

03/11/2024, 9:15 PM
Yes, you're correct. I think I've just addressed that this is an unfortunate artifact that maybe nobody has complained about before? https://kedro-org.slack.com/archives/C03RKP2LW64/p1710191393254779?thread_ts=1710191179.698869&cid=C03RKP2LW64