# questions
g
hey! has anyone managed to use files on the Databricks filesystem as a data source? I'm getting DatasetError: No partitions found in '/dbfs/FileStore/myproject/queries/', but the files are there
n
Can you give a bit more context? What dataset are you using, how are you using it, what version of kedro/kedro-datasets?
1. DBFS is not recommended by Databricks anymore; I think they have switched to Unity Catalog in general. 2. If you are using a versioned dataset, it may not work very well with this, as Spark uses its own authentication on Databricks with some magic filepath handling that fsspec (what Kedro uses for versioned datasets) does not implement.
Try the Databricks dataset instead if this doesn't work.
g
hey Nok, sure, it's a partitioned text dataset. not versioned.
myns.mydataset:
  type: partitions.PartitionedDataset
  path: ${_mypath}
  dataset: text.TextDataset
should it normally work on /dbfs/ ?
n
Interesting, I'd expect it to. Could you first test with a single text.TextDataset? If that works, then maybe it's an issue with PartitionedDataset.
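Before swapping datasets, it can help to check what the Python process itself can see under that path, since "No partitions found" suggests the listing came back empty. A rough sketch using only the standard library; `list_candidate_files` is a hypothetical helper (not part of Kedro), and the temporary directory and `.sql` file names are made up for the demo. On a cluster you would point it at the real `/dbfs/...` path instead.

```python
from __future__ import annotations

import tempfile
from pathlib import Path


def list_candidate_files(root: str) -> list[str]:
    """Return sorted relative paths of every regular file under root.

    Hypothetical helper: returns an empty list if root is not a directory
    visible to this process.
    """
    base = Path(root)
    if not base.is_dir():
        return []
    return sorted(
        p.relative_to(base).as_posix() for p in base.rglob("*") if p.is_file()
    )


# Demo against a throwaway directory (stand-in for the real path).
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "a.sql").write_text("select 1")
    (Path(tmp) / "sub").mkdir()
    (Path(tmp) / "sub" / "b.sql").write_text("select 2")
    found = list_candidate_files(tmp)
    missing = list_candidate_files(str(Path(tmp) / "does_not_exist"))

print(found)
print(missing)
```

If this comes back empty for the real path while the files show up in the Databricks UI, the problem is probably how the path is exposed to the process rather than the dataset type.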
g
worked after using native dbutils commands for file transfers and dbfs:/ instead of /dbfs/ paths
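For anyone hitting the same thing: the two spellings refer to the same DBFS location, `dbfs:/...` being the URI form and `/dbfs/...` the local mount form. A small sketch of converting between them so a config value can be tried both ways; `to_uri_style` and `to_posix_style` are hypothetical helper names, not Kedro or Databricks APIs.

```python
def to_uri_style(path: str) -> str:
    """Convert '/dbfs/...' to 'dbfs:/...'; URI-style input is returned as-is."""
    if path.startswith("dbfs:/"):
        return path
    if path == "/dbfs" or path.startswith("/dbfs/"):
        return "dbfs:/" + path[len("/dbfs"):].lstrip("/")
    raise ValueError(f"not a DBFS path: {path!r}")


def to_posix_style(path: str) -> str:
    """Convert 'dbfs:/...' to '/dbfs/...'; POSIX-style input is returned as-is."""
    if path == "/dbfs" or path.startswith("/dbfs/"):
        return path
    if path.startswith("dbfs:/"):
        return "/dbfs/" + path[len("dbfs:/"):].lstrip("/")
    raise ValueError(f"not a DBFS path: {path!r}")


print(to_uri_style("/dbfs/FileStore/myproject/queries/"))
print(to_posix_style("dbfs:/FileStore/myproject/queries/"))
```

Paths outside DBFS raise a `ValueError` rather than being guessed at.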
n
It should work with either URI-style or POSIX-style paths. https://docs.databricks.com/aws/en/files/