Emilio Gagliardi
08/06/2023, 2:52 AM
kedro.contrib.io.azure.JSONBlobDataSet
which I can't find in the documentation for 0.18.12, only for 0.15.6. Did something change in the way Kedro organizes contrib.io? GPT-4 also said that the built-in Kedro JSON dataset doesn't work on Azure. Any guidance is appreciated. Thanks kindly,
my_partitioned_dataset:
  type: kedro.io.PartitionedDataSet
  path: <your_blob_folder_path>
  credentials: azure_blob_storage
  dataset:
    type: kedro.contrib.io.azure.JSONBlobDataSet  # <- is this valid?
    container_name: <your_container_name>
    credentials: azure_blob_storage
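For comparison, here is a rough sketch of how the same entry might look with the built-in pandas.JSONDataSet mentioned in the reply below, relying on fsspec's abfs protocol instead of a storage-specific dataset. This is an illustration under assumptions, not a confirmed fix: it assumes the adlfs package is installed, that the path takes the form abfs://container/folder, and that the credentials entry uses the account_name and account_key keys; all placeholder values are hypothetical.

```yaml
# conf/base/catalog.yml (sketch; placeholders are hypothetical)
my_partitioned_dataset:
  type: kedro.io.PartitionedDataSet
  # Assumed path form: abfs://<container>/<folder>
  path: abfs://<your_container_name>/<your_blob_folder_path>
  dataset: pandas.JSONDataSet   # each partition is loaded as a DataFrame
  credentials: azure_blob_storage
  filename_suffix: ".json"      # only treat .json blobs as partitions

# conf/local/credentials.yml (sketch; key names assumed to be the ones adlfs accepts)
azure_blob_storage:
  account_name: <your_account_name>
  account_key: <your_account_key>
```

The credentials given to the partitioned dataset are also used for the underlying dataset unless that dataset defines its own, so the nested credentials line may not be needed in this form.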
Deepyaman Datta
08/06/2023, 2:01 PM
JSONBlobDataSet was removed in Kedro 0.16, along with a lot of storage-specific datasets. I don't know why pandas.JSONDataSet shouldn't work; not sure I would trust GPT-4. See https://stackoverflow.com/a/69941391/1093967 for example; fsspec should be able to handle Azure Blob the same way as other storage backends.
Emilio Gagliardi
08/07/2023, 2:43 AM
Deepyaman Datta
08/07/2023, 4:11 AM
pandas.JSONDataSet since that was similar to the behavior of the old kedro.contrib.io.azure.JSONBlobDataSet, which also produces a dataframe.
Emilio Gagliardi
08/08/2023, 8:35 AM
Error loading cleaned-emails-20230806003837.json: Failed while loading data from data set JSONDataSet(filepath=cleaned-emails/cleaned-emails-20230806003837.json, protocol=abfs).
Expected object or value
There is a JSON object in the underlying file... any ideas greatly appreciated!
Nok Lam Chan
08/09/2023, 8:06 PM
Emilio Gagliardi
08/14/2023, 6:57 PM
Nok Lam Chan
08/14/2023, 8:20 PM