Hi all, I am having trouble loading an ADLS2 cont...
# questions
t
Hi all, I am having trouble loading an ADLS2 container with the `abfs://` protocol. I am providing it as follows:
raw_dataset:
  type: pandas.CSVDataset
  filepath: "abfs://container/file.csv"
  credentials: azure_credentials
with azure credentials being:
azure_credentials:
  account_name: "name_datastore"  # from the Data-container page in Azure AI Machine Learning Studio
  account_key: "eyJ...."  # from `az account get-access-token` on a compute instance in Azure AI Machine Learning Studio
I am getting the following error:
File "/anaconda/envs/py311/lib/python3.11/site-packages/azure/storage/blob/_shared/authentication.py", line 152, in on_request
    self._add_authorization_header(request, string_to_sign)
  File "/anaconda/envs/py311/lib/python3.11/site-packages/azure/storage/blob/_shared/authentication.py", line 135, in _add_authorization_header
    raise _wrap_exception(ex, AzureSigningError) from ex
azure.storage.blob._shared.authentication.AzureSigningError: Invalid base64-encoded string: number of data characters (2049) cannot be 1 more than a multiple of 4
Am I doing the credentials wrong?
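For context, the last line of the traceback is Python's standard base64 decoder complaining, which suggests the `account_key` value is being base64-decoded before the request is signed. A minimal sketch that reproduces the same kind of failure with the standard library only; `fake_key` is a made-up stand-in, not a real key or token:

```python
import base64
import binascii

# Hypothetical stand-in: 2049 base64 alphabet characters, the count reported
# in the traceback. 2049 % 4 == 1, which can never be valid base64.
fake_key = "A" * 2049

try:
    base64.b64decode(fake_key)
    raised = False
except binascii.Error as exc:
    # The decoder rejects the input because of its length alone.
    raised = True
    print(exc)
```

If this reading is right, the value supplied as `account_key` is simply not a base64-encoded key, whatever else it may be.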
n
I don't have a quick answer, but I found a related thread: https://stackoverflow.com/questions/71129515/azure-datalake-python-error-invalid-base64-encoded-string-number-of-data-chara adlfs (the fsspec Azure extension) is the underlying library that consumes these credentials: https://github.com/fsspec/adlfs
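import pandas as pd

# ACCOUNT_NAME, ACCOUNT_KEY, CONTAINER and FOLDER are placeholders for your own values.
storage_options = {'account_name': ACCOUNT_NAME, 'account_key': ACCOUNT_KEY}

# f-string so the placeholders are substituted; pandas reads a single file here (no glob expansion).
df = pd.read_csv(f'abfs://{CONTAINER}/{FOLDER}/file.csv', storage_options=storage_options)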
I took this example from the adlfs README and modified it. Can you try it and see if it works? That would help diagnose whether the issue is coming from fsspec or Kedro.
t
Thanks a lot for your quick response Nok! I am using a provisioned AzureML service, and therefore I do not have authorization to even open the "Access keys" tab from the datastore. Any other way to get an account key?
n
@Thomas d'Hooghe
azure_credentials:
  account_name: "name_datastore"  # from the Data-container page in Azure AI Machine Learning Studio
  account_key: "eyJ...."  # from `az account get-access-token` on a compute instance in Azure AI Machine Learning Studio
You showed this snippet above, so I thought you already had the account name and key?
t
@Nok Lam Chan yes, I assumed that the access token could be used as account_key, but I am not fully sure, which is why I raised it. Can you tell me what should be used as account_key?
n
Sorry, I don't think I can help much with anything Azure-specific, as I don't have access to an instance and it always depends on your own policy. Maybe someone on your team who set this up will have a better idea. If you have the Access keys tab, maybe you can see the connection string there? It seems to match the docs linked here: https://learn.microsoft.com/en-us/azure/storage/common/storage-account-keys-manage?tabs=azure-portal
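One way to sanity-check a value before putting it in the credentials file is to see whether it looks like a bearer token or like a base64 key. A rough sketch, assuming a token has three dot-separated parts starting with `eyJ` (as in the snippet above) and a storage key is plain base64; `classify_credential` and both sample values are hypothetical and only illustrate the idea:

```python
import base64
import binascii


def classify_credential(value: str) -> str:
    """Roughly classify a string as a JWT-like token, a base64 key, or unknown."""
    # Assumption: JWT-style tokens are three dot-separated segments, and the
    # first segment starts with "eyJ" (base64 of a JSON object opening).
    if len(value.split(".")) == 3 and value.startswith("eyJ"):
        return "jwt-like token"
    try:
        # validate=True rejects any character outside the base64 alphabet.
        base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return "unknown"
    return "base64 key"


# Made-up sample values, not real credentials.
sample_key = base64.b64encode(bytes(64)).decode("ascii")
sample_token = "eyJhbGciOiJub25lIn0.eyJzdWIiOiJ4In0.sig"

print(classify_credential(sample_key))
print(classify_credential(sample_token))
```

This only checks the shape of the string. It cannot tell whether a key is valid for a given storage account.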