#questions

Rachid Cherqaoui

07/17/2023, 9:04 PM
Hi, I'm writing unit tests for my project and I'm using the PartitionedDataSet class from kedro.io to load data, but I've just noticed that it doesn't take the delimiter into account. How can I solve this? I'm working with CSV files on my local machine; here is the code I used:
data_set = PartitionedDataSet(
    path="data/01_raw/Tableaux",
    dataset=CSVDataSet,
    filename_suffix=".csv",
    load_args={"delimiter": ";", "header": 0, "encoding": "utf-8"},
)

Deepyaman Datta

07/17/2023, 9:17 PM
Your load_args should be on the underlying dataset, not on the PartitionedDataset. See https://github.com/kedro-org/kedro/blob/main/kedro/io/partitioned_dataset.py#L166-L167. If you're doing it in Python and not YAML, pass a dict as the dataset argument.

Rachid Cherqaoui

07/17/2023, 9:20 PM
Thanks for your response, but in PartitionedDataSet the dataset argument must be a string, not a dictionary.

Deepyaman Datta

07/17/2023, 9:30 PM
>>> pds = PartitionedDataset(path="kedro/", dataset={"type": "pandas.CSVDataSet", "load_args": {"delimiter": ";"}}, filename_suffix=".csv")
>>> pds.load()
{'blah': <bound method AbstractVersionedDataSet.load of <kedro_datasets.pandas.csv_dataset.CSVDataSet object at 0x149d97850>>, 'foo': <bound method AbstractVersionedDataSet.load of <kedro_datasets.pandas.csv_dataset.CSVDataSet object at 0x149d97940>>}
>>> pds.load()["blah"]()
   dog  eat  dog.1
0    1    2      3
1    4    5      6
>>> pds.load()["foo"]()
   cats  eat  mice
0     5    6     7
1     8    9     0
>>>
(kedro) deepyaman@Deepyamans-MacBook-Air kedro % cat kedro/*.csv
dog;eat;dog
1;2;3
4;5;6
cats;eat;mice
5;6;7
8;9;0
🥳 1
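The session above shows that pds.load() returns a dict mapping partition IDs to load callables rather than loaded DataFrames, so each partition is only read when its callable is invoked. A minimal sketch of that pattern, using only the standard library: make_loader and load_all are hypothetical helpers (not part of Kedro), and the stand-in loaders parse the same semicolon-separated text as the two CSV files shown above.

```python
import csv
import io
from typing import Callable, Dict, List

Rows = List[List[str]]


def make_loader(text: str, delimiter: str = ";") -> Callable[[], Rows]:
    """Return a zero-argument loader, mimicking one lazily loaded partition."""
    def load() -> Rows:
        # Parsing only happens when the loader is called.
        return list(csv.reader(io.StringIO(text), delimiter=delimiter))
    return load


def load_all(partitions: Dict[str, Callable[[], Rows]]) -> Dict[str, Rows]:
    """Call every loader and return the parsed rows keyed by partition ID."""
    return {pid: partitions[pid]() for pid in sorted(partitions)}


# Stand-in for the dict returned by pds.load(); text copied from the CSV files above.
partitions = {
    "foo": make_loader("cats;eat;mice\n5;6;7\n8;9;0\n"),
    "blah": make_loader("dog;eat;dog\n1;2;3\n4;5;6\n"),
}
loaded = load_all(partitions)
```

With real Kedro partitions the same loop shape should apply, assuming each value in the returned dict is a zero-argument callable as in the output above.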

Rachid Cherqaoui

07/17/2023, 9:30 PM
I've been able to correct the problem, thank you very much.
👍 1
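For reference, the fix discussed in this thread, applied to the original snippet, might look like the sketch below. csv_partition_config is a hypothetical helper, not a Kedro function; it only builds the dict that nests load_args under the underlying dataset. The commented-out usage assumes kedro and kedro-datasets are installed and that the class is importable as PartitionedDataSet, as in the question.

```python
from typing import Any, Dict, Optional


def csv_partition_config(
    delimiter: str = ";",
    extra_load_args: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build the `dataset` dict so load_args sit on the underlying CSV dataset."""
    load_args: Dict[str, Any] = {"delimiter": delimiter}
    load_args.update(extra_load_args or {})
    return {"type": "pandas.CSVDataSet", "load_args": load_args}


dataset_config = csv_partition_config(";", {"header": 0, "encoding": "utf-8"})

# Assumed usage (requires kedro and kedro-datasets):
# from kedro.io import PartitionedDataSet
# data_set = PartitionedDataSet(
#     path="data/01_raw/Tableaux",
#     dataset=dataset_config,  # dict, not the CSVDataSet class
#     filename_suffix=".csv",
# )
# partitions = data_set.load()  # partition ID -> load callable
```

The point of the dict form is that the delimiter travels with the underlying dataset definition, which is where the thread says load_args belong.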