#questions

Akash Agnihotri

10/23/2023, 10:42 AM
Hey team, I have a query regarding custom datasets. We are implementing a custom dataset using `AbstractDataSet` to save and load a torch object. When we use this dataset as part of a Kedro pipeline, the node does not create the local folder when saving the data. Do we need to implement this functionality ourselves in our custom dataset, or is it something provided by Kedro or `AbstractDataSet`?

Deepyaman Datta

10/23/2023, 11:05 AM
You would need to do it yourself. If you look at many (most?) dataset implementations (e.g. `pandas.CSVDataset`, `tensorflow.TensorFlowModelDataset`), you'll see the use of `fsspec` to do this. What does your `_save` method look like?

Akash Agnihotri

10/23/2023, 5:29 PM
Thanks @Deepyaman Datta, I am trying to save torch data:
def _save(self, data: pl.LightningDataModule) -> None:
    """Save data module to filepath."""
    with open(self.filepath, "wb+") as file:
        torch.save(data, file)
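For context, the built-in `open` does not create missing parent directories, so the method above raises `FileNotFoundError` when the target folder does not exist yet. A minimal standard-library sketch of creating the folder first is shown here. The function name `save_creating_parents` and its `dump` parameter are hypothetical, and `pickle.dump` is only a stand-in so the snippet runs without torch; `torch.save` takes the same `(obj, file)` arguments and could be passed instead.

```python
import pickle
from pathlib import Path


def save_creating_parents(filepath, data, dump=pickle.dump):
    """Write `data` to `filepath`, creating missing parent folders first.

    `dump` is any callable taking (obj, file); pickle.dump is a stand-in
    for torch.save here (hypothetical helper, not part of Kedro).
    """
    path = Path(filepath)
    # Create every missing parent directory; do nothing if they already exist.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as file:
        dump(data, file)
```

Inside a custom dataset, the same two steps (create the parent folder, then write) would go in `_save`.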

Deepyaman Datta

10/23/2023, 6:12 PM
I would look at the other implementations and replicate the use of `fsspec`. As mentioned, directories aren't created magically by `AbstractDataset`, but IIRC `fsspec` implementations will handle the creation (which is part of why we use it). It will also enable use on other filesystems, for example if you want to save to S3.
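For reference, a minimal sketch of the `fsspec` approach suggested above. It rests on one assumption worth checking against the fsspec docs: the local filesystem only creates parent directories when it is constructed with `auto_mkdir=True`, which the built-in Kedro datasets appear to set for the `file` protocol. The names `save_with_fsspec` and `dump` are hypothetical, and `pickle.dump` stands in for `torch.save` so the snippet runs without torch.

```python
import pickle

import fsspec
from fsspec.core import split_protocol


def save_with_fsspec(filepath, data, dump=pickle.dump):
    """Write `data` to `filepath` through fsspec (hypothetical helper).

    `dump` is any callable taking (obj, file); torch.save fits that shape.
    """
    # split_protocol returns (None, path) for plain local paths.
    protocol, _ = split_protocol(filepath)
    protocol = protocol or "file"
    # Assumption: only the local filesystem needs auto_mkdir to create folders.
    fs_args = {"auto_mkdir": True} if protocol == "file" else {}
    fs = fsspec.filesystem(protocol, **fs_args)
    with fs.open(filepath, "wb") as file:
        dump(data, file)
```

In a custom dataset, the filesystem would typically be built once in `__init__` and reused in `_save` and `_load`. Remote protocols such as `s3` also need the matching backend package installed and credentials configured.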