Hey Team, I have query regarding Custom Datasets, ...
# questions
a
Hey Team, I have a query regarding custom datasets. We are implementing a custom dataset using `AbstractDataSet` to save and load torch objects. When we use this dataset as part of a Kedro pipeline, the node does not create the local folder when saving the data. Do we need to implement this functionality ourselves in our custom dataset, or is it something provided by Kedro or `AbstractDataSet`?
d
You would need to do it yourself. If you look at many (most?) dataset implementations (e.g. `pandas.CSVDataset`, `tensorflow.TensorFlowModelDataset`), you'll see the use of `fsspec` to do this. What does your `_save` method look like?
a
Thanks @Deepyaman Datta, I am trying to save torch data.
def _save(self, data: pl.LightningDataModule) -> None:
    """Save data module to filepath."""
    with open(self.filepath, "wb+") as file:
        torch.save(data, file)
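For comparison, here is a minimal standard-library sketch of the "do it yourself" route: create the parent folder before opening the file. The function name `save_creating_parents` and the `writer` parameter are hypothetical, and `pickle.dump` stands in for `torch.save` so the snippet runs without torch installed. Both take the object first and the open file second, so `torch.save` could presumably be passed as `writer`.

```python
import os
import pickle
import tempfile
from pathlib import Path


def save_creating_parents(data, filepath, writer=pickle.dump):
    """Write `data` to `filepath`, creating missing parent folders first.

    `writer` is any callable taking (object, binary file handle).
    """
    path = Path(filepath)
    # parents=True builds every missing level; exist_ok=True makes reruns safe.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as file:
        writer(data, file)


# Usage: the nested folders below do not exist before the call.
local_tmp = tempfile.mkdtemp()
local_target = os.path.join(local_tmp, "data", "06_models", "module.pkl")
save_creating_parents({"batch_size": 32}, local_target)

with open(local_target, "rb") as file:
    local_restored = pickle.load(file)
```

This only covers local paths; it does nothing for remote storage.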
d
I would look at the other implementations and replicate their use of `fsspec`. As mentioned, directories aren't created magically by `AbstractDataset`, but IIRC `fsspec` implementations will handle the creation (which is part of why we use it). It will also let you use other filesystems, for example if you want to save to S3.
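A rough sketch of what the `fsspec` route could look like for a local path. This is an illustration under assumptions, not Kedro's actual implementation: the function name `save_with_fsspec` and the `writer` parameter are hypothetical, `pickle.dump` stands in for `torch.save` to keep the example free of torch, and the filesystem is hard-coded to the local `"file"` protocol. `auto_mkdir` is an option of fsspec's local filesystem that creates missing parent folders when a file is opened for writing.

```python
import os
import pickle
import tempfile

import fsspec


def save_with_fsspec(data, filepath, writer=pickle.dump):
    """Write `data` to a local `filepath` through fsspec.

    `writer` is any callable taking (object, binary file handle).
    """
    # auto_mkdir=True asks the local filesystem to create parent folders
    # on write. Other protocols take their own options (not shown here).
    fs = fsspec.filesystem("file", auto_mkdir=True)
    with fs.open(filepath, "wb") as file:
        writer(data, file)


# Usage: the nested folders below do not exist before the call.
tmp_dir = tempfile.mkdtemp()
target = os.path.join(tmp_dir, "data", "06_models", "module.pkl")
save_with_fsspec({"batch_size": 32}, target)

with open(target, "rb") as file:
    restored = pickle.load(file)
```

In a custom dataset, the body of `_save` would follow the same pattern, with the filesystem object typically built once in `__init__` from the protocol of the configured filepath.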