Hello team. I have created a custom Class that inh...
# questions
v
Hello team. I have created a custom Class that inherits from sklearn BaseEstimator, TransformerMixin. So
class CustomClass(BaseEstimator, TransformerMixin):
I have created an object of that class and saved it to my Kedro catalog as a pickle object. Now the problem is when I try using
catalog.load()
on a pipeline to load that object I get the following error:
DataSetError: Failed while loading data from data set PickleDataSet(backend=pickle,
filepath=……./data/06_models/custom_model_V1.pkl,
load_args={}, protocol=file, save_args={}).
Can't get attribute 'CustomClass' on <module '__main__' from '……venv/bin/kedro'>
I was able to make it work in a notebook by first importing the class from the py file where it was defined:
from custom_classes import CustomClass
But when running a Kedro pipeline that uses this object as an input loaded from the catalog, adding the import at the top of the pipeline file did not fix it. Any suggestions on how to fix this?
d
hello @Valentin Martinez Gama so I’m not exactly sure what’s going on - but I’ve got a couple of hunches:
• There are multiple pickle backends - joblib, dill and cloudpickle sometimes do better at handling more exotic objects
• Given this is a filepath issue I wonder if you want to tweak the execution directory in pyproject.toml
• Last question - are you missing any
__init__.py
files?
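One quick way to test these hunches: the standard pickle backend stores a class as a module-and-name reference rather than as code, so it helps to check which module the class claims to live in before the object is saved. A minimal sketch, where `class_reference` is a hypothetical helper and `OrderedDict` is only a stand-in for the real model object:

```python
import pickle
from collections import OrderedDict


def class_reference(obj):
    """Return the (module, qualified name) pair that pickle records for obj's class."""
    cls = type(obj)
    return cls.__module__, cls.__qualname__


# Stand-in object; with the real model you would pass the CustomClass instance.
example = OrderedDict(a=1)
module_name, class_name = class_reference(example)

# The payload carries the module and class names, not the class source code.
payload = pickle.dumps(example)
assert module_name.encode() in payload
assert class_name.encode() in payload
```

If the module comes back as `__main__` for the saved model, the pickle can only be loaded in a process whose `__main__` exposes `CustomClass`, which would match the error above. Re-saving the object from a session where the class was brought in with `from custom_classes import CustomClass` should record `custom_classes` instead.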
b
if what @datajoely recommended doesn't work, you could always create a custom dataset for your specific object and import the class there, just as you do in the notebook
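A rough sketch of what the load step of such a custom dataset could do, using only the standard library so it assumes nothing about the Kedro dataset API. `RemappingUnpickler`, `load_with_overrides` and the `fake_main` module are hypothetical names for illustration; `fake_main` just simulates a module that no longer exposes the class:

```python
import io
import pickle
import sys
import types


class RemappingUnpickler(pickle.Unpickler):
    """Unpickler that resolves selected (module, name) references to given classes."""

    def __init__(self, file, overrides):
        super().__init__(file)
        self._overrides = overrides

    def find_class(self, module, name):
        # Use the explicitly supplied class when the pickle asks for a known reference.
        if (module, name) in self._overrides:
            return self._overrides[(module, name)]
        return super().find_class(module, name)


def load_with_overrides(raw_bytes, overrides):
    """Unpickle raw_bytes, redirecting the references listed in overrides."""
    return RemappingUnpickler(io.BytesIO(raw_bytes), overrides).load()


class CustomClass:
    """Stand-in for the real transformer; only holds one attribute."""

    def __init__(self, scale):
        self.scale = scale


# Simulate the situation: pickle while the class is reachable via "fake_main" ...
fake_main = types.ModuleType("fake_main")
fake_main.CustomClass = CustomClass
CustomClass.__module__ = "fake_main"
sys.modules["fake_main"] = fake_main
payload = pickle.dumps(CustomClass(scale=2.0))

# ... then load where that module no longer exposes the class.
del fake_main.CustomClass
try:
    pickle.loads(payload)
    plain_load_failed = False
except AttributeError:
    plain_load_failed = True

restored = load_with_overrides(payload, {("fake_main", "CustomClass"): CustomClass})
```

For the pickle in this thread the override key would presumably be `("__main__", "CustomClass")`, with the class imported from `custom_classes` inside the dataset module. That is an assumption based on the error message, not something tested against the actual file.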