Massinissa Saïdi
03/31/2023, 12:29 PM
DataSetError: <class 'sklearn.pipeline.Pipeline'> was not serialised due to: Can't pickle local object 'fit_best_model.<locals>.<lambda>'
I just return a partitioned pickle dataset like this: return {'model_' + parameters['model']: pipeline}
and I define the dataset in catalog.yml like this:
models_partionned:
  type: PartitionedDataSet
  path: data/06_models/${date}/${target}/
  filename_suffix: ".pkl"
  dataset:
    type: pickle.PickleDataSet
marrrcin
03/31/2023, 12:34 PM
Massinissa Saïdi
03/31/2023, 12:34 PM
Deepyaman Datta
03/31/2023, 12:47 PM
fit_best_model?
Massinissa Saïdi
03/31/2023, 12:51 PM
Deepyaman Datta
03/31/2023, 12:52 PM
lambda in there?
Massinissa Saïdi
03/31/2023, 12:53 PM
TfidfVectorizer(tokenizer=lambda x: x.split(' '),...
Deepyaman Datta
03/31/2023, 12:56 PM
from operator import methodcaller
TfidfVectorizer(tokenizer=methodcaller('split', ' '),...
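To illustrate why this swap avoids the error, here is a minimal sketch using only the standard library. The helper names (make_lambda_tokenizer, make_methodcaller_tokenizer, can_pickle) are hypothetical and not from the thread; the first helper only mimics a lambda created inside a function such as fit_best_model.

```python
import pickle
from operator import methodcaller


def make_lambda_tokenizer():
    # Hypothetical stand-in for a lambda created inside fit_best_model:
    # it is a local object, so pickle cannot look it up by name.
    return lambda x: x.split(' ')


def make_methodcaller_tokenizer():
    # methodcaller('split', ' ')(s) is equivalent to s.split(' ').
    return methodcaller('split', ' ')


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False if it raises."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


lambda_ok = can_pickle(make_lambda_tokenizer())
methodcaller_ok = can_pickle(make_methodcaller_tokenizer())

# Round-trip the methodcaller through pickle and use the restored copy.
restored = pickle.loads(pickle.dumps(make_methodcaller_tokenizer()))
tokens = restored('a b c')

print(lambda_ok, methodcaller_ok, tokens)
```

The same reasoning should carry over to a Pipeline that holds the tokenizer, since pickling the pipeline has to pickle every object it references.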
Massinissa Saïdi
03/31/2023, 12:56 PM
Deepyaman Datta
03/31/2023, 12:58 PM
' ' argument to split? By default, split already separates on any run of whitespace, unless you really need it to split on a single space.
2. models_partionned is spelled wrong (if it's English) 😉
Massinissa Saïdi
03/31/2023, 1:00 PM
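On the question of the ' ' argument to split raised above, a small sketch of the difference between the two forms. The sample string is made up for illustration.

```python
from operator import methodcaller

# Made-up sample with a double space and a tab.
text = "two  spaces and\ta tab"

# Splits on every single space: keeps an empty token, leaves the tab alone.
on_single_space = methodcaller('split', ' ')(text)

# No separator: splits on any run of whitespace.
on_whitespace = methodcaller('split')(text)

print(on_single_space)
print(on_whitespace)
```

If the whitespace behaviour is what is wanted, methodcaller('split') with no extra argument stays picklable in the same way.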