Marc Gris
11/06/2023, 9:27 AM
DatasetError: <class 'pandera.api.pandas.container.DataFrameSchema'> was not serialised due to: Can't pickle <function custom_check_is_valid_barcode at 0x11a461310>:
it's not the same object as data_validation.pipelines.validate.nodes.custom_check_is_valid_barcode
Thx in advance,
Regards
M
Ben Horsburgh
11/06/2023, 9:33 AM
DataFrameSchema class might contain a lambda function? Or similar. If so, such functions are not pickle-able.
DataFrameSchema - if you see any lambda functions then refactor them into concrete functions and see if it helps.
Marc Gris
11/06/2023, 9:34 AM
@pa.extensions.register_check_method()
def custom_check_is_valid_barcode(barcodes: "pd.Series[str]",
                                  check: bool) -> "pd.Series[bool]":
    def check_barcode_validity(row):
        return barcodenumber.check_code(row.BARCODE_TYPE, row.BARCODE)

    if not isinstance(check, bool):
        raise ValueError(f"`check` should be `bool` not {type(check)}")
    barcodes = barcodes.to_frame()
    barcodes.columns = ['BARCODE']
    barcodes['BARCODE_TYPE'] = barcodes.fillna('').map(identify_barcode)
    if check:
        barcodes['IS_VALID'] = barcodes.apply(check_barcode_validity, axis=1)
    else:
        barcodes['IS_VALID'] = True
    return barcodes['IS_VALID']
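
For context on the error message at the top of the thread, here is a minimal standard-library sketch. pickle stores a function by its module and qualified name, so it fails for a lambda, and it also fails when the module-level name has been rebound to a different object, which is the situation that produces the "it's not the same object as" wording. The `register` decorator, `REGISTRY`, and `pickle_error` below are hypothetical stand-ins for illustration, not pandera's implementation; whether `register_check_method` rebinds the name in this way is an assumption.

```python
import pickle


def pickle_error(obj):
    """Return None if obj pickles with the standard pickler, else the error text."""
    try:
        pickle.dumps(obj)
        return None
    except (pickle.PicklingError, AttributeError) as exc:
        return str(exc)


# Case 1: a lambda's qualified name is "<lambda>", which pickle cannot look up.
is_positive = lambda x: x > 0

# Case 2: a decorator keeps the original function somewhere else and
# returns a different object under the same module-level name.
REGISTRY = {}


def register(func):  # hypothetical decorator, NOT pandera's implementation
    REGISTRY[func.__name__] = func

    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)

    return wrapper


@register
def custom_check(value):
    return bool(value)


# The registry still holds the original function, but the module-level
# name `custom_check` now points at `wrapper`, so the lookup mismatches.
original = REGISTRY["custom_check"]

print("builtin len :", pickle_error(len))
print("lambda      :", pickle_error(is_positive))
print("rebound name:", pickle_error(original))
```

If the third case matches what happens here, replacing a lambda alone would not be enough, because the object being pickled is the decorated function itself.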
Ben Horsburgh
11/06/2023, 9:46 AM
check_barcode_validity outside of the decorated function, so that it is a module-level function
Marc Gris
11/06/2023, 9:47 AM
Ben Horsburgh
11/06/2023, 9:49 AM
Marc Gris
11/06/2023, 9:51 AM
kedro-pandera
I thought that you might have some insight into the above.
Many thanks in advance,
Regards
Marc
Yolan Honoré-Rougé
11/06/2023, 12:47 PM
Marc Gris
11/06/2023, 12:50 PM
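
The 9:46 AM suggestion, having check_barcode_validity outside of the decorated function so that it is a module-level function, could look roughly like the sketch below. It uses only the standard library: `check_code` is a hypothetical stand-in for `barcodenumber.check_code` with a toy rule, and `make_nested_helper` is a made-up name that exists only to contrast the nested form with the hoisted one.

```python
from types import SimpleNamespace


def check_code(barcode_type, barcode):
    """Hypothetical stand-in for barcodenumber.check_code (toy rule only)."""
    return barcode_type == "ean13" and len(barcode) == 13 and barcode.isdigit()


def check_barcode_validity(row):
    """Module-level helper: pickle can reference it by module and name."""
    return check_code(row.BARCODE_TYPE, row.BARCODE)


def make_nested_helper():
    """Builds the same helper as a nested function, as in the pasted snippet."""
    def check_barcode_validity(row):
        return check_code(row.BARCODE_TYPE, row.BARCODE)
    return check_barcode_validity


nested = make_nested_helper()
row = SimpleNamespace(BARCODE_TYPE="ean13", BARCODE="4006381333931")

# Both helpers compute the same result; only their qualified names differ.
print(check_barcode_validity(row), nested(row))
print(check_barcode_validity.__qualname__)
print(nested.__qualname__)  # contains "<locals>", which pickle cannot resolve
```

Whether this change alone clears the DatasetError is not confirmed in the thread, since the error names the decorated function custom_check_is_valid_barcode rather than the nested helper.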