Yetunde
03/06/2023, 12:28 PM
Pasteur is a library for performing privacy-aware end-to-end data synthesis. Gather your raw data and preprocess, synthesize, and evaluate it within a single project. Use the tools you're familiar with: numpy, pandas, scikit-learn, scipy or any other. When your dataset grows, scale to out-of-core data by using Pasteur's parallelization and partitioning primitives, without code changes or using different libraries.
This use of Kedro is quite creative: https://github.com/pasteur-dev/pasteur
Juan Luis
03/06/2023, 1:23 PM
pasteur new --starter=pasteur
🙈
Yetunde
03/06/2023, 1:27 PM
datajoely
03/06/2023, 1:29 PM