#questions

Sid Shetty

07/26/2023, 5:24 PM
Hello team, when I split a pandas dataframe and store it using a partitioned dataset, loading the partitions back together appears to find schema differences, since a few columns have nulls. Is there any workaround that avoids having to add another node to put these partitions together, and ideally lets me just read them as a pandas.ParquetDataSet? Perhaps passing the schema of the original dataframe, or even specifying it explicitly?

Juan Luis

07/26/2023, 5:29 PM
@Sid Shetty you can add load_args to your dataset to control how pd.read_parquet is called; these get passed through directly

Sid Shetty

07/26/2023, 5:41 PM
Ahhh thank you, use_pandas_metadata seems like just what I was looking for!
🥳 1
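
For anyone landing on this thread later, here is a minimal sketch of what the suggestion could look like as a catalog entry. It is written as a Python dict with the same shape as a catalog.yml entry (a dict like this can be handed to DataCatalog.from_config). The dataset name combined_partitions and the filepath are hypothetical, and whether use_pandas_metadata resolves the schema differences may depend on the Kedro, pandas and pyarrow versions in use, so treat it as a starting point rather than a confirmed fix.

```python
# Hypothetical catalog entry as a Python dict, mirroring a catalog.yml entry.
# The entry name and filepath are assumptions made up for illustration.
catalog_entry = {
    "combined_partitions": {
        "type": "pandas.ParquetDataSet",
        # Assumed folder that holds the partition files written earlier.
        "filepath": "data/02_intermediate/partitions",
        "load_args": {
            # Option mentioned in the thread; load_args are forwarded
            # to the parquet reader when the dataset is loaded.
            "use_pandas_metadata": True,
        },
    }
}

# Inspect the keyword arguments that would be forwarded on load.
load_args = catalog_entry["combined_partitions"]["load_args"]
print(load_args)
```

Any other keyword the underlying reader accepts (for example columns) could go in the same load_args mapping.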