Jonghyun Yun
08/07/2024, 7:22 PM
Deepyaman Datta
08/07/2024, 8:29 PM
my_data = pd.DataFrame(...)
reloaded = my_data
Of course, this is "correct".
Using an intermediate file is like:
my_data = pd.DataFrame(...)
my_data.to_csv("path/to/file.csv", **save_args)
reloaded = pd.read_csv("path/to/file.csv", **load_args)
Kedro tries to set reasonable defaults for things like whether to read/write the index column, but there can still be inconsistencies. For example, null values may be hard to distinguish from empty strings unless you configure how they are saved and loaded.
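To make that concrete, here is a rough sketch in plain pandas (not Kedro itself). The column names, the in-memory buffer standing in for a file, and the "NULL" marker are all made up for illustration:

```python
import io

import pandas as pd

# Hypothetical data: one empty string and one real null in the same column.
my_data = pd.DataFrame({"name": ["a", "", None], "value": [1, 2, 3]})


def round_trip(df, save_args=None, load_args=None):
    """Write df to CSV text and read it back (a buffer stands in for a file)."""
    buffer = io.StringIO()
    df.to_csv(buffer, **(save_args or {}))
    buffer.seek(0)
    return pd.read_csv(buffer, **(load_args or {}))


# 1. Pure pandas defaults: the index is written out and comes back as a column.
with_index = round_trip(my_data)
print(list(with_index.columns))

# 2. Dropping the index on save: "" and None both become an empty field,
#    so both are read back as missing values.
reloaded = round_trip(my_data, save_args={"index": False})
print(reloaded["name"].isna().tolist())
print(my_data.equals(reloaded))

# 3. One possible configuration: write nulls as an explicit marker and
#    treat only that marker as missing on load.
configured = round_trip(
    my_data,
    save_args={"index": False, "na_rep": "NULL"},
    load_args={"keep_default_na": False, "na_values": ["NULL"]},
)
print(configured["name"].isna().tolist())
```

In a Kedro catalog entry, the same keyword arguments would go under save_args and load_args for the dataset. The marker only works if "NULL" never appears as a real value in the data.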
Formats like Parquet definitely help in this regard, but still may not be perfect.
Jonghyun Yun
08/07/2024, 8:37 PM
Yury Fedotov
08/08/2024, 4:01 AM
pickle and parquet
Nok Lam Chan
08/09/2024, 9:51 PM
Nok Lam Chan
08/09/2024, 9:53 PM