#questions

Camilo Piñón

02/06/2024, 8:01 AM
Hey team! Is there any way to store a `LazyPolarsDataset` (in Parquet format) using the polars `sink_parquet` method, maybe via `save_args` in the Catalog? I'm currently checking the docs and the implementation of the Kedro dataset, but it is not very clear to me 🙁.
• Docs say: `save_args` (`Optional[dict[str, Any]]`) – Polars options for saving files. Here you can find all available arguments: https://pola-rs.github.io/polars/py-polars/html/reference/io.html All defaults are preserved.
But when I go to the reference, I don't see a clear mapping between it and the `save_args` parameter. Thank you!

Juan Luis

02/06/2024, 8:23 AM
good question @Camilo Piñón. there's an issue that's tangentially related: https://github.com/kedro-org/kedro-plugins/issues/519 that initially was about adding `streaming=True` for `collect`, but I'm thinking that maybe we should use `sink` instead.
I'm leaving a comment there.

datajoely

02/06/2024, 8:24 AM
So, much like the polars library itself, our dataset is evolving rapidly; any help designing it to be as useful as possible is massively appreciated. In the short term, subclassing and extending our implementation will unblock you.
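For anyone landing on this thread, here is a minimal sketch of the save step such a subclass might delegate to. It is not the Kedro implementation: `sink_lazy_parquet` is a hypothetical helper name, and it assumes a local `filepath` and that the entries in `save_args` are valid keyword arguments for polars' `LazyFrame.sink_parquet`.

```python
from pathlib import Path
from typing import TYPE_CHECKING, Any, Dict, Optional

if TYPE_CHECKING:  # polars is only needed for the type hint, not at import time
    import polars as pl


def sink_lazy_parquet(
    lazy_frame: "pl.LazyFrame",
    filepath: str,
    save_args: Optional[Dict[str, Any]] = None,
) -> Path:
    """Hypothetical helper: stream a LazyFrame to a local Parquet file.

    Calls `sink_parquet` on the lazy frame instead of collecting it and
    writing the resulting DataFrame.
    """
    target = Path(filepath)
    # Assumption: the target is on the local filesystem, so the parent
    # directory can be created with pathlib.
    target.parent.mkdir(parents=True, exist_ok=True)
    # Forward catalog-style save_args unchanged as keyword arguments.
    lazy_frame.sink_parquet(target, **(save_args or {}))
    return target
```

With polars installed, something like `sink_lazy_parquet(pl.scan_csv("input.csv"), "data/out.parquet", {"compression": "zstd"})` should write the file without first collecting the whole frame. In a subclass of `LazyPolarsDataset`, the overridden save method could call a helper like this with the dataset's own file path and save arguments, but the exact method and attribute names to override depend on the installed kedro-datasets version, so check its source first.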