# questions
Raghav Singh
Hi all, I had a question about the following update for Polars datasets (https://github.com/kedro-org/kedro-plugins/issues/625).
• Do we know when this implementation will happen?
• In the meantime, how would you recommend solving this issue?
  ◦ I am trying to read parquet files stored on S3 that were written by Spark, so I need to use glob matching for it to work. Should we create a custom dataset?
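For context, the capability being asked for here is Polars' own glob and cloud support: polars.scan_parquet can take a glob pattern and an s3:// URI directly. A minimal sketch, assuming a recent Polars version and placeholder bucket, prefix and region names:

```python
import polars as pl

# Spark writes a directory of part files; a glob pattern picks them all up.
# Bucket, prefix and region below are placeholders; credentials can also be
# picked up from the environment instead of storage_options.
lf = pl.scan_parquet(
    "s3://my-bucket/exports/my_table/*.parquet",
    storage_options={"aws_region": "eu-west-1"},
)

df = lf.collect()  # materialise only when the data is actually needed
```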
r
Hi Raghav, thank you for your patience. This seems to be a known, reported issue. I will move it to the wizard inbox and we will tackle it in upcoming sprints. Related issues: https://github.com/kedro-org/kedro-plugins/issues/590 and https://github.com/kedro-org/kedro-plugins/issues/625. fyi @Dmitry Sorokin, Raghav's team wants to use Polars for their use case. Thank you
Matthias Roels
I already have such an implementation as a custom dataset. If you want, I can contribute it somewhere in November…
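While waiting for that contribution, a rough sketch of what such a custom dataset could look like (this is not Matthias's implementation, just an illustration assuming Kedro's AbstractDataset interface; the class name and arguments are made up):

```python
from __future__ import annotations

from typing import Any

import polars as pl
from kedro.io import AbstractDataset


class GlobParquetPolarsDataset(AbstractDataset[pl.LazyFrame, pl.LazyFrame]):
    """Illustrative read-only dataset that passes a glob path straight to Polars."""

    def __init__(self, filepath: str, load_args: dict[str, Any] | None = None):
        self._filepath = filepath  # e.g. "s3://bucket/prefix/*.parquet"
        self._load_args = load_args or {}

    def _load(self) -> pl.LazyFrame:
        # Hand the glob to Polars instead of resolving a single file via fsspec.
        return pl.scan_parquet(self._filepath, **self._load_args)

    def _save(self, data: pl.LazyFrame) -> None:
        raise NotImplementedError("Read-only example dataset")

    def _describe(self) -> dict[str, Any]:
        return {"filepath": self._filepath, "load_args": self._load_args}
```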
As far as answering your question goes, you should already be able to do something similar with the current implementation of the lazy dataset.
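For reference, a sketch of how the existing LazyPolarsDataset from kedro-datasets might be used directly; the exact constructor arguments can differ between versions, and whether a glob filepath is passed through to polars.scan_parquet is exactly what the linked issues discuss:

```python
from kedro_datasets.polars import LazyPolarsDataset

# The filepath and file_format arguments below are assumptions; check the
# signature in your kedro-datasets version.
dataset = LazyPolarsDataset(
    filepath="s3://my-bucket/exports/my_table/*.parquet",
    file_format="parquet",
)

lf = dataset.load()  # expected to return a polars.LazyFrame
print(lf.head(5).collect())
```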
Raghav Singh
@Matthias Roels Can you share your implementation so I can use it as a custom dataset in the meanwhile? I was thinking about using the Lazy dataset, but I would need to rewrite the pivot step in our current ETL pipeline since Lazy doesn't support pivots.
Found a solution to the pivot step, so I can switch to Lazy for now.
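For anyone hitting the same limitation: pivot is only defined on an eager DataFrame, so a common workaround (not necessarily the one used here) is either to stay lazy and collect just before the pivot, or to express the pivot as a group_by with conditional aggregations when the output columns are known in advance. A sketch, assuming Polars >= 1.0 and made-up column names:

```python
import polars as pl

lf = pl.scan_parquet("s3://my-bucket/exports/sales/*.parquet")  # placeholder path

# Option 1: stay lazy as long as possible and collect only for the pivot step.
wide = (
    lf.filter(pl.col("amount") > 0)
    .collect()  # pivot is only available on an eager DataFrame
    .pivot(on="category", index="customer_id", values="amount", aggregate_function="sum")
)

# Option 2: if the categories are known up front, keep everything lazy by
# rewriting the pivot as one conditional aggregation per output column.
wide_lazy = lf.group_by("customer_id").agg(
    [
        pl.col("amount").filter(pl.col("category") == cat).sum().alias(cat)
        for cat in ["books", "games"]  # made-up category values
    ]
)
```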
Dmitry Sorokin
Hey @Raghav Singh, thanks for flagging this - we'll aim to prioritise a new spike on the topic in the next backlog grooming session: https://github.com/kedro-org/kedro-plugins/issues/625#issuecomment-3467317193