# questions
Emilie:
Hello everyone, I'm trying to construct a node in Kedro that takes a GBQTableDataset as input, which is rather large (a few million rows and 150 columns), loads it as a DataFrame, and executes some pandas/sklearn operations on it. My problem is that the BigQuery table is too large and loading fails. What would you suggest I use? I was thinking of creating a custom dataset that reuses the GBQTableDataset code but adapts the loading part, but I'm not exactly sure how. Thanks in advance for your guidance on this topic.
d:
Hi Emilie, have you tried https://pola.rs/?
👍 1
d:
Yeah, I would try chunking, Polars, or Spark if you're hitting a wall.
👍 3
Emilie:
Hello, thank you for your replies. I will look into those options.