# questions

Emilie Gourmelen

12/22/2023, 8:25 AM
Hello everyone, I'm trying to construct a node in Kedro which takes a GBQTableDataset as input (rather large: a few million rows and 150 columns), loads it as a DataFrame, and runs some pandas/sklearn operations on it. My problem is that the BigQuery table is too large and the loading fails. What would you suggest? I was thinking of creating a custom dataset that re-uses the GBQTableDataset code, adapting the loading part, but I'm not exactly sure how. Thanks in advance for your guidance on this topic.

Dmitry Sorokin

12/22/2023, 9:51 AM
Hi Emilie, have you tried
👍 1


12/22/2023, 10:39 AM
Yeah, I would try chunking, Polars, or Spark if you're hitting a wall.
👍 3
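To illustrate the chunking suggestion above: the idea is to never hold the whole table in memory, but to process it piece by piece and combine the partial results. This is a minimal sketch with a synthetic DataFrame standing in for the BigQuery table (in practice the chunks would come from the BigQuery client, e.g. via `to_dataframe_iterable()`, inside a custom dataset's `_load`); `load_in_chunks` is a hypothetical helper name, not part of Kedro.

```python
import numpy as np
import pandas as pd


def load_in_chunks(df: pd.DataFrame, chunk_size: int):
    """Yield a large table chunk by chunk instead of loading it at once.

    A real implementation would stream pages from BigQuery; a synthetic
    in-memory DataFrame stands in here so the sketch is self-contained.
    """
    for start in range(0, len(df), chunk_size):
        yield df.iloc[start:start + chunk_size]


# Synthetic stand-in for the large BigQuery table.
table = pd.DataFrame({"value": np.arange(10_000, dtype=float)})

# Aggregate incrementally so only one chunk is in memory at a time.
total = 0.0
rows_seen = 0
for chunk in load_in_chunks(table, chunk_size=1_000):
    total += chunk["value"].sum()
    rows_seen += len(chunk)

mean_value = total / rows_seen
```

The same pattern works for any operation that can be expressed as a fold over chunks (sums, counts, `partial_fit` on sklearn estimators that support it); operations that need the full table at once would still require Polars/Spark or more memory.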

Emilie Gourmelen

12/22/2023, 4:05 PM
Hello, Thank you for your replies. I will look at those options.