Kedro #questions

I'm working through the tutorial and at the stage ...

G. D. McBain

10/23/2024, 2:52 AM

I'm working through the tutorial and, at the stage "Preprocessed data registration", found "Kedro saves these outputs in Parquet format" like `data/02_intermediate/preprocessed_companies.pq`; i.e., with a `.pq` suffix. I tried to open this with VS Code's Data Wrangler extension but it didn't recognize it. If I copied the file and changed it to `.parquet`, Data Wrangler was happy, so the file's O.K., it's just the suffix. What should be done here? Is `.pq` a recognized suffix that VS Code's Data Wrangler should recognize? Or should we be saving as `.parquet`? Obviously not a showstopper, but I thought I'd take the opportunity to say hello and #C03RKNSN3U0.
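For anyone who hits the same thing, the copy-and-rename workaround described above can be scripted. This is only a minimal sketch using the standard library; the function name `copy_with_parquet_suffix` is hypothetical and not part of Kedro or Data Wrangler, and it assumes the `.pq` files sit directly in one directory.

```python
import shutil
from pathlib import Path


def copy_with_parquet_suffix(directory: Path) -> list[Path]:
    """Copy every *.pq file in `directory` to a sibling *.parquet file.

    Hypothetical helper: the originals are left untouched, and an
    existing *.parquet file is never overwritten.
    """
    copies = []
    for source in sorted(directory.glob("*.pq")):
        target = source.with_suffix(".parquet")
        if not target.exists():
            shutil.copy2(source, target)  # copy contents and metadata
            copies.append(target)
    return copies
```

Calling `copy_with_parquet_suffix(Path("data/02_intermediate"))` from the project root would then produce copies that tools expecting the longer suffix can open. Copying rather than renaming keeps the paths that the tutorial's catalog refers to intact.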

William Caicedo

10/23/2024, 2:54 AM

Hi, `.parquet` is fine and what I guess most of us use

G. D. McBain

10/23/2024, 2:59 AM

Oh, yes, `.parquet` is fine, I know, but the tutorial is generating `.pq`, which isn't. Should the tutorial be revised to generate `.parquet` instead or should Data Wrangler be taught that `.pq` is O.K. too?
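On the point that the file itself is fine and only the suffix differs: the Parquet format marks files with the 4-byte magic `PAR1` at both the start and the end, so the contents can be checked without relying on the name. A rough sketch, assuming an unencrypted file; `looks_like_parquet` is a hypothetical helper and a quick heuristic, not a full validation.

```python
from pathlib import Path

PARQUET_MAGIC = b"PAR1"  # magic bytes at the start and end of a Parquet file


def looks_like_parquet(path: Path) -> bool:
    """Return True if the file begins and ends with the Parquet magic bytes.

    Heuristic only: it does not parse the footer or the data pages.
    """
    # Smallest plausible file: magic + 4-byte footer length + magic.
    if path.stat().st_size < 12:
        return False
    with path.open("rb") as handle:
        head = handle.read(4)
        handle.seek(-4, 2)  # 2 = seek relative to the end of the file
        tail = handle.read(4)
    return head == PARQUET_MAGIC and tail == PARQUET_MAGIC
```

The result depends only on the bytes, so the same file gives the same answer whether it is named `.pq` or `.parquet`.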

Juan Luis

10/23/2024, 6:00 AM

hmmm https://code.visualstudio.com/docs/datascience/data-wrangler#_launch-data-wrangler-directly-from-a-file

Juan Luis

10/23/2024, 6:04 AM

I did a quick 3 minute exploration online and every single reference I see uses `.parquet` rather than `.pq` indeed

Juan Luis

10/23/2024, 6:05 AM

@G. D. McBain do you want to open an issue on https://github.com/kedro-org/kedro/issues/ stating this?

Juan Luis

10/23/2024, 6:05 AM

and thanks for letting us know btw!

G. D. McBain

10/23/2024, 10:30 PM

Sure, that 3-minute finding matches mine too. Before that I had started drafting an issue for Data Wrangler, but I'm thinking it's better to change here rather than there. Ta.

G. D. McBain

10/24/2024, 1:06 AM

#4253 .parquet not .pq

By the way though, I didn't have to take many more steps through the tutorial before finding that Kedro-Viz meant that I didn't need to resort to Data Wrangler to get a glimpse inside the Parquet. Very nice!

🥳 1
