Hello Is there any functionality that allows to run Kedro pi Kedro #questions

# questions

Rebecca Solcia

03/09/2023, 4:55 PM

Hello! Is there any functionality for running Kedro pipelines in debug mode?

datajoely

03/09/2023, 4:57 PM

what are you trying to debug? We have instructions on how to set up the debuggers for VS Code or PyCharm here

Juan Luis

03/09/2023, 4:59 PM

nice! @datajoely do these instructions apply for Kedro users? or are they more for development? I think what @Rebecca Solcia meant was "how to use the interactive debugging capabilities of one's IDE to run a Kedro pipeline" or in other words, when a node is not working properly, be able to peek into it and understand what's going wrong

datajoely

03/09/2023, 4:59 PM

I guess it depends on how @Rebecca Solcia is running the pipeline - how are you doing so today?

datajoely

03/09/2023, 5:00 PM

the other option is to use the ol’ `breakpoint()` global

Rebecca Solcia

03/09/2023, 5:00 PM

I’m using PyCharm currently

datajoely

03/09/2023, 5:00 PM

in which case, I’d use the native debugger and breakpoints there

Rebecca Solcia

03/09/2023, 5:00 PM

Wonderful, I’ll have a look at the `breakpoint()` to see what’s best for me

datajoely

03/09/2023, 5:01 PM

PyCharm is way better

datajoely

03/09/2023, 5:01 PM

but `breakpoint()` since Python 3.7 has been super easy for quick debugging
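For illustration, a minimal sketch of what the `breakpoint()` approach could look like inside a node function. The function `clean_rows` and the `price` field are hypothetical and not taken from this thread; the only assumption is that a Kedro node is an ordinary Python function, so the built-in works there as it does anywhere else.

```python
def clean_rows(rows):
    """Hypothetical node function: drop rows whose 'price' is missing."""
    cleaned = [row for row in rows if row.get("price") is not None]
    if not cleaned:
        # Pause in the debugger only in the unexpected case where every
        # row was dropped, so normal runs are not interrupted.
        breakpoint()
    return cleaned


rows = [{"price": 10.0}, {"price": None}, {"price": 2.5}]
print(clean_rows(rows))
```

Setting the environment variable `PYTHONBREAKPOINT=0` turns every `breakpoint()` call into a no-op, which is handy if one is left in by accident.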

Rebecca Solcia

03/09/2023, 5:01 PM

The only thing I noticed is that for my 230.7 MB table the PyCharm debugger is taking ages to load the input table

Rebecca Solcia

03/09/2023, 5:01 PM

I don’t know if it’s on Kedro’s side, server side or PyCharm

Deepyaman Datta

03/09/2023, 5:15 PM

230.7 MB is pretty large. It should be fine for pandas to handle, as long as you haven't further exploded it through transformations (you could look at the size of the object in memory, if 230.7 MB isn't the current size). Not sure about PyCharm's handling. Kedro shouldn't really have any opinion here; it's just calling whatever processing framework you use under the hood, and doesn't care what size your data is per se.
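One possible way to check the size of the object in memory, assuming the table is a pandas DataFrame. The helper `memory_mb` and the toy data are made up for illustration; `DataFrame.memory_usage(deep=True)` is the pandas call doing the work.

```python
import pandas as pd


def memory_mb(df: pd.DataFrame) -> float:
    """Return the DataFrame's in-memory footprint in MB (10**6 bytes)."""
    # deep=True also counts the contents of object (e.g. string) columns,
    # which the default shallow estimate can miss.
    return float(df.memory_usage(deep=True).sum()) / 1e6


# Toy stand-in for the real table
df = pd.DataFrame({"id": range(1000), "label": ["x" * 50] * 1000})
print(f"{memory_mb(df):.2f} MB in memory")
```

The in-memory figure can differ a lot from the file size on disk, especially with string columns.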

Nok Lam Chan

03/09/2023, 5:49 PM

Debugging in PyCharm with large pipelines sometimes takes a long time, even with a simple operation like `data.head()`. I don’t know if things have changed. Often I just load that up in a notebook and run it line by line.
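A rough sketch of the notebook approach: call the node function directly on a small slice of the data and inspect the result step by step. The function `add_total`, the column names, and the dataset name `orders` are hypothetical. In a session started with `kedro jupyter notebook`, a `catalog` variable is available, so the input would come from `catalog.load(...)` rather than the inline DataFrame used here to keep the example self-contained.

```python
import pandas as pd


def add_total(data: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical node function: add a 'total' column."""
    out = data.copy()
    out["total"] = out["price"] * out["quantity"]
    return out


# In a Kedro notebook session this might instead be:
#   data = catalog.load("orders")   # "orders" is a made-up dataset name
data = pd.DataFrame({"price": [2.0, 3.5, 1.0], "quantity": [1, 2, 4]})

# Work on a small slice so each step stays fast to run and inspect
sample = data.head(2)
result = add_total(sample)
print(result)
```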
