#questions

Rebecca Solcia

03/09/2023, 4:55 PM
Hello! Is there any functionality that allows running Kedro pipelines in debug mode?

datajoely

03/09/2023, 4:57 PM
what are you trying to debug? We have instructions on how to set up the debuggers for VS Code or PyCharm here

Juan Luis

03/09/2023, 4:59 PM
nice! @datajoely do these instructions apply for Kedro users? or are they more for development? I think what @Rebecca Solcia meant was "how to use the interactive debugging capabilities of one's IDE to run a Kedro pipeline" or in other words, when a node is not working properly, be able to peek into it and understand what's going wrong

datajoely

03/09/2023, 4:59 PM
I guess it depends on how @Rebecca Solcia is running the pipeline - how are you doing so today?
the other option is to use the ol’ breakpoint() global
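As a minimal sketch of the breakpoint() approach mentioned above: the node function below is hypothetical (its name and data are not from this thread), and it assumes the pipeline runs in a single process from a terminal, for example with `kedro run`.

```python
def drop_incomplete_rows(rows):
    """Hypothetical node function: keep only rows without missing values."""
    # Execution pauses here and opens the default debugger (pdb),
    # where `rows` can be inspected before the node continues.
    breakpoint()
    return [row for row in rows if all(value is not None for value in row.values())]
```

When the node runs, Python stops at the breakpoint() line and drops into pdb. Setting the environment variable PYTHONBREAKPOINT=0 turns every breakpoint() call into a no-op, so the calls can be disabled without editing the code.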

Rebecca Solcia

03/09/2023, 5:00 PM
I’m using PyCharm currently

datajoely

03/09/2023, 5:00 PM
in which case, I’d use the native debugger and breakpoints there

Rebecca Solcia

03/09/2023, 5:00 PM
Wonderful, I’ll have a look at the breakpoint() to see what’s best for me

datajoely

03/09/2023, 5:01 PM
PyCharm is way better
but breakpoint() has been super easy for quick debugging since Python 3.7

Rebecca Solcia

03/09/2023, 5:01 PM
The only thing I noticed is that for my 230.7 MB table the PyCharm debugger is taking ages to load the input table
I don’t know if it’s on Kedro’s side, server side or PyCharm

Deepyaman Datta

03/09/2023, 5:15 PM
230.7 MB is pretty large. It should be fine for pandas to handle, as long as you haven't further exploded it through transformations (you could look at the size of the object in memory, if 230.7 isn't the current size). Not sure about PyCharm's handling. Kedro shouldn't really have any opinion here; it's just calling whatever processing framework under the hood, and doesn't care what size your data is per se.
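A rough sketch of how the in-memory size could be checked, assuming the table is a pandas DataFrame. The helper name `memory_usage_mb` and the sample data are made up for illustration; only `DataFrame.memory_usage` is the pandas API.

```python
import pandas as pd


def memory_usage_mb(df: pd.DataFrame) -> float:
    """Return the in-memory size of df in megabytes (1 MB = 10**6 bytes)."""
    # deep=True also counts the Python objects behind object-dtype columns
    # (e.g. strings), which the default shallow estimate leaves out.
    return float(df.memory_usage(deep=True).sum() / 1e6)


# Illustrative data only; in a debugging session this would be the node's input table.
df = pd.DataFrame({"id": range(1000), "label": ["x" * 50] * 1000})
print(f"{memory_usage_mb(df):.2f} MB in memory")
```

Comparing this number with the file size on disk shows whether the table has grown through transformations or type conversions before it reaches the debugger.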

Nok Lam Chan

03/09/2023, 5:49 PM
Debugging in PyCharm with large pipelines sometimes takes a long time, even with a simple operation like data.head(). I don’t know if things have changed. Often I just load that up in a notebook and run it line by line.