# questions
r
Hello! Is there any functionality that allows running Kedro pipelines in debug mode?
d
what are you trying to debug? We have instructions on how to set up the debuggers for VS Code or PyCharm here
j
nice! @datajoely do these instructions apply for Kedro users? or are they more for development? I think what @Rebecca Solcia meant was "how to use the interactive debugging capabilities of one's IDE to run a Kedro pipeline" or in other words, when a node is not working properly, be able to peek into it and understand what's going wrong
d
I guess it depends on how @Rebecca Solcia is running the pipeline - how are you doing so today?
the other option is to use the ol’ `breakpoint()` global
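A minimal sketch of the `breakpoint()` approach, assuming a plain Python function used as a node. The function name `clean_orders` and the `price` field are hypothetical, made up for illustration. The call is placed behind a condition so the debugger only opens on the record that looks wrong:

```python
def clean_orders(rows):
    """Hypothetical node function: keep only rows that have a price."""
    cleaned = []
    for row in rows:
        price = row.get("price")
        if price is not None and price < 0:
            # Pause here only for the suspicious record; in the debugger
            # prompt you can inspect `row`, `cleaned`, and step onwards.
            breakpoint()
        if price is not None:
            cleaned.append(row)
    return cleaned


# Clean input never hits the breakpoint, so this runs straight through.
result = clean_orders([{"price": 1.0}, {"price": None}])
print(result)
```

Setting the environment variable `PYTHONBREAKPOINT=0` turns every `breakpoint()` call into a no-op, which is handy if one gets left in by accident.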
r
I’m using PyCharm currently
d
in which case, I’d use the native debugger and breakpoints there
r
Wonderful, I’ll have a look at the `breakpoint()` to see what’s best for me
d
PyCharm is way better
but `breakpoint()` has been super easy for quick debugging since Python 3.7
r
The only thing I noticed is that for my 230.7 MB table the PyCharm debugger is taking ages to load the input table
I don’t know if it’s on Kedro’s side, server side or PyCharm
d
230.7 MB is pretty large. It should be fine for pandas to handle, as long as you haven't further exploded it through transformations (you could look at the size of the object in memory, if 230.7 MB isn't its current size). Not sure about PyCharm's handling. Kedro shouldn't really have any opinion here; it's just calling whatever processing framework is under the hood, and doesn't care what size your data is per se.
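To check the size of the object in memory, something like the following could work. It is only a sketch: the helper `in_memory_mb` and the toy DataFrame are hypothetical stand-ins for the real input table.

```python
import pandas as pd


def in_memory_mb(df: pd.DataFrame) -> float:
    """Return the DataFrame's in-memory footprint in megabytes."""
    # deep=True also counts the contents of object (e.g. string) columns,
    # which the default shallow estimate leaves out.
    return df.memory_usage(deep=True).sum() / 1e6


# Hypothetical stand-in for the table a node receives
df = pd.DataFrame({"id": range(1000), "label": ["x" * 20] * 1000})
print(f"{in_memory_mb(df):.2f} MB in memory")
```

If the number is much larger than the file on disk, the slowdown is more likely down to the data than to the debugger.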
n
Debugging in PyCharm with large pipelines sometimes takes a long time, even with a simple operation like `data.head()`. I don’t know if things have changed. Often I just load that up in a notebook and run it line by line.