Hello Do you have any tips for debugging nodes and functions Kedro #questions

Afiq Johari

10/17/2023, 2:46 PM

Hello! Do you have any tips for debugging nodes and functions in Kedro? Here's what I'm trying to do: I want to make incremental updates to some functions defined in nodes.py and then verify their output. Typically, these functions rely on data specified in catalog.yml as parameters. I'm currently using Kedro's IPython environment, which allows me to load data using catalog.load('datasetname'). However, I find it a bit confusing to figure out how to run the functions I've defined for a specific pipeline. I use %reload_kedro to refresh my Kedro IPython environment. I'm aware that it's possible to run nodes (slicing them), but I'm wondering if it's also possible to run only a specific function before fully defining the nodes. I'd greatly appreciate any insights or best practices for carrying out these incremental updates effectively.

Nok Lam Chan

10/17/2023, 2:49 PM

if it’s also possible to run only a specific function before fully defining the nodes.

What do you mean?

Hugo Rebeix

10/17/2023, 3:14 PM

Well, if you want to test your nodes before defining them in the pipeline, then I don't see what the problem is: you can test them by running a script or, better, in the Kedro notebook (which lets you load the data, run your function and check the output without saving it). Once you're sure the outputs are as you like, you can then connect the function to your pipeline. I personally prefer to connect my nodes to my pipeline first, using MemoryDataset as outputs, and then test-run them from the Kedro notebook and inspect the results. I usually do it this way:

res = session.run(node_names=["node_im_testing"])

If the outputs are not what I want, or if my node fails, I can then debug the cell that contains the previous line with breakpoints inside my node. When I'm sure my node works, I connect the outputs to the catalog or to the next nodes, clear my notebook and I'm done 😉. Hope it helps!
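
The notebook workflow described here (load the data, call the function, look at the result without saving anything) could look roughly like the sketch below. It is only an illustration: process_three_data, its body and the dataset names are hypothetical, and a small stand-in class replaces the catalog object that a Kedro IPython/Jupyter session already provides, so that the snippet runs on its own. In a real session you would skip the stand-in and keep only the catalog.load(...) calls and the function call.

```python
# Stand-in for the `catalog` variable that a Kedro IPython/Jupyter session
# provides. It only mimics `catalog.load(name)`; it is NOT Kedro's DataCatalog.
class StandInCatalog:
    def __init__(self, datasets):
        self._datasets = datasets

    def load(self, name):
        return self._datasets[name]


# Hypothetical function that would live in nodes.py.
def process_three_data(data_a, data_b, data_c):
    """Combine three lists of records and keep rows with a positive value."""
    combined = data_a + data_b + data_c
    return [row for row in combined if row["value"] > 0]


# Hypothetical dataset names and contents, used only for this illustration.
catalog = StandInCatalog({
    "data_a": [{"id": 1, "value": 10}],
    "data_b": [{"id": 2, "value": -5}],
    "data_c": [{"id": 3, "value": 7}],
})

# Load the inputs, call the plain function, inspect the in-memory result.
# Nothing is written back to the catalog.
result = process_three_data(
    catalog.load("data_a"),
    catalog.load("data_b"),
    catalog.load("data_c"),
)
print(result)
```

Because the function is called directly, no node or pipeline definition is needed at this stage; wiring it into a node can wait until the output looks right.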

Afiq Johari

10/17/2023, 3:16 PM

@Nok Lam Chan for example, I'm creating a new function process_three_data(dataA, dataB, dataC) in nodes.py under the data_processing folder. I just want to run this specific function and update it incrementally. Typically, in a single-script Python file, I can just Shift+Enter the lines of code that I want and they go to the Python terminal. But when I'm within the Kedro IPython environment, I'm not sure how to make this kind of incremental change to my code and test it.

Afiq Johari

10/17/2023, 3:22 PM

@Hugo Rebeix right, I think I'll stick to the kedro jupyter notebook at the moment. But yeah, ideally, I want to update my functions within nodes.py and use something like Shift+Enter to see the output of the function.

👍 1

Hugo Rebeix

10/17/2023, 3:25 PM

Yeah, the bad thing with the session.run() method is that you have to restart the kernel between each run and/or after each change to the node (or maybe %reload_kedro lets you reload everything without restarting the kernel).

Nok Lam Chan

10/17/2023, 3:36 PM

%reload_kedro will do.
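
For readers outside the Kedro magics, the general "edit the file, then reload without restarting" idea can be sketched with the standard library's importlib.reload. This is not a description of what %reload_kedro does internally; the module name demo_nodes and the clean function are hypothetical and are written to a temporary folder only so the snippet runs on its own.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Skip .pyc files so a rewrite within the same second is never masked by
# stale bytecode (a precaution for this demo only).
sys.dont_write_bytecode = True

# Hypothetical stand-in for a project's nodes.py, written to a temp folder.
module_dir = Path(tempfile.mkdtemp())
module_path = module_dir / "demo_nodes.py"
module_path.write_text(
    "def clean(values):\n"
    "    return [v for v in values if v is not None]\n"
)

sys.path.insert(0, str(module_dir))
importlib.invalidate_caches()
import demo_nodes

first = demo_nodes.clean([2, None, 1])

# Edit the function on disk, as you would in your editor...
module_path.write_text(
    "def clean(values):\n"
    "    return sorted(v for v in values if v is not None)\n"
)

# ...then reload the module so the running session picks up the new code.
importlib.reload(demo_nodes)
second = demo_nodes.clean([2, None, 1])

print(first, second)
```

Note that names imported with "from demo_nodes import clean" before the reload would still point at the old function, which is one reason a full session reload is often the simpler choice.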

🚀 1

Nok Lam Chan

10/17/2023, 3:36 PM

I am not sure if this is a Kedro-specific issue; it seems that you are having a problem with the IPython terminal?

Nok Lam Chan

10/17/2023, 3:39 PM

Typically in a one script python file, I can just Shift+Enter the lines of code that I want and it will go to the Python terminal.

If I understand correctly, it won’t work if it’s a function? If you want to achieve the same in any interactive terminal/Jupyter, you have two choices: 1. Use a debugger (it’s built exactly for this purpose and has more advanced features). 2. Copy & paste the function and remove the indent etc.
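
As a minimal sketch of the debugger option, the standard library's pdb.runcall starts the debugger at the first line of a function call, so you can step through the body line by line and inspect local variables. The function body, the sample inputs and the DEBUG flag are assumptions made for illustration; the flag only keeps the snippet non-interactive by default.

```python
import pdb


# Hypothetical function that would live in nodes.py.
def process_three_data(data_a, data_b, data_c):
    merged = data_a + data_b + data_c
    total = sum(merged)
    return {"rows": len(merged), "total": total}


# Hypothetical sample inputs; in a Kedro session these could come from
# catalog.load(...) instead.
args = ([1, 2], [3], [4, 5])

DEBUG = False  # set to True in an interactive session to step through

if DEBUG:
    # Opens the pdb prompt inside process_three_data; use `n` to step,
    # `p merged` to print a variable and `c` to continue.
    result = pdb.runcall(process_three_data, *args)
else:
    result = process_three_data(*args)

print(result)
```

pdb.runcall returns whatever the function returns, so the result can be inspected afterwards in the same session.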

Afiq Johari

10/17/2023, 3:41 PM

@Nok Lam Chan yes, I think I'll use the debugger or the jupyter notebook option at the moment. Thanks for the tips!

Nok Lam Chan

10/17/2023, 3:41 PM

I proposed creating a Jupyter magic that does all this copy & pasting and stitches the pieces together nicely in a Jupyter notebook, which would allow a workflow similar to the one you described. https://github.com/kedro-org/kedro/issues/1832

👍 1

Nok Lam Chan

10/17/2023, 3:41 PM

If you think this would be helpful please upvote and leave some comments there to help us prioritise.

Hugo Rebeix

10/17/2023, 3:42 PM

@Afiq Johari in VS Code you can use the debugger for a single cell of a notebook 😉

👍 1

🥳 1
