#questions

Madana Krishnan

08/09/2023, 6:23 PM
Hey everyone. I have a few Kedro nodes that run a bit long, so I am breaking them up into functions (e.g. a node has 1 or 2 helper functions). I do not want to split them into separate nodes since that doesn’t fit my overall logic flow. As a best practice, is it good to have a separate utils/ folder for these tiny helper functions? Or is it better to just use nested functions? Thanks! 🙂

datajoely

08/09/2023, 6:25 PM
I would typically want my Kedro nodes to be calling functions defined in other files/packages
it makes it much easier to test units of logic decoupled from your pipeline logic
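A minimal sketch of that layout, assuming hypothetical module names (`utils.py`, `nodes.py`) and hypothetical helpers (`drop_missing`, `add_total`) that are not from the thread. The helpers are plain functions, so they can be unit tested without building a pipeline:

```python
# utils.py (hypothetical module): small, pure helpers that are easy to unit test
def drop_missing(rows, required_key):
    """Keep only the rows that have a non-None value for required_key."""
    return [row for row in rows if row.get(required_key) is not None]


def add_total(rows):
    """Return copies of the rows with a 'total' field (price * quantity)."""
    return [{**row, "total": row["price"] * row["quantity"]} for row in rows]


# nodes.py (hypothetical module): the node function only orchestrates the helpers.
# In a real project this function would be wrapped with kedro.pipeline.node(...).
def preprocess_orders(rows):
    """Node function: filter incomplete rows, then add the derived column."""
    return add_total(drop_missing(rows, "price"))


# test_utils.py (hypothetical module): the helper is tested on its own
def test_drop_missing():
    rows = [{"price": 2.0, "quantity": 3}, {"price": None, "quantity": 1}]
    assert drop_missing(rows, "price") == [{"price": 2.0, "quantity": 3}]
```

Compared with nested functions, module-level helpers can be imported directly by the tests, which is what makes this decoupling possible.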

Madana Krishnan

08/09/2023, 6:29 PM
I see, that is a great point.

datajoely

08/09/2023, 6:29 PM
I also see nothing wrong with tiny nodes
design for readability for whoever reads the code next, even if that person is you in 6 months

Madana Krishnan

08/09/2023, 6:31 PM
Makes sense. Currently a few nodes run 30+ lines. Will think through this 🙂
And you mentioned testing the logic for nodes and pipelines. I am currently writing tests for each node (unit test-ish). For the pipeline, is there a different way of testing (a kind of integration test)?

Nok Lam Chan

08/09/2023, 8:04 PM
You can run an integration test with KedroSession. The easiest way is to create a test environment and use a subset of the dataset (or a mocked one) to run the pipeline end to end.