Hey everyone I have a few kedro nodes that run a bit long an Kedro #questions

Madana Krishnan

08/09/2023, 6:23 PM

Hey everyone. I have a few Kedro nodes that run a bit long, and I am separating them into functions (e.g., a node has 1 or 2 helper functions). I do not want to split them into nodes since that doesn’t fit my overall logic flow. As a best practice, is it good to have a separate utils/ folder for these tiny helper functions? Or is it advised to just use nested functions? Thanks! 🙂

datajoely

08/09/2023, 6:25 PM

I would typically want my Kedro nodes to be calling functions defined in other files/packages

datajoely

08/09/2023, 6:25 PM

it makes it much easier to test units of logic decoupled from your pipeline logic
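A minimal sketch of that layout, assuming hypothetical module names (`utils.py`, `nodes.py`) and made-up helpers (`drop_missing`, `add_total`, `preprocess_orders`) that are not from this thread. The helpers are plain Python with no Kedro imports, so they can be tested without a pipeline or catalog:

```python
# utils.py (hypothetical module): small, pure helpers with no Kedro imports
def drop_missing(rows: list[dict], key: str) -> list[dict]:
    """Keep only rows where `key` is present and not None."""
    return [row for row in rows if row.get(key) is not None]


def add_total(rows: list[dict]) -> list[dict]:
    """Return copies of the rows with a `total` field (price * quantity)."""
    return [{**row, "total": row["price"] * row["quantity"]} for row in rows]


# nodes.py (hypothetical module): the node function only orchestrates helpers.
# In a Kedro project it would be wrapped with kedro.pipeline.node in pipeline.py.
def preprocess_orders(rows: list[dict]) -> list[dict]:
    """Drop rows without a price, then compute a total for each remaining row."""
    return add_total(drop_missing(rows, "price"))


# tests/test_utils.py (hypothetical): exercise a helper directly
def test_drop_missing():
    rows = [{"price": 2.0, "quantity": 3}, {"price": None, "quantity": 1}]
    assert drop_missing(rows, "price") == [{"price": 2.0, "quantity": 3}]
```

Compared with nested functions, module-level helpers can be imported by the test file on their own, which is what makes this kind of unit test possible.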

Madana Krishnan

08/09/2023, 6:29 PM

I see, that is a great point.

datajoely

08/09/2023, 6:29 PM

I also see nothing wrong with tiny nodes

datajoely

08/09/2023, 6:30 PM

design for readability for whoever reads the code next, even if that person is you in 6 months

Madana Krishnan

08/09/2023, 6:31 PM

Makes sense. Currently a few nodes run 30+ lines. Will think through this 🙂

Madana Krishnan

08/09/2023, 6:34 PM

And you mentioned testing the logic for nodes and pipelines. I am currently writing tests for each node (unit test-ish). For the pipeline, is there a different way of testing (a kind of integration test)?

Nok Lam Chan

08/09/2023, 8:04 PM

You can run an integration test with KedroSession. The easiest way is to create a test environment and use a subset of the dataset (or a mock one) to run the pipeline end to end.
