Hi is there a way to organize functions of nodes py as it in Kedro #questions

Ana Paula Rojas

01/08/2024, 6:45 PM

Hi, is there a way to organize the functions in nodes.py as it grows in complexity? I don't know if it follows best practices to split the file into multiple ones (for example nodes_something.py, nodes_othersomething.py).

Erwin

01/08/2024, 7:05 PM

src
|--- your_project
|    |--- core
|    |--- kedro_utils
|    |--- pipelines

In certain projects, I've observed a practice where the business logic is extracted into a "core" folder. The structure looks something like the tree above, with the idea being to keep business logic separate from the folders containing pipelines and nodes. This way, nodes essentially become thin wrappers around the logic stored in the "core" folder. It also helps you try to write code in a Kedro-agnostic way (though it's never totally agnostic).
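A minimal sketch of the "core" pattern described above. The file paths in the comments, the function names (`apply_discount`, `discount_prices_node`), and the `discount_rate` parameter key are all hypothetical, chosen only for illustration; both functions are shown in one block so the example runs on its own.

```python
# Hypothetical file: src/your_project/core/pricing.py
# Pure business logic with no Kedro imports, so it can be unit tested in isolation.
def apply_discount(prices: list[float], rate: float) -> list[float]:
    """Return each price reduced by `rate` (a fraction between 0 and 1)."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    return [round(price * (1.0 - rate), 2) for price in prices]


# Hypothetical file: src/your_project/pipelines/pricing/nodes.py
# In a real project this file would import the function above, e.g.:
#   from your_project.core.pricing import apply_discount
def discount_prices_node(prices: list[float], parameters: dict) -> list[float]:
    """Thin node wrapper: unpack the parameters dict and delegate to core logic."""
    return apply_discount(prices, parameters["discount_rate"])


if __name__ == "__main__":
    print(discount_prices_node([100.0, 50.0], {"discount_rate": 0.1}))
```

The wrapper only adapts Kedro-style inputs (datasets and a parameters dict) to the plain arguments the core function expects, so the core module stays reusable outside a pipeline.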

👍 1

Deepyaman Datta

01/08/2024, 10:46 PM

1. `nodes.py` is generated on a per-pipeline basis, so maybe you could consider breaking down your pipeline, if you think it's too big and can be broken into logical chunks. This is often the case IMO if some team just creates a `data_engineering` and a `data_science` pipeline.
2. It's just Python code; you can break it into packages/submodules however makes sense; there's not necessarily a right way of doing it. Generating a `nodes.py` is just a simple structure, but you don't need to use a `nodes_` prefix or anything.
3. At some point, projects can have a lot of code. It's not necessarily a bad thing for files to get large in the end.

👍 1

Ana Paula Rojas

01/09/2024, 3:25 PM

Thanks for your comments 🙂

Chris Schopp

01/10/2024, 11:46 AM

In my `nodes.py` files, I name helper functions with a leading underscore to indicate they are "private": they should not be used in a pipeline as nodes, only called by a node function.
