Hi, is there a way to organize functions of nodes....
# questions
a
Hi, is there a way to organize the functions in nodes.py as it grows in complexity? I don't know whether it follows best practices to split the file into multiple ones (for example nodes_something.py, nodes_othersomething.py).
e
src
|--- your_project
|    |--- core
|    |--- kedro_utils
|    |--- pipelines
In some projects, I've seen the business logic extracted into a "core" folder. The structure looks something like the tree above, the idea being to keep business logic separate from the folders containing pipelines and nodes. That way, nodes essentially become thin wrappers around the logic in "core". It also pushes you to write code in a Kedro-agnostic way (though in practice never completely).
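A minimal sketch of that split, to make the "nodes as wrappers" idea concrete. The module paths in the comments, the function names (`apply_discount`, `discount_prices_node`) and the `discount_rate` parameter are all made up for illustration; both pieces are shown in one snippet only so it runs on its own.

```python
from typing import Dict, List


# --- would live in src/your_project/core/pricing.py (hypothetical path) ---
def apply_discount(prices: List[float], rate: float) -> List[float]:
    """Pure business logic: no Kedro imports, so it is easy to unit test."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    return [round(p * (1.0 - rate), 2) for p in prices]


# --- would live in src/your_project/pipelines/pricing/nodes.py (hypothetical path) ---
# In a real project this file would import the logic instead of sharing a module:
#     from your_project.core.pricing import apply_discount
def discount_prices_node(prices: List[float], parameters: Dict[str, float]) -> List[float]:
    """Thin node wrapper: unpacks the parameters dict and delegates to core."""
    return apply_discount(prices, rate=parameters["discount_rate"])


if __name__ == "__main__":
    print(discount_prices_node([100.0, 50.0], {"discount_rate": 0.1}))
```

The wrapper is the only place that knows about how the pipeline passes parameters, so `apply_discount` can be reused or tested without any Kedro context.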
👍 1
d
1. `nodes.py` is generated on a per-pipeline basis, so you could consider breaking down your pipeline if you think it's too big and can be split into logical chunks. This is often the case IMO when a team just creates a `data_engineering` and a `data_science` pipeline.
2. It's just Python code; you can break it into packages/submodules however makes sense, and there's not necessarily a right way of doing it. Generating a `nodes.py` is just a simple starting structure; you don't need to use a `nodes_` prefix or anything.
3. At some point, projects can have a lot of code. It's not necessarily a bad thing for files to get large in the end.
👍 1
a
Thanks for your comments 🙂
c
In my `nodes.py` files, I prefix helper functions with an underscore to indicate that they are "private": they should not be used directly as nodes in a pipeline, only called from a node function.
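A small sketch of that naming convention, assuming a made-up `nodes.py` with one public node function and one helper (`scale_features` and `_normalize` are illustrative names, not from any real project):

```python
from typing import Dict, List


def _normalize(values: List[float]) -> List[float]:
    """Private helper (leading underscore): not meant to be registered as a node."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant column: avoid division by zero.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def scale_features(features: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Public function: this is the one a pipeline would reference as a node."""
    return {name: _normalize(column) for name, column in features.items()}


if __name__ == "__main__":
    print(scale_features({"x": [0.0, 5.0, 10.0]}))
```

The underscore is only a convention; Python does not enforce it, but it makes it obvious at a glance which functions in the file are candidates for nodes.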