Ana Paula Rojas
01/08/2024, 6:45 PM
Erwin
01/08/2024, 7:05 PM
src
|--- your_project
| |--- core
| |--- kedro_utils
| |--- pipelines
In some projects, I've seen the business logic extracted into a "core" folder. The structure looks something like the tree above; the idea is to keep business logic separate from the folders containing pipelines and nodes. That way, nodes essentially become wrappers around the logic stored in "core".
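A minimal sketch of that split, assuming hypothetical file names, function names, and a made-up discount example (none of these come from a real project). The core function is plain Python; the node only adapts Kedro-style inputs to it:

```python
# Hypothetical file: src/your_project/core/pricing.py
# Plain Python with no Kedro imports, so it can be unit-tested or reused elsewhere.
def apply_discount(prices: list[float], rate: float) -> list[float]:
    """Return prices reduced by `rate` (e.g. 0.1 for 10%), rounded to 2 decimals."""
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    return [round(p * (1.0 - rate), 2) for p in prices]


# Hypothetical file: src/your_project/pipelines/pricing/nodes.py
# The node is a wrapper: it takes a dataset and a params dict, then
# delegates to the core function.
def discount_node(prices: list[float], params: dict) -> list[float]:
    return apply_discount(prices, rate=params["discount_rate"])


# In pipeline.py the wrapper would be registered roughly like this
# (dataset and parameter names are assumptions):
#   node(discount_node, inputs=["prices", "params:pricing"], outputs="discounted_prices")

if __name__ == "__main__":
    print(discount_node([100.0, 50.0], {"discount_rate": 0.1}))
```

Because `apply_discount` knows nothing about Kedro, it can be tested or called from a notebook directly; only `discount_node` would change if the pipeline framework changed.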
It also helps you try to write code in a Kedro-agnostic way (though not totally).
Deepyaman Datta
01/08/2024, 10:46 PM
1. nodes.py is generated on a per-pipeline basis, so you could consider breaking down your pipeline if you think it's too big and can be split into logical chunks. IMO this is often the case when a team just creates a data_engineering and a data_science pipeline.
2. It's just Python code; you can break it into packages/submodules however makes sense; there's not necessarily a right way of doing it. Generating a nodes.py is just a simple structure, but you don't need to use a nodes_ prefix or anything.
3. At some point, projects can have a lot of code. It's not necessarily a bad thing for files to get large in the end.
Ana Paula Rojas
01/09/2024, 3:25 PM
Chris Schopp
01/10/2024, 11:46 AM