As a particular modular pipeline becomes more complex, I'm thinking of breaking up its
into multiple files in a
directory. For example,
would have
with the functions that will be used in nodes and their private helper functions. And in
I'd import the functions from
My question is: if I want to do the above, should I just break this up into multiple modular pipelines? The modular pipeline's purpose is to produce an input to a simulation, but the simulation's input is becoming more "refined" over time. So it makes sense to me to keep it as one modular pipeline, but I'm curious how others approach this.
The functions referenced by nodes in the modular pipeline can live anywhere. I feel the
we generate is a suggestion of where to get started, but not a mandate.
As projects grow in sophistication, we actually recommend that your business logic live in formal Python packages/libraries outside of your project, so it can be maintained, documented and tested without tight coupling to your pipeline.
As for what’s recommended, it’s hard to say in general terms. I try to live by ‘write code for someone else to read, even if that person is future you’. So look to organise your code in a way that feels readable and understandable, and that makes it easy to onboard a new team member.
