I have a question. Let's say we are working in a big repo (a.k.a. a monorepo). Each module is a Kedro pipeline, and each module has different dependencies. I want to combine these pipelines into a master pipeline. How do I go about creating one?
11/13/2023, 2:26 AM
What do you mean by "each module is a Kedro pipeline"? In my experience, I would structure the monorepo itself as one large Kedro project, with each module being a micropackage/modular pipeline.
• If you want to run everything in the same environment, your dependencies must be resolvable across the set of micropackages you're combining in your master pipeline. Keep in mind that you should specify dependencies loosely (i.e. don't pin all of them), as you're essentially authoring a bunch of packages that need to coexist.
• If you want to run things using an orchestrator, then that defines your deployment, but dependency resolution isn't an issue, since each pipeline runs separately from the others (e.g. in its own container).
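To make the "specify dependencies loosely" point concrete, here is a tiny sketch of why ranges can coexist where pins cannot. It is not how pip or any real resolver works; the `intersect` helper, the tuple encoding of versions, and the version numbers are all made up for illustration.

```python
from typing import Optional, Tuple

Version = Tuple[int, ...]
# A range is (inclusive lower bound, exclusive upper bound).
Range = Tuple[Version, Version]


def intersect(a: Range, b: Range) -> Optional[Range]:
    """Return the overlap of two version ranges, or None if they cannot coexist."""
    low = max(a[0], b[0])
    high = min(a[1], b[1])
    return (low, high) if low < high else None


# Hypothetical requirements of two micropackages on the same library.
LOOSE_A: Range = ((1, 5), (2, 1))         # like ">=1.5,<2.1"
LOOSE_B: Range = ((2, 0), (3, 0))         # like ">=2.0,<3.0"
PINNED_A: Range = ((1, 5, 3), (1, 5, 4))  # like "==1.5.3"
PINNED_B: Range = ((2, 0, 0), (2, 0, 1))  # like "==2.0.0"

print(intersect(LOOSE_A, LOOSE_B))    # the two ranges overlap
print(intersect(PINNED_A, PINNED_B))  # the two pins do not
```

With the loose ranges, one environment can satisfy both micropackages; with the pins, there is nothing left to pick from.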
My experience doing this is mostly from a couple years ago (for very large Kedro monorepos), so definitely see if you get some more recent thinking. 🙂
11/13/2023, 11:11 AM
Is there something that could help with such an orchestration? Like a kedro for kedro 🤔
11/13/2023, 1:59 PM
@Lukas Innig there is something like that for creating/managing Kedro monorepos (e.g. for multiple use cases, or use across an organization) within QuantumBlack/McKinsey; don't know of anything open source.