Matthias Roels
02/06/2025, 8:57 PM
Hall
02/06/2025, 8:57 PM
Matthias Roels
02/06/2025, 8:58 PM
conf/, i.e. every country is a kedro_env. Hence, splitting the project by country won’t reduce its size or complexity. The only logical split I see is to group certain pipelines and nodes and move them into separate Kedro projects, roughly with the following dependencies:
     -- B -- C
A --|
     -- D -- E
These projects would share some code, but that’s not too hard to handle. The biggest challenge is that they would share an extensive amount of config, so a change to a param would need to be replicated in several projects. Any advice on how you would solve this would be awesome.
Alexandre Ouellet
02/06/2025, 9:03 PM
Alexandre Ouellet
02/06/2025, 9:04 PM
Alexandre Ouellet
02/06/2025, 9:10 PM
Ravi Kumar Pilla
02/06/2025, 9:16 PM
before_pipeline_run. But as @Alexandre Ouellet mentioned, Kedro dataset factories are a great way to reduce the config complexity. Thank you
Marc Solomon
02/06/2025, 9:50 PM
components. With Alloy there's also the concept of apps: these are .yml files which declare combinations of components and the relevant config and metadata.
We then use Alloy to assemble apps as needed for different contexts. Often different regions or business units have their own app, which is a specific combination of components relevant to them. Managing the code this way gives a really clear separation of concerns and makes it super easy to scale development.
More info: https://medium.com/quantumblack/engineering-solutions-for-reuse-1ff5a81d8611
Alexandre Ouellet
02/06/2025, 9:56 PM
Juan Luis
02/07/2025, 8:30 AM
PartitionedDataset work for you? (feel free to open a separate thread to keep this one focused)