Ed Henry
09/12/2023, 6:21 PM
Juan Luis
09/12/2023, 6:28 PM
Ed Henry
09/12/2023, 6:29 PM
Deepyaman Datta
09/12/2023, 6:38 PM
Ed Henry
09/12/2023, 6:46 PM
s3fs, etc.
Marc Solomon
09/12/2023, 7:13 PM
Deepyaman Datta
09/12/2023, 7:14 PM
> Except when you start to abstract modules outside of pipelines to be reusable across pipelines, and that's where I'm hitting the dependency hell.
This is a good point. In my past experience solving this problem, I added a lib directory inside of the project, where we built packages that could be used as dependencies by multiple pipelines. For example, lib/embedding_generator could be used by pipelines/some_modeling_pipeline and pipelines/another_modeling_pipeline.
This did require a more complex CI/build process, where libs were built before pipelines. Also explored the use of monorepo tools (e.g. Pants, Bazel) for this, though never got around to it.
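A minimal sketch of the "libs before pipelines" build ordering described above, assuming the dependency map is declared by hand. The map and the build_order helper are hypothetical illustrations, not part of Kedro or of the original project; only the directory names come from the message.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical dependency map: each package maps to the packages it depends on.
# The paths mirror the example in the message; the mapping itself is assumed.
DEPENDENCIES = {
    "lib/embedding_generator": set(),
    "pipelines/some_modeling_pipeline": {"lib/embedding_generator"},
    "pipelines/another_modeling_pipeline": {"lib/embedding_generator"},
}


def build_order(dependencies):
    """Return packages ordered so that every dependency precedes its dependents."""
    return list(TopologicalSorter(dependencies).static_order())


order = build_order(DEPENDENCIES)
print(order)  # the lib comes first, then the two pipelines
```

A CI job could iterate over this list and build each package in turn. Monorepo tools such as Pants or Bazel derive this kind of ordering themselves, which is one reason they come up in this context.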
My design was heavily influenced by https://medium.com/opendoor-labs/our-python-monorepo-d34028f2b6fa, where instead of projects you have Kedro pipelines. This was also done before Kedro had a more generic micropackaging workflow, so now perhaps Kedro could also manage the libs.
Ed Henry
09/12/2023, 8:24 PM
Juan Luis
09/13/2023, 7:49 AM
kedro micropkg * has less than 300 hits in our telemetry, out of 2.41M events, hence 0.01 %.
I hear you on the dependency issues though @Ed Henry, I think it's happening to other users with big projects (I recall @Marc Gris has spoken up about this in the recent past too). my recommendation would be, for now, to split the pipelines across different Kedro projects and connect them through a common catalog. but it's an area where we don't have lots of good recommendations to make - if you come up with nice usage patterns, we'd be glad to add those to our docs.
Marc Gris
09/13/2023, 8:58 AM
Deepyaman Datta
09/13/2023, 1:19 PM
> we do have a micropackaging workflow https://docs.kedro.org/en/stable/nodes_and_pipelines/micro_packaging.html but if anything, it's more for packaging than for development. regardless, it's somewhat inconsistent with other parts of the library and needs some love, as well as users complaining about it - we are not seeing a lot of usage!
As you point out, I agree it needs some love.
> kedro micropkg * has less than 300 hits in our telemetry, out of 2.41M events, hence 0.01 %.
The usage is a bit of a chicken-and-egg problem; there is significant packaging/unpackaging of Kedro pipelines, etc. going on in some places, but they don't use kedro micropkg because it doesn't work as well as just using native Python packaging. But, the people who end up integrating custom Python packaging into their Kedro project/monorepos/tooling are also people who are relative experts when it comes to Python packaging (compared to 95+% of Kedro users).
Iñigo Hidalgo
09/13/2023, 4:36 PM
Yury Fedotov
09/14/2023, 2:07 AM