Jakub Szafranski
01/07/2025, 11:20 AM
Hall
01/07/2025, 11:20 AM
Deepyaman Datta
01/07/2025, 5:23 PM
Juan Luis
01/07/2025, 6:40 PM
Deepyaman Datta
01/07/2025, 6:43 PM
I know there is a side project, the kedro-dagster plugin repo, which is in its early stages of development, but it doesn't have comprehensive documentation yet, and I'm unsure about the integration process.
Juan Luis
01/07/2025, 6:50 PM
Guillaume Tauzin
01/08/2025, 6:46 AM
Jakub Szafranski
01/09/2025, 8:54 AM
Guillaume Tauzin
01/09/2025, 9:36 AM
Jakub Szafranski
01/09/2025, 10:15 AM
Guillaume Tauzin
01/09/2025, 11:22 AM
definitions.py would import all Kedro-translated Dagster objects, and the user would be free to edit them or add things like schedules before passing everything to the Dagster Definitions object at the end of the file. As the project is still at an early stage, I can't tell you for sure that a user will be able to fully leverage Dagster's capabilities. But this is the ultimate objective (and any help is welcome!).
To me there are several advantages to using Kedro on top of Dagster. To cite just a few:
• It is very easy to create new data assets and pipelines in Kedro; it is much harder to do so in Dagster. Kedro's docs, tutorials, and Slack community are all extremely high quality. Data scientists are typically not experts in software engineering, so working directly with Dagster can be daunting.
• Kedro structures your DS project well and handles configuration and the definition of data connectors. This takes away a lot of the complexity of working on a DS project.
• Kedro is orchestrator-independent, so if you later decide to switch orchestrators because the current one no longer matches your needs, you don't have to rewrite everything.
Jakub Szafranski
01/09/2025, 2:03 PM