Hi, I'm looking for the best suitable option for our use case - Kedro #questions

Ralf Kowatsch

04/10/2025, 9:21 AM

Hi, I'm looking for the most suitable option for our use case, which is a mixture of data engineering and data science. The possible solutions we are looking at are:
• dbt https://docs.getdbt.com/
• Kedro
• SQLMesh

Our main concerns are:
• time to market
• scalability
• maintainability
• etc.

The question that bugs me the most is whether Kedro has some kind of SQL query pushdown. I saw https://ibis-project.org/, which looks like a dataframe solution for query pushdown, but in certain cases I would like to push my SQL model down directly. Does anybody have any idea?

datajoely

04/10/2025, 9:23 AM

So dbt is very similar conceptually to Kedro, although it does lean into the metrics / semantic model part more. Kedro is a Python-based tool for data engineering and data science. dbt is a tool for building SQL transformations, mostly in the data engineering space.

datajoely

04/10/2025, 9:23 AM

This blog post is slightly out of date, because the Ibis datasets are now in Kedro: https://kedro.org/blog/building-scalable-data-pipelines-with-kedro-and-ibis

datajoely

04/10/2025, 9:24 AM

But in my opinion the modern way to use Kedro is with Ibis, so that you're also using SQL as an execution engine, just like dbt.

Ralf Kowatsch

04/10/2025, 9:31 AM

Hey, thanks for the super fast response. That's exactly what I want, because my transformations consist of heavy loads that I would like to run on the RDBMS (Snowflake) and ML models that are better run on OpenShift. I don't like an unnecessary abstraction layer if it's not needed, and most of the time data engineers feel more comfortable writing SQL than a Python dataframe quasi-SQL.

❄️ 1

datajoely

04/10/2025, 9:31 AM

Yeah, so Ibis is the best fit for that

datajoely

04/10/2025, 9:32 AM

And it also makes your code portable, so one syntax runs on any backend Ibis supports

datajoely

04/10/2025, 9:32 AM

I also love the pattern of having dev run on DuckDB locally and prod/staging running on BigQuery/Snowflake etc.

👍 1
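A minimal sketch of that dev/prod pattern, assuming Ibis is installed. The environment names, the `dev.duckdb` file, the `PROD_IBIS_URL` variable, and the `orders` columns are all hypothetical; only `ibis.connect` and the table-expression methods are real Ibis calls.

```python
import os


def connection_url(env: str) -> str:
    """Return an Ibis connection URL for the given environment (names are hypothetical)."""
    if env == "dev":
        # Local DuckDB file: no warehouse needed during development.
        return "duckdb://dev.duckdb"
    if env == "prod":
        # Assumed to hold a full backend URL (for example a snowflake:// one).
        return os.environ["PROD_IBIS_URL"]
    raise ValueError(f"unknown environment: {env!r}")


def connect(env: str):
    """Open a backend connection for the environment."""
    # Imported lazily so this module still loads where Ibis is not installed.
    import ibis

    return ibis.connect(connection_url(env))


def revenue_by_customer(orders):
    """Backend-agnostic transformation on a lazy Ibis table.

    Assumes `orders` has `customer_id` and `amount` columns. Ibis compiles
    the expression to the SQL dialect of whichever backend `orders` lives on.
    """
    return orders.group_by("customer_id").aggregate(total=orders.amount.sum())
```

With this shape, `revenue_by_customer(connect("dev").table("orders"))` and the same call with `"prod"` share one definition, and only the connection changes.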

datajoely

04/10/2025, 9:34 AM

Like you said, it comes down to team topologies too: SQL has a lower barrier to entry, so if you have a constrained analyst team then dbt is great. If you want everyone working in the same language, Ibis / Python / Kedro is a great fit.

datajoely

04/10/2025, 9:34 AM

You can, if you want, use ibis.sql(…) and just write SQL in Python

👍 1
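A hedged sketch of what dropping into raw SQL can look like. As far as I know, in current Ibis releases raw SQL goes through a backend connection's `sql` method (or `Table.sql`) rather than a top-level function, so `ibis.sql(…)` above reads as shorthand. The `orders` table, its columns, and the threshold are made up for illustration.

```python
REVENUE_SQL = """
SELECT customer_id, SUM(amount) AS total
FROM orders
GROUP BY customer_id
"""


def big_spenders(con, min_total: float):
    """Run hand-written SQL on the backend, then keep composing with Ibis."""
    # Backend.sql wraps the query as a lazy table expression; nothing runs yet.
    revenue = con.sql(REVENUE_SQL)
    # Later dataframe-style steps are compiled on top of the raw SQL,
    # so the whole thing still executes inside the database.
    return revenue.filter(revenue.total >= min_total)


def demo():
    """Try it against an in-memory DuckDB (requires ibis-framework[duckdb])."""
    # Imported lazily so this module still loads where Ibis is not installed.
    import ibis

    con = ibis.duckdb.connect()
    con.create_table(
        "orders",
        ibis.memtable({"customer_id": [1, 1, 2], "amount": [60.0, 50.0, 20.0]}),
    )
    # Only customer 1 (60 + 50 = 110) reaches the threshold of 100.
    return big_spenders(con, 100).to_pandas()
```

The point of the split is that the SQL model stays plain SQL, while filtering, joining, or handing the result to a Kedro node can continue in Python without pulling data out of the warehouse first.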

datajoely

04/10/2025, 9:36 AM

SQLMesh is also new and interesting but still quite an early-stage project. It's still SQL-first like dbt, but benefits massively from having seen some of the mistakes dbt made the first time around

Dmitry Sorokin

04/10/2025, 10:19 AM

One more blog post, with an example of how to use the Kedro Ibis dataset: https://kedro.org/blog/sql-data-processing-in-kedro-ml-pipelines
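For orientation, a rough sketch of how an Ibis-backed dataset might be wired into a Kedro catalog from Python. It assumes `kedro` and `kedro-datasets` with the Ibis extra are installed; the dataset names, the DuckDB file, and the `amount` column are hypothetical, and the entry mirrors what would normally live in `catalog.yml`.

```python
def ibis_table_entry(table_name: str, backend: str, database: str) -> dict:
    """Build a catalog entry equivalent to the YAML form used in catalog.yml."""
    return {
        "type": "ibis.TableDataset",
        "table_name": table_name,
        "connection": {"backend": backend, "database": database},
        # Persist node outputs as a table rather than a view.
        "save_args": {"materialized": "table"},
    }


def build_catalog():
    """Create a catalog with one input table and one output table."""
    # Imported lazily so this module still loads where Kedro is not installed.
    from kedro.io import DataCatalog

    return DataCatalog.from_config(
        {
            "orders": ibis_table_entry("orders", "duckdb", "dev.duckdb"),
            "large_orders": ibis_table_entry("large_orders", "duckdb", "dev.duckdb"),
        }
    )


def filter_large_orders(orders, min_amount: float):
    """A Kedro node body: takes a lazy Ibis table and returns another one."""
    return orders.filter(orders.amount >= min_amount)
```

Because the node receives and returns lazy table expressions, the filter is executed by the database when the output dataset is saved, not in the Python process.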

datajoely

04/10/2025, 10:25 AM

@Dmitry Sorokin I’m not able to do this where I am, but we need an activity to update all of these blog posts to use the datasets in Kedro extras

👍 2

Deepyaman Datta

04/10/2025, 3:47 PM

+1 to what @datajoely said about ibis.sql. I wrote an article a couple of months ago about how this all works, and tl;dr it's more-or-less equivalent if you drop into ibis.sql

Ralf Kowatsch

06/19/2025, 12:45 PM

Thanks for all your input. We'll go with Snowpark since we have Snowflake and need UDTFs. Ibis seems like a really good option, but the missing UDTF implementation is a bummer.
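For readers unfamiliar with the term: a UDTF (user-defined table function) turns each input row into zero or more output rows. Below is a minimal sketch of what one can look like in Snowpark, assuming `snowflake-snowpark-python` is installed and `session` is an existing Snowpark session. The `SplitWords` handler and the `split_words` name are invented for illustration and are not taken from the thread.

```python
class SplitWords:
    """UDTF handler: one input row in, zero or more output rows out."""

    def process(self, text: str):
        # Each yielded tuple becomes one output row with a single `word` column.
        for word in text.split():
            yield (word,)


def register_split_words(session):
    """Register the handler as a UDTF on an existing Snowpark session."""
    # Imported lazily so this module still loads where Snowpark is not installed.
    from snowflake.snowpark.types import StringType, StructField, StructType

    return session.udtf.register(
        SplitWords,
        output_schema=StructType([StructField("word", StringType())]),
        input_types=[StringType()],
        name="split_words",
    )
```

The handler class is plain Python, so its row-expansion logic can be unit tested locally before it is registered in Snowflake.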

Deepyaman Datta

06/19/2025, 2:07 PM

@Ralf Kowatsch Makes sense! Yes, unfortunately Ibis doesn't support this, because there's no consistency in how backends handle this: https://github.com/ibis-project/ibis/issues/8108#issuecomment-1912673484 Feel free to open an issue on the Ibis repo, if you're open to it; I think it's good to at least have a signal that something like this could be a reason people go with Snowpark instead (or the maintainers may even be able to provide a workaround; I know there are also some Snowflake Solutions Architects who have done quite a bit of work with Ibis and may have some thoughts).
