# questions
l
Maybe this is pushing the current dataset factories too far but is it possible to parametrise a SQL Catalog entry where the SQL is read from a file? Like:
```yaml
mytable:
  type: pandas.SQLQueryDataset
  credentials: postgres_dwh
  filepath: sql/mytable.sql
```
basically, I'd like to pass parameters to the SQL query
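[Editor's note: a possible sketch of that combination. As far as I can tell, pandas.SQLQueryDataset accepts either sql or filepath, and its load_args are forwarded to pandas.read_sql_query, which takes a params argument (the placeholder syntax in the .sql file depends on the DB driver, e.g. %(cutoff)s for psycopg2). Whether this covers the use case should be checked against the kedro-datasets docs; the cutoff parameter below is hypothetical.]

```yaml
mytable:
  type: pandas.SQLQueryDataset
  credentials: postgres_dwh
  filepath: sql/mytable.sql
  load_args:
    params:
      cutoff: "2024-01-01"
```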
a
Hey Luis, I think you could technically do it -
```yaml
mytable_{table}:
  type: pandas.SQLQueryDataset
  credentials: <cred>
  sql: SELECT * FROM {table}
```
And in your pipeline_registry.py / pipeline.py, have a script that reads the queries from a file and generates the pipeline dynamically. Dataset factories work by reading the dataset name from the pipeline inputs/outputs and then filling in the placeholders in the catalog entry, so the dataset names might get crazy looking
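[Editor's note: a minimal sketch of the dynamic-generation idea above, kept dependency-free so it runs without Kedro installed. In a real pipeline_registry.py each spec dict would become a kedro.pipeline.node(...), and the one-table-name-per-line file format is an assumption.]

```python
from pathlib import Path


def load_table_names(path: str) -> list[str]:
    """Read one table name per line, skipping blanks (assumed file format)."""
    return [ln.strip() for ln in Path(path).read_text().splitlines() if ln.strip()]


def build_node_specs(tables: list[str]) -> list[dict]:
    """Build one node spec per table.

    Each input name matches the catalog factory pattern "mytable_{table}",
    so the catalog fills {table} into the SQL at resolution time.
    """
    return [
        {"inputs": f"mytable_{t}", "outputs": f"cleaned_{t}"}
        for t in tables
    ]


# In practice: build_node_specs(load_table_names("conf/tables.txt"))
specs = build_node_specs(["orders", "customers"])
print([s["inputs"] for s in specs])  # → ['mytable_orders', 'mytable_customers']
```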
l
ok, but not if the SQL is in a file, right?
a
Yeah, the factory placeholder wouldn't work from within a file, because the catalog never loads the query text itself; the file is only read during dataset initialisation
n
This is where ibis would be a better fit: SQL parameterisation, and multi-node SQL that is lazily evaluated
l
Are there docs on this? @Nok Lam Chan
n
https://docs.kedro.org/projects/kedro-datasets/en/latest/api/kedro_datasets.ibis.TableDataset.html I am not 100% sure if it supports the SQL interface, as ibis's native interface is dataframe-based, so that may require some changes.