# questions
e
Hi kedroids! Sorry for the noob question. I’m working with a SQL database as my data source, and pandas.SQLQueryDataSet works well:
sample_sql_query_data:
  type: pandas.SQLQueryDataSet
  credentials: postgres_re_db
  sql: SELECT * FROM rr_norm.sample_gov_torgi
Unfortunately, the number of queries is growing fast and catalog.yaml is getting bloated with long query strings. Keeping SQL query strings inside catalog.yaml also doesn't seem like a good idea for reproducibility. What would be the most kedroic/pythonic way to move the queries out of catalog.yaml into a separate folder/module? AFAIK (or as far as I understood from googling), YAML doesn’t natively have include/import features?
b
You can simply point the dataset to a file with the query, using the `filepath` argument instead of `sql` (see the docs: https://kedro.readthedocs.io/en/stable/kedro.extras.datasets.pandas.SQLQueryDataSet.html)
👍 1
These can be kept in the `data` folder, or in any other folder in the project that you create (e.g., `sql/`).
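For illustration, the catalog entry from the question might then look roughly like this. This is a minimal sketch: the file name `sql/sample_gov_torgi.sql` is a hypothetical path, assumed to contain the original `SELECT * FROM rr_norm.sample_gov_torgi` statement.

```yaml
sample_sql_query_data:
  type: pandas.SQLQueryDataSet
  credentials: postgres_re_db
  # Hypothetical path to a file holding the query text; replaces the inline `sql` key.
  filepath: sql/sample_gov_torgi.sql
```

Note that `sql` and `filepath` are alternatives, so the entry should set only one of them.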
👍 1
e
Omg, thank you so much! And that's a great reminder to read the docs till the very end :)) The filepath parameter IS (and always was) there!
🥳 1