Hi, let's say I have this simple `SQLQueryDataset...
# questions
a
Hi, let's say I have this simple
SQLQueryDataset
Copy code
table_output:
  type: pandas.SQLQueryDataset
  sql: >
       select * from dev.output
I also have
parameters.yml
Copy code
mode: 'ABC'
The expectation now is to update
table_output
to filter itself based on parameter
mode
If I use the following hard coded load_args method, the
table_output
will be filtered based on the
'ABC'
params
Copy code
table_output:
  type: pandas.SQLQueryDataset
  sql: >
       select * from dev.output
       where mode = ?
  load_args:
     params:
       - 'ABC'
How can I reuse the parameters defined in
parameters.yml
as part of the
table_output
definition? Something like
Copy code
table_output:
  type: pandas.SQLQueryDataset
  sql: >
       select * from dev.output
       where mode = ?
  load_args:
     params:
       - ${parameters.mode} # to reference parameters in parameters.yml?
h
Someone will reply to you shortly. In the meantime, this might help:
a
@Ankita Katiyar thanks! globals.yml seems to be what I need. I'm just curious as to why it's not used as the default place to hold parameters definition, given that it's accessible across all config files, which I think very useful in many cases. The only thing now is that I have to redefine my parameters in parameters.yml so that it resolves from globals.yml, which seems to be like a double definition.
n
It's mostly for separation.
parameters.yml
are meant to be node argument, something that get passed into your node function. Where the
parameters.mode
here are acting more like a template variable. You could also move this
mode
to
catalog.yml
Copy code
table_output:
  type: pandas.SQLQueryDataset
  sql: >
       select * from dev.output
       where mode = ${_mode}
  load_args:
     params:
       - ${_mode} # to reference parameters in parameters.yml?

_mode: "some_mode"
`````` For things that are absolutely global (needed for catalog/parameters and other stuff), then you should use
globals
(the anti-pattern here is use
globals.yml
for everything)
K 1
m
is it possible to do something similar with a namespaced dataset?
{mycountry}.table_output:
type: pandas.SQLQueryDataset
sql: >
select * from mytable.output
where country = ${mycountry}
load_args:
params:
- ${{mycountry}.name}
This can be done with factories, though I think it is built for a slightly different purpose.
m
Yes you are right. You can use the namedspace values e.g. {mycountry} directly in the params or filepath etc. but you can't use it as a key value. For example if in params you had
France:
short_name: FR
you can't get the "FR" in the catalog.