Hi, Need help with the below data catalog ```sql_...
# questions
a
Hi, I need help with the data catalog below:
sql_table:
  type: pandas.SQLTableDataset
  table_name: RD
  load_args:
    schema: RD.dev

sql_table:
  type: pandas.SQLTableDataset
  table_name: RD
  load_args:
    schema: RD.prd
Basically, I want to be able to parameterize the schema in parameters.yml:
schema: "dev" # prd
I tried updating my data catalog as below:
sql_table:
  type: pandas.SQLTableDataset
  table_name: RD
  load_args:
    schema: RD.${params.schema}
I get the error below, but unfortunately I haven't been able to debug it. I'd appreciate any advice on this. Thanks!
InterpolationKeyError: Interpolation key 'params.schema' not found
    full_key: sql_table.load_args.schema
    object_type=dict
c
Does this require defining a custom OmegaConf resolver in settings.py?
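A custom resolver might not be needed. The catalog is loaded separately from the parameters, which is probably why a key like params.schema is not found. If the project is on a Kedro version whose OmegaConfigLoader supports global variables (an assumption, since the thread does not state the version), something like the sketch below may work. The file conf/base/globals.yml and the key name schema are assumptions for illustration, not taken from the original project.

```yaml
# conf/base/globals.yml (assumed file; holds values shared across config files)
schema: dev  # switch to "prd" to read production data

---
# conf/base/catalog.yml (same dataset as above, reading the schema from globals)
sql_table:
  type: pandas.SQLTableDataset
  table_name: RD
  load_args:
    # ${globals:schema} uses the globals resolver instead of ${params.schema}
    schema: RD.${globals:schema}
```

With this layout the schema lives in one place, and switching between dev and prd is a one-line change in globals.yml rather than in parameters.yml.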
a
Here's my settings.py. Essentially, I've uncommented the "parameters" entry in config_patterns so that OmegaConf picks up the parameters files:
CONFIG_LOADER_ARGS = {
    "base_env": "base",
    "default_run_env": "local",
    "config_patterns": {
        "spark": ["spark*/"],
        "parameters": ["parameters*", "parameters*/**", "**/parameters*"],
    },
}
c
Maybe you want a dev environment and a prd environment. The majority of the dataset definitions would be the same, but the dev and prd catalogs would differ in the schema key of each dataset's load_args.
a
I understand. Our current setup makes it quite flexible to switch between dev and prd within the same codebase, mainly because there are situations where we need to develop using prd data as input. Not the best practice, yes.
c
I think you can do what you're looking for using Kedro's additional configuration environments. By default there's a conf/base/catalog.yml and a conf/local/catalog.yml. If you create conf/dev/catalog.yml and conf/prd/catalog.yml where the schemas/filepaths are different, you can do kedro run --env prd to run using the production data.
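As a rough sketch of that layout, reusing the sql_table entry from the question, the two environment catalogs could look like this. Only the schema value differs between them; any other settings the dataset needs (credentials, for example) are omitted here, as they are in the question.

```yaml
# conf/dev/catalog.yml (used with: kedro run --env dev)
sql_table:
  type: pandas.SQLTableDataset
  table_name: RD
  load_args:
    schema: RD.dev

---
# conf/prd/catalog.yml (used with: kedro run --env prd)
sql_table:
  type: pandas.SQLTableDataset
  table_name: RD
  load_args:
    schema: RD.prd
```

Dataset definitions that are identical in both environments can stay in conf/base/catalog.yml, so each environment folder only needs the entries that actually differ.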