Lodewic van Twillert
09/03/2023, 6:36 PM
Kedro 0.18.13 now lets us use globals and templating variables with `OmegaConfigLoader`
But... I have trouble getting it to work with configuration for multiple environments.
Would you expect this simplified example to work??
I have a catalog.yml using templating variables, and I want to change the template variables for different environments. Hopefully then I don't need to re-define entries.
# base/catalog.yml
primary_input_1:
  type: pandas.CSVDataSet
  filepath: ${_storage.prefix}/data/primary/input_1.csv
  credentials: ${_storage.credentials}
Base template for _storage
# base/catalog_templating.yml
_storage:
  prefix: "base"
  credentials: null
Local template for _storage
# local/catalog_templating.yml
_storage:
  prefix: "data" # <-- different here
  credentials: null
Template for aws, as an example
# aws/catalog_templating.yml
_storage:
  prefix: "s3://my_data_bucket/data"
  credentials: dev_s3
When testing, the `_storage` variable does not get overwritten when changing environments. I expect the dataset in the `base` environment to have a different filepath than in the `local` environment. But... nothing changes in the `local` env 😞
base
%reload_kedro ../ --env=base
catalog.datasets.primary_input_1._describe()
>>>
{
    'filepath': PurePosixPath('./test-omega-templating/base/data/primary/input_1.csv'),
    'protocol': 'file',
    'load_args': {},
    'save_args': {'index': False},
    'version': None
}
local
%reload_kedro ../ --env=local
catalog.datasets.primary_input_1._describe()
>>>
{
    'filepath': PurePosixPath('./test-omega-templating/base/data/primary/input_1.csv'),
    'protocol': 'file',
    'load_args': {},
    'save_args': {'index': False},
    'version': None
}
^ I expected 'filepath': ./test-omega-templating/data/data/primary/input_1.csv
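One possible reading of this result, sketched in plain Python rather than with Kedro itself: assume the loader merges and resolves the files of each environment on its own, drops top-level keys that start with `_`, and only then lets the run environment override the base environment key by key. Under that assumption the `local` `_storage` never meets the `base` catalog entry. The helper names (`lookup`, `resolve`, `load_env`) are hypothetical and are not Kedro API.

```python
import re

# Matches plain ${some.dotted.key} placeholders (no resolver prefix).
PLACEHOLDER = re.compile(r"\$\{([^}:]+)\}")


def lookup(tree, dotted_key):
    """Walk a nested dict along a dotted key such as '_storage.prefix'."""
    node = tree
    for part in dotted_key.split("."):
        node = node[part]
    return node


def resolve(value, root):
    """Recursively replace ${...} placeholders using values from `root`."""
    if isinstance(value, dict):
        return {k: resolve(v, root) for k, v in value.items()}
    if isinstance(value, str):
        whole = PLACEHOLDER.fullmatch(value)
        if whole:  # the whole value is one placeholder: keep its type (e.g. None)
            return resolve(lookup(root, whole.group(1)), root)
        return PLACEHOLDER.sub(
            lambda m: str(resolve(lookup(root, m.group(1)), root)), value
        )
    return value


def load_env(files):
    """Assumed behaviour: merge one environment's files, resolve, drop '_' keys."""
    merged = {}
    for content in files:
        merged.update(content)
    resolved = resolve(merged, merged)
    return {k: v for k, v in resolved.items() if not k.startswith("_")}


base_files = [
    {  # base/catalog.yml
        "primary_input_1": {
            "type": "pandas.CSVDataSet",
            "filepath": "${_storage.prefix}/data/primary/input_1.csv",
            "credentials": "${_storage.credentials}",
        }
    },
    {"_storage": {"prefix": "base", "credentials": None}},  # base/catalog_templating.yml
]
local_files = [
    {"_storage": {"prefix": "data", "credentials": None}},  # local/catalog_templating.yml
]

# The run environment only overrides already-resolved top-level keys.
catalog = load_env(base_files)
catalog.update(load_env(local_files))  # local contributes nothing after '_' keys are dropped

print(catalog["primary_input_1"]["filepath"])
```

In this sketch the printed path still starts with `base`, which matches what `_describe()` shows above. Whether Kedro 0.18.13 does exactly this internally is an assumption here, not something confirmed in this thread.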
Ankita Katiyar
09/03/2023, 6:42 PM
globals.yml
Lodewic van Twillert
09/03/2023, 7:07 PM
I didn't use `globals` because I wouldn't actually re-use these values for both params and catalog. The docs say: "The benefit of using globals over regular variable interpolation is that the global variables are shared across different configuration types, such as catalog and parameters." I didn't want that benefit - but it does suggest that 'regular variable interpolation' would do the same thing. And it would make my template variables a little shorter in syntax 🤓
Makes sense to use globals though, given how they are loaded. I will try that next, thanks for the explanation 👍
• globals.yml - The only file that changes between each environment
• catalog.yml - I don't want to re-define this in any environment, only once in base
• catalog_templating.yml - to capture the globals and turn them into composite parameters used throughout the catalog, and to reduce syntax length in the catalog.yml
base
#globals.yml
storage:
  prefix: base
  credentials: null
folders:
  raw: 01_raw
  intermediate: 02_intermediate
  primary: 03_primary
  feature: 04_feature
  model_input: 05_model_input
  models: 06_models
  model_output: 07_model_output
  reporting: 08_reporting
#catalog.yml
primary_input_1:
  type: pandas.CSVDataSet
  filepath: ${_folders.primary}/input_1.csv
  credentials: ${_credentials}
#catalog_templating.yml
_folders:
  raw: ${globals:storage.prefix}/${globals:folders.raw}
  intermediate: ${globals:storage.prefix}/${globals:folders.intermediate}
  primary: ${globals:storage.prefix}/${globals:folders.primary}
  feature: ${globals:storage.prefix}/${globals:folders.feature}
  model_input: ${globals:storage.prefix}/${globals:folders.model_input}
  models: ${globals:storage.prefix}/${globals:folders.models}
  model_output: ${globals:storage.prefix}/${globals:folders.model_output}
  reporting: ${globals:storage.prefix}/${globals:folders.reporting}
_credentials: ${globals:storage.credentials}
local
#globals.yml
storage:
  prefix: data
  credentials: null
aws
#globals.yml
storage:
  prefix: "s3://my_data_bucket/data"
  credentials: dev_s3
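To see what this layout should resolve to per environment, here is a rough plain-Python sketch. It assumes that globals from the run environment replace the base globals key by key at the top level before anything is resolved, and that `${globals:...}` reads from that merged result. The helpers (`lookup`, `resolve`, `catalog_for`) are hypothetical stand-ins, not Kedro functions, and only two of the folders are included to keep it short.

```python
import re

# Matches ${key} and ${globals:key}; group 1 is set only for the globals form.
TOKEN = re.compile(r"\$\{(globals:)?([^}]+)\}")


def lookup(tree, dotted_key):
    """Walk a nested dict along a dotted key such as 'storage.prefix'."""
    node = tree
    for part in dotted_key.split("."):
        node = node[part]
    return node


def resolve(value, root, global_values):
    """Replace ${...} from `root` and ${globals:...} from `global_values`."""
    if isinstance(value, dict):
        return {k: resolve(v, root, global_values) for k, v in value.items()}
    if not isinstance(value, str):
        return value

    def expand(match):
        source = global_values if match.group(1) else root
        return resolve(lookup(source, match.group(2)), root, global_values)

    whole = TOKEN.fullmatch(value)
    if whole:  # single placeholder: keep the referenced type (e.g. None)
        return expand(whole)
    return TOKEN.sub(lambda m: str(expand(m)), value)


# base/globals.yml (shortened) and the per-environment overrides from the thread.
base_globals = {
    "storage": {"prefix": "base", "credentials": None},
    "folders": {"raw": "01_raw", "primary": "03_primary"},
}
env_globals = {
    "base": {},
    "local": {"storage": {"prefix": "data", "credentials": None}},
    "aws": {"storage": {"prefix": "s3://my_data_bucket/data", "credentials": "dev_s3"}},
}

# base/catalog.yml and base/catalog_templating.yml, defined once.
base_catalog = {
    "primary_input_1": {
        "type": "pandas.CSVDataSet",
        "filepath": "${_folders.primary}/input_1.csv",
        "credentials": "${_credentials}",
    },
    "_folders": {"primary": "${globals:storage.prefix}/${globals:folders.primary}"},
    "_credentials": "${globals:storage.credentials}",
}


def catalog_for(env):
    """Assumed flow: merge globals across environments first, then resolve the catalog."""
    merged_globals = dict(base_globals)
    merged_globals.update(env_globals[env])  # top-level override, an assumption
    resolved = resolve(base_catalog, base_catalog, merged_globals)
    return {k: v for k, v in resolved.items() if not k.startswith("_")}


for env in ("base", "local", "aws"):
    entry = catalog_for(env)["primary_input_1"]
    print(env, entry["filepath"], entry["credentials"])
```

Under these assumptions only `globals.yml` differs per environment, while `catalog.yml` and `catalog_templating.yml` stay in `base`, which is the split described in the list above.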
Ankita Katiyar
09/04/2023, 3:13 PM