Abhishek Bhatia
07/17/2024, 7:24 PM
Kedro version: 0.19.6
config loader: OmegaConfigLoader
In conf/base/catalog, globals.yml has the following:
    _base_path: gs://dev-bucket-shared
datasets.yml has the following:
    my_dataset:
      type: pandas.CSVDataset
      filepath: "${_base_path}/03_primary/my_dataset.csv"
In conf/local/catalog, globals.yml has the following:
    _base_path: gs://dev-bucket-users/my_name
• There is nothing else I put under conf/local
• The local conf should override the base conf when kedro run --env local --pipeline my_pipeline is run, but it does not. It is still using what is present in conf/base/catalog
Seems like something really basic I am doing wrong, any help is appreciated! 🙂
Nok Lam Chan
07/17/2024, 8:02 PM
Nok Lam Chan
07/17/2024, 8:03 PM
Abhishek Bhatia
07/17/2024, 8:41 PM
catalog/globals.yml across environments, I can't do it unless I move everything to conf/base/globals.yml
2. If I do keep my catalog specific globals in catalog/globals.yml then I have to make the entries start with _
3. In "normal" globals.yml, I can't prefix them by _
4. I reference globals as ${globals:some_key}
Nok Lam Chan
07/18/2024, 10:56 AM
globals.yml in catalog/globals.yml. That was created before the ${globals} was implemented, so there is some overlapping terminology.
Nok Lam Chan
07/18/2024, 10:58 AM
globals.yml, this is the only thing that is global in the config loader, and you can call it with ${globals:} in any type of configuration, e.g. parameters, catalog etc.
    self.config_patterns = {
        "catalog": ["catalog*", "catalog*/**", "**/catalog*"],
        "parameters": ["parameters*", "parameters*/**", "**/parameters*"],
        "credentials": ["credentials*", "credentials*/**", "**/credentials*"],
        "globals": ["globals.yml"],
    }
Nok Lam Chan
07/18/2024, 11:00 AM
catalog_globals.yml or parameters_globals.yml, they are nothing but templating; you cannot refer to them as ${globals} in your configuration. In fact, if you copy all the content of catalog_globals.yml -> catalog.yml, it is identical. This is more of a convention to separate out the template values into a separate file, but it is not mandatory.
Abhishek Bhatia
07/18/2024, 11:02 AM
catalog/globals.yml (technically not globals, now I understand), which must start with _, were not getting overridden in local vs base, so I switched to pure globals
Nok Lam Chan
07/18/2024, 11:02 AM
_ prefix can always be used (correct me if I am wrong), the reason that you want to prefix with _ is:
• Convention, it's clear that the value is meant to be used as a template value rather than actual configuration
• for catalog.yml, it's mandatory (not for parameters). This has to do with Kedro's DataCatalog validation. By using _ it bypasses the process, and we know that the value is only a template value, e.g. _base_bucket = my_s3_bucket, rather than an invalid dataset entry.
Abhishek Bhatia
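07/18/2024, 11:03 AM
_base_bucket is different in base env vs local env (user specific bucket), but was not getting overridden
Abhishek Bhatia
07/18/2024, 11:05 AM
So maybe the resolution order of catalog entries is different?
In base, all catalog entries get resolved, then they get overridden by whatever is in local. Since I don't explicitly override catalog entries, rather just the _base_path, so the catalog entries in base remain same as local (which is not the intended outcome)
Abhishek Bhatia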
07/18/2024, 11:05 AM
Abhishek Bhatia
07/18/2024, 11:06 AM
_ can not be used in globals I think
Nok Lam Chan
07/18/2024, 11:10 AM
Abhishek Bhatia
07/18/2024, 11:10 AM
Nok Lam Chan
07/18/2024, 11:12 AM
"So maybe the resolution order of catalog entries is different?
In base, all catalog entries get resolved, then they get overriden by whatever is in local. Since I don't explicitly override catalog entries rather just the _base_path so the catalog entries in base remain same as local (which is not the intended outcome)"
This is true, and it has always been the case for Kedro configuration environments. Config is resolved within its own environment (otherwise there's no point of having a separate environment). It will get overridden with the local environment, but without resolution (a dictionary merge basically)
Nok Lam Chan
07/18/2024, 11:13 AM
"_ can not be used in globals I think"
Let me double check on this
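The "resolved within its own environment, then a dictionary merge" behaviour described a little earlier can be sketched in plain Python. This is a simplified model, not Kedro's implementation: resolve is a hypothetical helper that only handles ${key} placeholders pointing at top-level keys of the same dictionary, and the dictionaries mirror the base and local files from the original question.

```python
import re

# Matches simple ${key} placeholders (hypothetical, much simpler than OmegaConf).
PLACEHOLDER = re.compile(r"\$\{(\w+)\}")


def resolve(config: dict) -> dict:
    """Fill ${key} placeholders using top-level keys of the same environment."""

    def fill(value):
        if isinstance(value, dict):
            return {k: fill(v) for k, v in value.items()}
        if isinstance(value, str):
            return PLACEHOLDER.sub(lambda m: str(config[m.group(1)]), value)
        return value

    return fill(config)


# conf/base/catalog: template value plus the dataset entry that uses it.
base = {
    "_base_path": "gs://dev-bucket-shared",
    "my_dataset": {
        "type": "pandas.CSVDataset",
        "filepath": "${_base_path}/03_primary/my_dataset.csv",
    },
}
# conf/local/catalog: only the template value is overridden.
local = {"_base_path": "gs://dev-bucket-users/my_name"}

# Each environment is resolved on its own, then the results are merged.
merged = {**resolve(base), **resolve(local)}
print(merged["my_dataset"]["filepath"])
```

In this model the filepath is already a plain string by the time local is merged in, so overriding only _base_path in local leaves my_dataset pointing at the shared bucket.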
Abhishek Bhatia
07/18/2024, 11:14 AM
base_path in globals.yml then, as opposed to using it as a templating variable _base_path
Then catalog entries get correctly overridden
Abhishek Bhatia
07/18/2024, 11:14 AM
Nok Lam Chan
07/18/2024, 12:27 PM
Nok Lam Chan
07/18/2024, 4:37 PM
_ is banned for $globals? I looked up the PR but couldn't find the explanation, I think we should at least mention it in the docs.
Ankita Katiyar
07/18/2024, 4:41 PM
_ after resolution: https://github.com/kedro-org/kedro/blob/e2b20a49159d62680d0131d1338f06d84b340a44/kedro/config/omegaconf_config.py#L343-L350
Nok Lam Chan
07/18/2024, 4:48 PM
${globals: _some_config}
Ankita Katiyar
07/18/2024, 5:16 PM
_some_config key will be lost. Globals is loaded first so that it is resolved across environments, and then when parameters are loaded it fills in keys from config_loader[globals], and _some_config doesn't exist at that point
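A minimal sketch of that sequence, assuming a much simplified model: globals from each environment are loaded first with _-prefixed keys dropped, merged with local over base, and only then used to fill ${globals:...} references. The helper names (load_env_globals, fill_from_globals) and the sample values are hypothetical, and the KeyError is just how this sketch signals a missing key, not necessarily what Kedro raises.

```python
import re

# Matches ${globals:key} references, allowing a space after the colon.
GLOBALS_REF = re.compile(r"\$\{globals:\s*(\w+)\}")


def load_env_globals(raw: dict) -> dict:
    """Keep only keys that do not start with '_' (assumed filtering step)."""
    return {k: v for k, v in raw.items() if not k.startswith("_")}


def fill_from_globals(value, globals_config: dict):
    """Replace ${globals:key} references with values from the merged globals."""
    if isinstance(value, dict):
        return {k: fill_from_globals(v, globals_config) for k, v in value.items()}
    if isinstance(value, str):
        return GLOBALS_REF.sub(lambda m: str(globals_config[m.group(1)]), value)
    return value


base_globals = {"base_path": "gs://dev-bucket-shared", "_some_config": "dropped"}
local_globals = {"base_path": "gs://dev-bucket-users/my_name"}

# Globals are merged across environments before any catalog entry is filled in.
globals_config = {**load_env_globals(base_globals), **load_env_globals(local_globals)}

catalog = {
    "my_dataset": {
        "type": "pandas.CSVDataset",
        "filepath": "${globals:base_path}/03_primary/my_dataset.csv",
    }
}
resolved = fill_from_globals(catalog, globals_config)
print(resolved["my_dataset"]["filepath"])

try:
    fill_from_globals("${globals: _some_config}", globals_config)
except KeyError as missing:
    print("not found in globals:", missing)
```

Because the local value of base_path replaces the base one before the catalog is filled in, the dataset path follows the environment, while a key such as _some_config is no longer there to look up.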