Fazil B. Topal
08/16/2023, 6:25 PM
_pandas:
  type: pandas.CSVDataSet
  someconfhere: ....
companies:
  type: ${_pandas.type}
  filepath: data/01_raw/companies.csv
conf/staging:
_pandas:
  type: pandas.CSVDataSet
  stagingConfHere: ....
We want to define the generic catalog in base and only the environment-specific parts in the other confs. However, this doesn't work at the moment: we have to define the whole companies entry again. I believe it's due to this logic here: https://github.com/kedro-org/kedro/blob/main/kedro/config/omegaconf_config.py#L297
When we load the base catalog, we resolve the values, and then we read the next env and resolve it again. Our example above only works if resolving is done last, after merging. I am not sure if it was a design choice or not. I thought about overriding some functions, but load_and_merge is a big one and I'd have to override __getitem__ as well, though that can be done easily, I think.
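To make the ordering issue concrete, here is a minimal sketch in plain Python. It is not Kedro or OmegaConf code: `resolve` and `merge` are hypothetical toy helpers, the merge only replaces top-level keys, and the staging value `pandas.ParquetDataSet` is an assumption chosen only so the two orderings give visibly different results.

```python
import re
from copy import deepcopy

# Matches "${some.dotted.key}" style interpolations (toy version only).
_PATTERN = re.compile(r"\$\{([^}]+)\}")


def lookup(root, dotted):
    """Walk a dotted key such as '_pandas.type' through nested dicts."""
    node = root
    for part in dotted.split("."):
        node = node[part]
    return node


def resolve(node, root=None):
    """Return a copy of node with each ${a.b} replaced by root['a']['b'].

    Chained interpolations are not supported in this toy version.
    """
    root = node if root is None else root
    if isinstance(node, dict):
        return {key: resolve(value, root) for key, value in node.items()}
    if isinstance(node, str):
        return _PATTERN.sub(lambda m: str(lookup(root, m.group(1))), node)
    return node


def merge(base, override):
    """Top-level keys in override replace the same keys in base."""
    merged = deepcopy(base)
    merged.update(deepcopy(override))
    return merged


base = {
    "_pandas": {"type": "pandas.CSVDataSet"},
    "companies": {
        "type": "${_pandas.type}",
        "filepath": "data/01_raw/companies.csv",
    },
}
# Hypothetical staging override, different from base on purpose.
staging = {"_pandas": {"type": "pandas.ParquetDataSet"}}

# Resolve each environment on its own, then merge.
resolve_then_merge = merge(resolve(base), resolve(staging))
# Merge first, resolve once at the end.
merge_then_resolve = resolve(merge(base, staging))

print(resolve_then_merge["companies"]["type"])
print(merge_then_resolve["companies"]["type"])
```

With resolve-then-merge, companies keeps the base value because its interpolation was already replaced before staging was read. With merge-then-resolve, it picks up the staging override, which is the behaviour we were hoping for.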
Thanks in advance!
Yolan Honoré-Rougé
08/16/2023, 8:26 PM
Ankita Katiyar
08/17/2023, 9:25 AM
[...] OmegaConfigLoader. We don't necessarily allow for variable interpolation across environments. [...] conf/base/globals.yml or conf/staging/globals.yml, which would be loaded and resolved first (duplicate keys in staging would overwrite base in globals). And then you could inject this into your catalog like:
companies:
  type: "${globals:_pandas.type}"
  filepath: data/01_raw/companies.csv
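The globals flow described above can be sketched roughly as follows. This is a plain-Python toy, not the actual Kedro implementation: `merge_globals` and `inject_globals` are hypothetical helper names, and the staging value `pandas.ParquetDataSet` is an assumption used only to show the override taking effect.

```python
import re

# Matches "${globals:some.dotted.key}" (toy version of the globals lookup).
GLOBALS_PATTERN = re.compile(r"\$\{globals:([^}]+)\}")


def merge_globals(base_globals, env_globals):
    """Merge globals first; duplicate top-level keys in the env win."""
    merged = dict(base_globals)
    merged.update(env_globals)
    return merged


def inject_globals(node, globals_):
    """Replace each ${globals:a.b} in node with globals_['a']['b']."""
    if isinstance(node, dict):
        return {key: inject_globals(value, globals_) for key, value in node.items()}
    if isinstance(node, str):
        def replace(match):
            value = globals_
            for part in match.group(1).split("."):
                value = value[part]
            return str(value)
        return GLOBALS_PATTERN.sub(replace, node)
    return node


# Stand-ins for conf/base/globals.yml and conf/staging/globals.yml.
base_globals = {"_pandas": {"type": "pandas.CSVDataSet"}}
staging_globals = {"_pandas": {"type": "pandas.ParquetDataSet"}}  # hypothetical

# Stand-in for the catalog entry, defined once in base.
catalog = {
    "companies": {
        "type": "${globals:_pandas.type}",
        "filepath": "data/01_raw/companies.csv",
    }
}

resolved = inject_globals(catalog, merge_globals(base_globals, staging_globals))
print(resolved["companies"]["type"])
```

Because the globals are merged before the catalog is touched, the catalog entry only has to be written once and still sees the staging value.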
Fazil B. Topal
08/17/2023, 9:32 AM
Ankita Katiyar
08/17/2023, 9:37 AM
[...] OmegaConfigLoader with overridden __getitem__ and load_and_merge_dir, where the resolution for variables happens outside the load_and_merge_dir method. I can help if you'd like.
Fazil B. Topal
08/17/2023, 9:40 AM
Yolan Honoré-Rougé
08/17/2023, 10:09 AM