I am trying to load data based on the base environ...
# questions
g
I am trying to load data based on the base environment, but I am getting an error:
*ValueError*: Duplicate keys found in path\to\conf\dev\catalog.yml and path\to\conf\base\catalog.yml: foo_dataset, bar_dataset, baz_dataset
Here is how I am trying to load the base catalog:
import os
from pathlib import Path
from kedro.config import OmegaConfigLoader
from kedro.io import DataCatalog
os.chdir(Path(__file__).parent.parent)
conf_loader = OmegaConfigLoader(conf_source="conf", env='base')
conf_catalog = conf_loader.get("catalog")
catalog = DataCatalog.from_config(conf_catalog)
I can see the error is a result of duplicate dataset names in the configurations. In the dev environment these are the same datasets but saved to CSV, whereas in base I am writing the data to a database. Is it idiomatic Kedro to just rename these? Or is there a feature I am missing that loads only one environment and not the other?
j
Hi Galen, thank you for your question. Have you checked the Kedro docs below? https://docs.kedro.org/en/stable/configuration/configuration_basics.html#configuration-loading
n
Hey @Galen Seilis, it has been a while. The root cause is that Kedro is still loading 2 environments. In your code you only set `env="base"`, but there is also a `default_run_env=""`, and that is why you get the duplicate keys, I believe. We fixed this recently I think, but I am not sure if it has made it into a release yet.
For now, if you are only using a single environment, you can set `default_run_env="local"` (which is the setting you get inside a Kedro project) or `default_run_env="base"`. Both should work.
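Here is a minimal sketch of that workaround, not a verified fix. It assumes a Kedro version whose `OmegaConfigLoader` accepts both `base_env` and `default_run_env`, and that the project uses the conventional `base` and `local` folder names. Pinning `base_env` as well goes beyond the suggestion above; it is an assumption, based on how the Kedro docs show the loader being used outside a project. The helper names `loader_kwargs` and `load_catalog` are hypothetical.

```python
from typing import Any


def loader_kwargs(conf_source: str = "conf", env: str = "base") -> dict[str, Any]:
    """Arguments for OmegaConfigLoader with both environment names pinned."""
    return {
        "conf_source": conf_source,
        "env": env,
        # Assumption: the project uses Kedro's conventional folder names.
        "base_env": "base",
        "default_run_env": "local",
    }


def load_catalog(conf_source: str = "conf", env: str = "base"):
    """Build a DataCatalog from the catalog config of the requested environment."""
    # Imported here so the helper above can be inspected without Kedro installed.
    from kedro.config import OmegaConfigLoader
    from kedro.io import DataCatalog

    conf_loader = OmegaConfigLoader(**loader_kwargs(conf_source, env))
    conf_catalog = conf_loader["catalog"]
    return DataCatalog.from_config(conf_catalog)
```

From the project root, `catalog = load_catalog("conf", env="base")` would then replace the last three lines of the original snippet. If the duplicate key error persists, it is worth checking which Kedro version is installed.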