# questions
m
Hi kedro! I’ve been trying to update parameters defined in a base file for each environment. My config looks like the following, and I would like the base parameter values to be overridden by the environment-specific values whenever they are defined in another environment:
```
conf
├── README.md
├── base
│   ├── catalog.yml
│   ├── credentials.yml
│   ├── logging.yml
│   └── parameters.yml
├── local
│   ├── credentials.yml
│   └── parameters.yml
└── prod
    ├── catalog.yml
    └── parameters.yml
```
`conf/base/parameters.yml`:
```yaml
env: local
random_state: 3
target_column: y

data_processing:
  target_column: ${target_column}
  random_state: ${random_state}

training:
  train_fraction: 0.8
  random_state: ${random_state}
  target_column: ${target_column}
  env: ${env}

evaluation:
  env: ${env}
This is what my first attempt at the hook looks like:
```python
from kedro.framework.context import KedroContext
from kedro.framework.hooks import hook_impl
from omegaconf import OmegaConf

class MyCustomHook:
    @hook_impl
    def after_context_created(self, context: KedroContext):
        # Load base parameters
        base_params = context.params

        # Determine current environment
        env = context.env

        # Load environment-specific parameters
        env_params_path = f"conf/{env}/parameters.yml"
        env_params = OmegaConf.load(env_params_path)

        # Merge parameters with priority to environment-specific params
        merged_params = OmegaConf.merge(base_params, env_params)

        # Ensure environment variable interpolation
        merged_params_with_env = OmegaConf.create({"env": env, "params": merged_params})
        OmegaConf.resolve(merged_params_with_env)

        # Update context params
        context._params = merged_params_with_env.params
```
However, the `env` parameter is only updated at the top level. I want `env` to be updated and resolved in all nested paths too. Alternatively, maybe I don’t need to nest the parameters at all, so that all parameters would be available to each pipeline.
m
Why do you want to pass the Kedro env to nodes / params?
m
The core issue is that I have a training and a prediction pipeline that need to be executed at different intervals and save a model to S3. I want to pass in `env` when saving the model, to separate the paths so the model saved in one environment doesn’t override the other. I think this could still work by passing `KEDRO_ENV` instead, but I wanted to make things more explicit.
m
🤔 I’m not sure I fully get it, but it seems like you could just have a separate data catalog entry for models.
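For context on that suggestion: Kedro resolves the catalog per environment, so an entry in `conf/prod/catalog.yml` overrides the same-named entry from `conf/base/`, and no `env` parameter needs to reach the nodes at all. A sketch, where the dataset name, bucket, and paths are illustrative assumptions:

`conf/base/catalog.yml` (used by `local` runs):
```yaml
trained_model:
  type: pickle.PickleDataset
  filepath: data/06_models/model.pkl
```

`conf/prod/catalog.yml` (overrides the same entry for `prod` runs):
```yaml
trained_model:
  type: pickle.PickleDataset
  filepath: s3://my-bucket/prod/models/model.pkl
```

Running with `kedro run --env=prod` would then save to the S3 path, while a plain local run keeps writing to the local path, with no hook required.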