# questions
p
I have a Config question, can't seem to find the right answer that works with the current Kedro version. We're putting some of our pipeline specific config files into the same folder as the respective nodes.py and pipeline.py files. It's a simple yaml file and the reason is that we have the main /conf/parameters.yml file for global project settings, but say, learning rates for each pipeline are in those separate config files in their subfolders. Is there a best way to setup Kedro such that we can have access to those parameters in the pipelines.py file? E.g. instead of using "params:learning_rate" we would have "local_conf:learning_rate"? Or somehow inject them to "params:"? We generally want to avoid putting those pipeline-specific settings in the global conf folder.
d
p
thanks, will have a look and see if I manage to set it up 🙂
I am still not sure how to make this local config available in the pipeline.py file. I can do this:
config = load_local_pipeline_config(Path(__file__))
which gives me OmegaConfigLoader access. Then I have a pipeline of nodes:
from kedro.pipeline import Pipeline, node, pipeline


def create_pipeline() -> Pipeline:
    """Train GCN model."""
    return pipeline(
        [
            node(
                func=to_pyg_graph,
                inputs=["graph_configured", "config:num_features"],
                outputs="data",
            ),
            # remaining nodes omitted
        ]
    )
obviously "config:num_features" won't work here as Kedro thinks I am trying to invoke a member of the DataCatalog here. Only params works like this, but I'd like to have access like this to the local config.
Is there even a way to override the "params:xxx" access in pipeline.py in Kedro, or am I going down some rabbit hole here 🙂
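One possible workaround, sketched here as an assumption rather than an established Kedro pattern, is to skip the catalog lookup and bind the locally loaded values into the node function with functools.partial before passing it to node(). The to_pyg_graph body and the local_config dict below are hypothetical stand-ins; in the real pipeline.py the dict would come from the pipeline's local YAML file.

```python
from functools import partial

# Hypothetical stand-in for values read from the pipeline's local YAML file.
local_config = {"num_features": 16, "learning_rate": 0.01}


def to_pyg_graph(graph_configured, num_features):
    """Placeholder for the real node function; it only echoes its arguments."""
    return {"graph": graph_configured, "num_features": num_features}


# Bind the local setting up front, so the node only needs catalog inputs.
to_pyg_graph_bound = partial(
    to_pyg_graph, num_features=local_config["num_features"]
)

# In pipeline.py this would become (assuming kedro is installed):
#   node(func=to_pyg_graph_bound, inputs="graph_configured", outputs="data")
result = to_pyg_graph_bound("my_graph")
print(result)
```

The trade-off is that values bound this way are not Kedro parameters, so they cannot be overridden from the CLI or through configuration environments.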
d
so the pattern is to have params:some_config as an input to a node:
• you can then override this with configuration environments
  ◦ kedro run will take the default param from the conf/base structure
  ◦ kedro run --params some_config=10 will take it directly from the CLI
  ◦ kedro run --env prod will take it from conf/prod in the file structure
👍 1
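The precedence described above can be sketched in plain Python. This only illustrates the lookup order (conf/base, then the selected environment, then CLI overrides) and is not Kedro's actual implementation; resolve_params and the sample values are made up for the example.

```python
def resolve_params(base, env=None, cli=None):
    """Return parameters with later sources overriding earlier ones."""
    resolved = dict(base)        # defaults, e.g. from conf/base
    resolved.update(env or {})   # environment overrides, e.g. conf/prod
    resolved.update(cli or {})   # runtime overrides from the command line
    return resolved


base = {"some_config": 1, "learning_rate": 0.01}
prod = {"some_config": 5}

default_run = resolve_params(base)
prod_run = resolve_params(base, env=prod)
cli_run = resolve_params(base, env=prod, cli={"some_config": 10})

print(default_run, prod_run, cli_run)
```

Keys that an override does not mention, such as learning_rate here, keep their base value.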
p
so last question based on your answer if I may? if I have conf/base/parameters.yml and I have those pipeline-specific ones in src/package/pipelines/pipeline_foobar/local_config.yml, what's the best way to configure Kedro so that it merges these yaml files when doing kedro run? I could then just access everything via "params:" as designed 🙂
d
your conf needs to live in the conf/ folder
so you can do something like conf/alt_1, conf/alt_2, conf/alt_3 and kedro run --env alt_1
I'm also assuming you want to share the same keys and override them based on the context
if you just want to split config into multiple files to maintain it better, you can just do parameters_a.yml, parameters_b.yml etc. and they'll be merged at runtime
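As a rough picture of what "merged at runtime" means: each parameters file in an environment contributes its top-level keys to one shared dictionary. The sketch below mimics that with plain dicts standing in for parsed YAML files. merge_parameter_files is a made-up helper, and the duplicate-key error rests on the assumption that Kedro rejects the same top-level key appearing in two files of one environment.

```python
def merge_parameter_files(files):
    """Merge {filename: params_dict} into one dict, rejecting duplicate keys."""
    merged = {}
    origin = {}  # remembers which file each key came from, for error messages
    for filename, params in sorted(files.items()):
        for key, value in params.items():
            if key in merged:
                raise ValueError(
                    f"Duplicate key '{key}' in {origin[key]} and {filename}"
                )
            merged[key] = value
            origin[key] = filename
    return merged


files = {
    "parameters_a.yml": {"learning_rate": 0.01},
    "parameters_b.yml": {"num_features": 16},
}
params = merge_parameter_files(files)
print(params)
```

Because all files end up in one namespace, prefixing keys per pipeline is a simple way to avoid collisions.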
p
yes we are trying to split our parameters because we have a big project and it becomes hard to follow
d
so they share global scope
but they can live separately
👍 1
p
but the team asked me to put those local confs with the rest of the pipeline code
d
yeah that's not possible
conf and src are separate by design
following the principles of https://12factor.net/config
👍 1
m
I would not recommend it, but if you really really really want to have the confs in the src folder, you can write your own custom config loader. But like datajoely said, we recommend putting config in the conf folder. You can either create separate config environments, or you can put them in parameters_[pipeline_name].yaml
👍 1
p
I put everything in the conf folder in the end. If we stick to Kedro, we stick to Kedro 😉 Thanks for the support.
kedroid 1