# questions
**Abhishek Bhatia:**
Hello! 👋 While defining catalog entries and parameters in Kedro, is there a way to get functionality similar to how `conftest` fixtures behave in `pytest`? i.e. fixtures are visible to all modules at the same level as the `conftest.py` and below. In the same way, I want to have a globals file, and then in one of my nested folders another globals file that overrides the same-named parameters. Let me know if this is a Kedro antipattern 🙂
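For context, the `pytest` behaviour being referenced, as a minimal sketch (file paths and fixture names are illustrative): a fixture defined in a `conftest.py` is visible to tests at that directory level and below, and a nested `conftest.py` can override it by redefining the same name.

```python
# tests/conftest.py -- fixture visible to every test under tests/
import pytest

@pytest.fixture
def greeting():
    return "hello"
```

```python
# tests/nested/conftest.py -- overrides `greeting` for tests under tests/nested/
import pytest

@pytest.fixture
def greeting():
    return "hello from nested"
```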
**marrrcin:**
If I understand correctly, Kedro environments are what you are looking for. By default it's two levels (base + a specific environment), but you can extend this functionality.
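A minimal sketch of how the two-level merge behaves, assuming a recent Kedro with `OmegaConfigLoader`, a standard project layout, and a hypothetical `staging` environment with made-up values:

```python
from kedro.config import OmegaConfigLoader

# Assumed layout (values are hypothetical):
#   conf/base/parameters.yml     ->  learning_rate: 0.01
#   conf/staging/parameters.yml  ->  learning_rate: 0.1
loader = OmegaConfigLoader(
    conf_source="conf", base_env="base", default_run_env="staging"
)
params = loader["parameters"]
print(params["learning_rate"])  # 0.1 -- the staging value wins over base
```

The same selection happens on the CLI with `kedro run --env staging`.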
**Nok Lam Chan:**
Environments are the closest thing I can think of. If you need an arbitrarily deep hierarchy, Kedro doesn't offer that out of the box. I have seen custom integrations with things like Hydra. This is not a priority for us, however, because we think it sacrifices too much readability: it is DRY and reduces duplicated config, but it also makes configuration harder to read and debug, and it gets closer to using a general programming language as configuration.
You may also gain some extra flexibility with OmegaConf resolvers, with which you can register custom logic.
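A sketch of the resolver route, assuming a recent Kedro: `custom_resolvers` is a documented `OmegaConfigLoader` argument, set via `CONFIG_LOADER_ARGS` in `settings.py`; the resolver names below are made up for illustration.

```python
# settings.py
import os

from kedro.config import OmegaConfigLoader

CONFIG_LOADER_CLASS = OmegaConfigLoader
CONFIG_LOADER_ARGS = {
    "custom_resolvers": {
        # usable in YAML as e.g.  batch_size: "${multiply:2,64}"
        "multiply": lambda a, b: a * b,
        # usable as e.g.  bucket: "${env_var:DATA_BUCKET,local-bucket}"
        "env_var": lambda name, default="": os.environ.get(name, default),
    },
}
```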
**Abhishek Bhatia:**
Ah, makes sense. I guess what I am trying to achieve is two-fold:
1. Have a deep hierarchy of parameter configs (with deeper configs overriding the shallower ones)
2. Be able to import config from one YAML file into another, prepending an optional namespace:
```
{import:/path/to/config.yml:<namespace>}
```
Does this make sense as a viable feature down the road? 🤔
@Nok Lam Chan Could OmegaConf resolvers be used to make parameter factories similar to the dataset factory pattern in the catalog?
**Nok Lam Chan:**
OmegaConf resolvers work on values but not on keys, so there may be some limitations. 1 & 2 sound a lot like Hydra; we considered it but ended up choosing only OmegaConf. It's more likely you will need to implement this yourself or find an existing implementation online.
👍 1
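A value-level approximation of the `{import:...}` idea is possible, since OmegaConf resolvers may return whole config objects. A sketch, where the file path and the `training` key are hypothetical and the "namespace" is simply the key the import is assigned to:

```python
from omegaconf import OmegaConf

# Resolver that loads another YAML file and returns it as a config node.
OmegaConf.register_new_resolver("import", lambda path: OmegaConf.load(path))

cfg = OmegaConf.create({"training": "${import:conf/shared/training.yml}"})
print(OmegaConf.to_yaml(cfg, resolve=True))  # imported keys appear under "training"
```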
Can you explain what you mean by a parameter factory? How would you expect it to work, and what would you use it for?
**Abhishek Bhatia:**
I am talking about dataset factories in `catalog`: https://docs.kedro.org/en/stable/data/kedro_dataset_factories.html
```yaml
"{namespace}.{dataset_name}@spark":
  type: spark.SparkDataset
  filepath: data/{namespace}/{dataset_name}.pq
  file_format: parquet

"{dataset_name}@csv":
  type: pandas.CSVDataset
  filepath: data/01_raw/{dataset_name}.csv
```
So, is a similar thing possible in `parameters` as well?
A parameter factory could be used to assign namespaced parameters for multiple runs of a namespaced pipeline.
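For reference, the closest out-of-the-box mechanism is namespacing a modular pipeline, which prefixes its parameter references so each instance resolves its own entries in `parameters.yml`. A sketch, with illustrative node and parameter names:

```python
from kedro.pipeline import node, pipeline

def train(options: dict) -> str:
    return f"trained with {options}"

base = pipeline([node(train, inputs="params:model_options", outputs="model")])

# namespace= prefixes datasets and parameters alike, so parameters.yml
# needs `a.model_options:` and `b.model_options:` entries.
both = pipeline(base, namespace="a") + pipeline(base, namespace="b")
```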
**marrrcin:**
See "2. Adjusting the parameters.yml" in https://getindata.com/blog/kedro-dynamic-pipelines/
**Abhishek Bhatia:**
@marrrcin Yup, we are doing exactly this!
**marrrcin:**
So out of the box, there's probably nothing more than that, and as @Nok Lam Chan said, you would have to come up with your own solution for your specific use case 🙂
**Abhishek Bhatia:**
Yep, looks like that 🙂
**Nok Lam Chan:**
> A parameter factory could be used to assign namespaced parameters for multiple runs of a namespaced pipeline.
https://github.com/kedro-org/kedro/issues/3086 @Abhishek Bhatia If you could give an example of how it would look in `parameters.yml` or `pipeline.py`, that would be great. This has been requested before, but it's still unclear what is needed. Is OmegaConf variable interpolation or a resolver not enough, and how would extending the factory pattern to parameters help?