# questions
j
in other news, I'm having trouble passing `dtypes` to the upcoming `polars.CSVDataSet`, not sure if there's a way to specify non-primitive types in the catalog YAML? https://github.com/kedro-org/kedro-plugins/issues/124
d
Unfortunately no, though we can now introduce an OmegaConf resolver for this
In pandas most of them map to strings so it’s not a problem
j
yeap, that's what I imagined
do you have any pointers on how we could do this in OmegaConf by any chance?
d
it’s an engineering question; in truth we’re not sure how to approach this yet as it touches wider questions
we’ve held off enabling ‘unsafe pyyaml’ mode for security reasons
👀 1
j
yeah I meant "quick & dirty workaround" 😄 found https://omegaconf.readthedocs.io/en/2.3_branch/usage.html#resolvers
d
quick and dirty is you subclassing the polars class and handling it yourself
🙉 1
but there is a nice resolver out there where `pl.xxxx` is resolved in a safe way
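A minimal sketch of what resolving `pl.xxxx` "in a safe way" could look like, assuming an allowlist approach. The helper names (`make_safe_attr_resolver`, `register_polars_resolver`), the resolver name `polars`, and the allowlist itself are illustrative assumptions, not something Kedro or the thread provides. Only `OmegaConf.register_new_resolver` is an existing OmegaConf call.

```python
from types import SimpleNamespace
from typing import Any, Callable, Iterable


def make_safe_attr_resolver(
    namespace: Any, allowed: Iterable[str]
) -> Callable[[str], Any]:
    """Build a resolver that returns `namespace.<name>` only for allowlisted names."""
    allowed_names = frozenset(allowed)

    def resolve(name: str) -> Any:
        # Reject anything outside the allowlist, so YAML cannot reach
        # arbitrary attributes of the module.
        if name not in allowed_names:
            raise ValueError(f"{name!r} is not an allowed name")
        return getattr(namespace, name)

    return resolve


def register_polars_resolver() -> None:
    """Register `${polars:<name>}`; needs polars and omegaconf installed."""
    import polars as pl
    from omegaconf import OmegaConf

    # Hypothetical allowlist: extend it with the dtypes the catalog needs.
    resolver = make_safe_attr_resolver(pl, {"Float64", "Int64", "Utf8", "Boolean"})
    OmegaConf.register_new_resolver("polars", resolver, replace=True)


# Stand-in namespace so the lookup logic can be exercised without polars.
fake_pl = SimpleNamespace(Float64="Float64-type", read_csv="not-a-dtype")
resolve = make_safe_attr_resolver(fake_pl, {"Float64"})
```

With the resolver registered, a catalog entry might then refer to a dtype as `"${polars:Float64}"`. Whether the config loader passes the resolved object through to the dataset unchanged is an assumption to verify against the Kedro version in use.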
d
we also have users who want to use the `converters` element of `pd.read_excel`, which takes full functions as arguments, and there’s no way to expose that as YAML today
@marrrcin I’ve never thought of that!
that’s clever
it’s hacky but works
j
about the `!!python` YAML stuff, I tried but `could not determine a constructor for the tag 'tag:yaml.org,2002:python/name:polars.Float64'`
d
yeah sorry I was saying we haven’t enabled it for security reasons
j
oh, gotcha
wohooo the `TemplatedConfigLoader` trick worked, thanks @marrrcin!
K 1
😎 1
d
@Merel tagging for visibility as longer term we should have an OmegaConf solution here 🙂
👍 2
In my mind it would be nice to expose `OmegaConf` resolvers as part of `settings.py`, a bit like @marrrcin has done with his workaround
m
Yeah, it would be great to have OmegaConf resolvers configurable, right now I had to create a subclass of OmegaConfigLoader to run environment resolvers for non-credentials configs
m
@marrrcin Would you be able to share that class? I’d love to see it! And another question is, what are you using environment variables for outside of credentials?
m
It’s nasty, don’t look 🙈
```python
from typing import Any, Dict, Iterable, Optional

from kedro.config import OmegaConfigLoader


class PatchedOmegaConfigLoader(OmegaConfigLoader):
    """
    This class is a patch for the OmegaConfigLoader class, it enables the use of oc.env interpolation
    for all config files.
    """

    def load_and_merge_dir_config(
        self,
        conf_path: str,
        patterns: Iterable[str],
        read_environment_variables: Optional[bool] = False,
    ) -> Dict[str, Any]:
        # Ignore the caller's flag and always allow environment variable interpolation.
        return super().load_and_merge_dir_config(
            conf_path, patterns, read_environment_variables=True
        )
```
And in the `settings.py`:
```python
CONFIG_LOADER_CLASS = PatchedOmegaConfigLoader
# Keyword arguments to pass to the `CONFIG_LOADER_CLASS` constructor.
CONFIG_LOADER_ARGS = {"config_patterns": {"azureml": ["azureml*"]}}
```
I wanted to have some config entries for the kedro-azureml plugin be settable from CI/CD at runtime, so I’ve used the env interpolation there
👍 1
m
Thank you! That’s very insightful
m
In a way, the settings in our plugins are similar to settings of credentials, as they are more environment-specific and have practically no influence on the reproducibility of the pipelines
m
@marrrcin Have you by any chance played around with using custom resolvers with the `OmegaConfigLoader`? I’m trying out how to make this work nicely. I can get it working when subclassing `OmegaConfigLoader` but I’m wondering if there’s an easier way
m
Not yet
m
Let me know if you do! 🙂
👍 1
d