Hi everyone, Is there a way to interpolate values...
# questions
m
Hi everyone, is there a way to interpolate values from `credentials.yml` into some other parameters YAML file? My use case is the following: I need to pass my openai_api_key as a param to a model. I’ve added the key to `credentials.yml` and tried in `modeling.yml` to do `llm_api_key: ${OPENAI_API_KEY}`, but ended up with the following error:
ValueError: Failed to format pattern '${OPENAI_API_KEY}': no config value found, no default provided
Thx in advance, M.
d
you can use env vars here
a
You’d be able to do it with globals too: either with `TemplatedConfigLoader` currently, or with `OmegaConfigLoader` in the next release.
m
Thx @datajoely & @Ankita Katiyar. Regarding the use of env vars: would you do that directly in the Python script with `os.environ.get()`, or can it also be done in Kedro’s parameters YAML files? If so, how? 🙂 Regarding the use of globals: I’ll give it a shot, and will get back to you if I can’t make it work. Thx again 🙏🏼
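For the first option (reading the variable directly in Python), a minimal sketch could look like the following. The helper name `get_openai_api_key` and its default variable name are assumptions for illustration and are not part of Kedro.

```python
import os


def get_openai_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment, failing loudly if it is missing."""
    key = os.environ.get(env_var)
    if not key:
        # Fail early instead of passing None or an empty string to the model client.
        raise RuntimeError(f"Environment variable {env_var!r} is not set")
    return key
```

Calling the helper inside the node keeps the secret out of the YAML files, at the cost of hiding that dependency from the declared parameters.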
m
Thx @datajoely, but the doc says:
"Note that you can only use the resolver in `credentials.yml` and not in catalog or parameter files. This is because we do not encourage the usage of environment variables for anything other than credentials."
What I need is a way to pass my openai_api_key as a parameter to a node (not the typical use case of credentials for datasets in the catalog), and I was wondering if there is a “Kedro way” of doing it. So far I’ve simply created a `.env` file and used `python-dotenv` directly in my script. But I do not like this approach because it kind of ‘overrides’ / ‘obfuscates’ (can't find the right word 😅) the params declared in the conf’s YAML files. I’ll see if using globals helps 🙂 Thx anyway. Cheers, M
l
@Marc Gris I am not part of the Kedro team but ran into similar issues. I prefer to keep all credentials in `credentials.yml` files and access them through the catalog. The way I do that is like this:
1. Create a very generic `TokenDataSet`, with a credentials argument.
2. Add `open_api_token` to your `credentials.yml`.
3. Use the `TokenDataSet` in your data catalog, passing the credentials as an argument.
4. Use `TokenDataSet` as an input to your nodes.
`TokenDataSet`:
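from typing import Any

# Public import path in recent Kedro releases; older releases name the class AbstractDataSet.
from kedro.io import AbstractDataset


class TokenDataSet(AbstractDataset[None, str]):
    """Read-only dataset that exposes a secret from credentials.yml to nodes."""

    def __init__(self, credentials: str):
        self.credentials = credentials

    def _load(self) -> str:
        return self.credentials

    def _save(self, data: None) -> None:
        # Kedro calls _save(data), so the argument is required even though saving is refused.
        raise NotImplementedError("Saving TokenDataSet not allowed.")

    def _describe(self) -> dict[str, Any]:
        # Only the class name is reported, so the token is not included in the description.
        return dict(type=self.__class__.__name__)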
catalog.yml:
token_dataset:
  type: yourpackage.datasets.TokenDataSet
  credentials: open_api_token
credentials.yml:
open_api_token: my_super_secret_token
---
In an interactive session you can test with:
catalog.load("token_dataset")
>>> 'my_super_secret_token'
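The reason `catalog.load` returns the secret rather than the string `open_api_token` is that Kedro swaps the credentials key named in the catalog entry for the matching value from `credentials.yml` before it builds the dataset. The following is a simplified plain-Python illustration of that substitution, not Kedro's actual code; `resolve_credentials` is a hypothetical helper.

```python
from typing import Any


def resolve_credentials(entry: dict[str, Any], credentials: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a catalog entry with its credentials reference replaced."""
    resolved = dict(entry)
    ref = resolved.get("credentials")
    if isinstance(ref, str):
        if ref not in credentials:
            raise KeyError(f"No credentials found for key {ref!r}")
        # Swap the key name for the value stored under it in credentials.yml.
        resolved["credentials"] = credentials[ref]
    return resolved


entry = {"type": "yourpackage.datasets.TokenDataSet", "credentials": "open_api_token"}
credentials = {"open_api_token": "my_super_secret_token"}
print(resolve_credentials(entry, credentials)["credentials"])
```

The resolved value is what ends up as the `credentials` argument of `TokenDataSet.__init__`, and a node that lists `token_dataset` among its inputs receives it as a plain string.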
a
Maybe using a custom resolver with `OmegaConfigLoader` is also a solution:
from kedro.config import OmegaConfigLoader
from omegaconf import OmegaConf
from typing import Any, Dict


class CustomOmegaConfigLoader(OmegaConfigLoader):
    def __init__(
        self,
        conf_source: str,
        env: str = None,
        runtime_params: Dict[str, Any] = None,
    ):
        super().__init__(
            conf_source=conf_source, env=env, runtime_params=runtime_params
        )

        # Register a custom resolver that looks up a key in the credentials config.
        self.register_custom_resolver("credentials", lambda creds: self.lookup_creds(creds))

    @staticmethod
    def register_custom_resolver(name, function):
        """
        Helper method that checks if the resolver has already been registered and registers the
        resolver if it's new. The check is needed, because omegaconf will throw an error
        if a resolver with the same name is registered twice.
        Alternatively, you can call `register_new_resolver()` with  `replace=True`.
        """
        if not OmegaConf.has_resolver(name):
            OmegaConf.register_new_resolver(name, function)
    
    def lookup_creds(self, creds):
        return self["credentials"][creds]
And then you could reference it in your `parameters.yml`:
open_api_key: "${credentials:open_api_token}"
Docs link - https://docs.kedro.org/en/stable/configuration/advanced_configuration.html#how-to-use-custom-resolvers-in-the-omegaconfigloader
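To make the `${credentials:open_api_token}` syntax concrete, here is a toy plain-Python imitation of what a resolver does: the text before the colon selects a registered function, and the text after it is passed as the argument. This only illustrates the idea and is not how OmegaConf is implemented; `interpolate` and the `resolvers` mapping are hypothetical names.

```python
import re
from typing import Callable

# Matches "${name:argument}" placeholders.
_PLACEHOLDER = re.compile(r"\$\{(\w+):(\w+)\}")


def interpolate(value: str, resolvers: dict[str, Callable[[str], str]]) -> str:
    """Replace each ${name:argument} with the result of resolvers[name](argument)."""

    def _resolve(match: re.Match) -> str:
        name, argument = match.group(1), match.group(2)
        return resolvers[name](argument)

    return _PLACEHOLDER.sub(_resolve, value)


credentials = {"open_api_token": "my_super_secret_token"}
resolvers = {"credentials": lambda key: credentials[key]}
print(interpolate("${credentials:open_api_token}", resolvers))
```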
With the next release (`0.18.13`), registering custom resolvers for `OmegaConfigLoader` will be much simpler (no need for a `CustomOmegaConfigLoader` implementation). Docs: https://docs.kedro.org/en/latest/configuration/advanced_configuration.html#how-to-use-resolvers-in-the-omegaconfigloader
Globals will also be released soon in `0.18.13`, where you can have in your `globals.yml` file:
OPEN_API_KEY: <your_key>
And then reference it in your `parameters.yml`:
llm_api_key: "${globals:OPEN_API_KEY}"
Docs for this - https://docs.kedro.org/en/latest/configuration/advanced_configuration.html#how-to-use-global-variables-with-the-omegaconfigloader
Currently, environment variables are only resolved for `credentials`, and `credentials` are only used in the `DataCatalog`. So @Lodewic van Twillert’s solution is also great (and creative 😄) and should work!
m
Thx a lot for your messages and suggestions @Ankita Katiyar & @Lodewic van Twillert 🙂 🙏🏼