I am trying to use custom resolvers to provide cre...
# questions
v
I am trying to use custom resolvers to provide credentials in catalog.yml
document_classification:
  type: ibis.TableDataset
  table_name: document_classification
  connection:
    backend: ${oc.env:BACKEND}
    host: ${oc.env:HOST}
    port: ${oc.env:PORT}
    database: ${oc.env:DATABASE}
    user: ${oc.env:USER}
    password: ${oc.env:PASSWORD}

CONFIG_LOADER_ARGS = {
    "base_env": "base",
    "default_run_env": "local",
    "custom_resolvers" : {
        "oc.env" : oc.env
    }
}
Is this the right way to do it?
d
CONFIG_LOADER_ARGS = {
    "base_env": "base",
    "default_run_env": "local",
    "custom_resolvers" : {
        "oc.env" : oc.env
    }
}
This part needs to go into settings.py. Also, don't forget the import:
from omegaconf.resolvers import oc
v
~Yeah, that's done. I have also created a .env file:~
BACKEND=""
HOST=""
PORT=""
DATABASE=""
USER="vishalp"
PASSWORD=""
~I am using python-dotenv to load these env variables. But when I print the USER env variable on the console, it drops the last character for some reason, which is very weird.
[09/17/24 14:58:58] INFO Using 'conf/logging.yml' as logging configuration. You can change this by setting the KEDRO_LOGGING_CONFIG environment variable accordingly. __init__.py:249
INFO .env file loaded successfully env_loader.py:13
DEBUG BACKEND : postgres env_loader.py:29
DEBUG HOST : ******* env_loader.py:29
DEBUG PORT : ** env_loader.py:29
DEBUG DATABASE : ****** env_loader.py:29
DEBUG USER : vishal env_loader.py:29
DEBUG PASSWORD : ***** env_loader.py:29
INFO All Env Variables loaded Successfully~
@datajoely just check the DEBUG message for USER: the trailing "p" is dropped.
Please ignore the above, it is resolved. For some reason, dotenv was not loading the updated variables.
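One possible explanation for the stale value, offered as an assumption rather than a diagnosis: python-dotenv's load_dotenv does not overwrite variables that already exist in the process environment unless override=True is passed, and USER is normally preset by the shell on Linux and macOS. The sketch below imitates that precedence rule with the standard library only; apply_env is a hypothetical helper, not part of python-dotenv.

```python
import os


def apply_env(values: dict[str, str], override: bool = False) -> None:
    """Hypothetical stand-in for a .env loader's precedence rule.

    With override=False (python-dotenv's default), variables that are
    already set in the process environment win over the file's values.
    """
    for key, value in values.items():
        if override or key not in os.environ:
            os.environ[key] = value


# Simulate a shell that already exported USER before the app started.
os.environ["USER"] = "vishal"

apply_env({"USER": "vishalp"})
kept = os.environ["USER"]  # the pre-existing value is kept

apply_env({"USER": "vishalp"}, override=True)
replaced = os.environ["USER"]  # the file's value now wins
```

With python-dotenv itself the equivalent switch is load_dotenv(override=True). Picking a less generic variable name, for example DB_USER (an illustrative name), avoids the clash altogether.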
@datajoely is there a better way to declare all these items in the catalog? There is too much redundancy:
document_classification:
  type: ibis.TableDataset
  table_name: document_classification
  connection:
    backend: ${oc.env:BACKEND}
    host: ${oc.env:HOST}
    port: ${oc.env:PORT}
    database: ${oc.env:DATABASE}
    user: ${oc.env:USER}
    password: ${oc.env:PASSWORD}

case_master:
  type: ibis.TableDataset
  table_name: case_master
  connection:
    backend: ${oc.env:BACKEND}
    host: ${oc.env:HOST}
    port: ${oc.env:PORT}
    database: ${oc.env:DATABASE}
    user: ${oc.env:USER}
    password: ${oc.env:PASSWORD}

user_master:
  type: ibis.TableDataset
  table_name: user_master
  connection:
    backend: ${oc.env:BACKEND}
    host: ${oc.env:HOST}
    port: ${oc.env:PORT}
    database: ${oc.env:DATABASE}
    user: ${oc.env:USER}
    password: ${oc.env:PASSWORD}
d
Yes! You're now looking for dataset factories https://docs.kedro.org/en/stable/data/kedro_dataset_factories.html
🙌 1
n
You probably don't even need factory yet, use interpolation (template value basically) https://docs.kedro.org/en/stable/configuration/advanced_configuration.html
🙌 1
👍 1
d
Oh true!
v
@Nok Lam Chan I was just trying the code snippet from the official Kedro docs shown below, but it looks like the catalog is not resolved properly when we use the oc.env resolver:
from kedro.config import OmegaConfigLoader
from kedro.framework.project import settings
from kedro.io import DataCatalog
from pathlib import Path

project_root = "/home/vishal/Documents/workspace/mlops/data-pipelines/"
conf_path = str(Path(project_root) / settings.CONF_SOURCE)

# Instantiate an `OmegaConfigLoader` instance with the location of your project configuration.
conf_loader = OmegaConfigLoader(
    conf_source=conf_path, base_env="base", default_run_env="local"
)

# These lines show how to access the catalog and credentials configurations.
conf_catalog = conf_loader["catalog"]
conf_credentials = conf_loader["credentials"]

# # Fetch the catalog with resolved credentials from the configuration.
# catalog = DataCatalog.from_config(catalog=conf_catalog, credentials=conf_credentials)
Error
in <module>:15

  12 )
  13
  14 # These lines show how to access the catalog and credentials configurations.
❱ 15 conf_catalog = conf_loader["catalog"]
  16 conf_credentials = conf_loader["credentials"]
  17
  18 # # Fetch the catalog with resolved credentials from the configuration.

UnsupportedInterpolationType: Unsupported interpolation type oc.env
    full_key: document_classification.connection.backend
    object_type=dict
The section explains this in detail, but in short you need to turn on this setting because oc.env is enabled by default for credentials only.
For context, this is a bit of legacy: Kedro introduced credentials years ago and resolvers came later. In the future we are thinking of introducing a credentials resolver.
👍 1
v
What does it mean that oc.env is enabled for credentials only? Can you explain this a bit more?
n
OmegaConf also comes with some built-in resolvers that you can use with the OmegaConfigLoader in Kedro. All built-in resolvers except for oc.env are enabled by default. oc.env is only turned on for loading credentials. You can, however, turn this on for all configurations through your project’s src/<package_name>/settings.py in a similar way:
l
Hi, team. I have a similar error when running the kedro viz command. The error comes from omegaconf/base.py:
custom_resolver
    raise UnsupportedInterpolationType(
omegaconf.errors.UnsupportedInterpolationType: Unsupported interpolation type path
    full_key: _path
    object_type=dict
I'm not really sure if that's related to it, but I have:
• In conf/base/catalog_globals.yml:
  ◦ _path: ${path:}/${run_folder:}/
• In conf/base/environ.yml:
  ◦ local:path: data/runs/
Could someone help troubleshoot this?
n
is this related to the thread? I am a bit confused.
l
Sorry for the confusion. The error is the same (UnsupportedInterpolationType), but it's coming from running another command: kedro viz
n
do you get the same error when you just run any kedro command?
l
no..I only get this error when running kedro viz
j
@Lívia Pimentel can you try _path: ${path}/${run_folder}/ ? The ${path:} is telling OmegaConf that there's a resolver called path, hence the UnsupportedInterpolationType
n
Maybe you are trying to use variable interpolation (template value)? Can you give an example of what the expected value is? Is environ.yml a catalog?
l
Hello, Juan and Nok. Thank you for your reply. I tried replacing _path: ${path:}/${run_folder:}/ with _path: ${path}/${run_folder}/ and it still showed the error above. environ.yml is contained in conf/base and has the following:
# Pass the path to your Databricks volume here.
databricks:
  path: /Volumes/prod_us_cpibaws_5edb792/${env:CATALOG_ENV,'default'}/temp
local:
  path: data/runs/
j
Hi @Lívia Pimentel, could you paste the full traceback after you changed the _path definition?
l
Hi, Juan. Thanks again for your reply, and apologies for the delay. Here is the full traceback:
Starting Kedro Viz ...
Process Process-1:
Traceback (most recent call last):
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/kedro_viz/server.py", line 112, in run_server
    load_and_populate_data(path, env, include_hooks, extra_params, pipeline_name)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/kedro_viz/server.py", line 62, in load_and_populate_data
    catalog, pipelines, session_store, stats_dict = kedro_data_loader.load_data(
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/kedro_viz/integrations/kedro/data_loader.py", line 105, in load_data
    catalog = context.catalog
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/kedro/framework/context/context.py", line 187, in catalog
    return self._get_catalog()
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/kedro/framework/context/context.py", line 223, in _get_catalog
    conf_catalog = self.config_loader["catalog"]
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/kedro/config/omegaconf_config.py", line 199, in __getitem__
    base_config = self.load_and_merge_dir_config(  # type: ignore[no-untyped-call]
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/kedro/config/omegaconf_config.py", line 339, in load_and_merge_dir_config
    for k, v in OmegaConf.to_container(
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/omegaconf.py", line 573, in to_container
    return BaseContainer._to_content(
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/basecontainer.py", line 292, in _to_content
    value = get_node_value(key)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/basecontainer.py", line 244, in get_node_value
    conf._format_and_raise(key=key, value=None, cause=e)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/base.py", line 231, in _format_and_raise
    format_and_raise(
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/_utils.py", line 899, in format_and_raise
    _raise(ex, cause)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/_utils.py", line 797, in _raise
    raise ex.with_traceback(sys.exc_info()[2])  # set env var OC_CAUSE=1 for full trace
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/basecontainer.py", line 242, in get_node_value
    node = node._dereference_node()
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/base.py", line 246, in _dereference_node
    node = self._dereference_node_impl(throw_on_resolution_failure=True)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/base.py", line 277, in _dereference_node_impl
    return parent._resolve_interpolation_from_parse_tree(
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/base.py", line 584, in _resolve_interpolation_from_parse_tree
    resolved = self.resolve_parse_tree(
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/base.py", line 765, in resolve_parse_tree
    return visitor.visit(parse_tree)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/antlr4/tree/Tree.py", line 34, in visit
    return tree.accept(self)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/grammar/gen/OmegaConfGrammarParser.py", line 206, in accept
    return visitor.visitConfigValue(self)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/grammar_visitor.py", line 101, in visitConfigValue
    return self.visit(ctx.getChild(0))
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/antlr4/tree/Tree.py", line 34, in visit
    return tree.accept(self)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/grammar/gen/OmegaConfGrammarParser.py", line 342, in accept
    return visitor.visitText(self)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/grammar_visitor.py", line 301, in visitText
    return self._unescape(list(ctx.getChildren()))
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/grammar_visitor.py", line 389, in _unescape
    text = str(self.visitInterpolation(node))
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/grammar_visitor.py", line 125, in visitInterpolation
    return self.visit(ctx.getChild(0))
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/antlr4/tree/Tree.py", line 34, in visit
    return tree.accept(self)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/grammar/gen/OmegaConfGrammarParser.py", line 921, in accept
    return visitor.visitInterpolationNode(self)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/grammar_visitor.py", line 158, in visitInterpolationNode
    return self.node_interpolation_callback(inter_key, self.memo)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/base.py", line 746, in node_interpolation_callback
    return self._resolve_node_interpolation(inter_key=inter_key, memo=memo)
  File "/Users/Livia_Pimentel/miniconda3/envs/cpib-models-test/lib/python3.10/site-packages/omegaconf/base.py", line 676, in _resolve_node_interpolation
    raise InterpolationKeyError(f"Interpolation key '{inter_key}' not found")
omegaconf.errors.InterpolationKeyError: Interpolation key 'path' not found
    full_key: _path
    object_type=dict
r
@Nok Lam Chan, if you look at the traceback, Kedro-viz tried to access catalog = context.catalog and that's when the error is thrown
n
@Lívia Pimentel Can you confirm this runs fine when you do kedro run? This helps us narrow down the scope of the issue, as kedro-viz mostly just gets this data from kedro; if kedro run works, we normally don't expect an issue on the kedro-viz side.
l
Hi, @Nok Lam Chan. Thanks again for your reply. It works fine with kedro run. We are running kedro run -p <PIPELINE_NAME> --env <ENVIRONMENT_NAME> successfully.
I'm not sure if it helps, but our settings.py is the following:
"""Project settings. There is no need to edit this file unless you want to change values
from the Kedro defaults. For further information, including these default values, see
https://docs.kedro.org/en/stable/kedro_project_setup/settings.html."""

# from kedro_mlflow.framework.hooks import MlflowHook

# Instantiated project hooks.
from cpib_models.hooks import ConfEnvironHooks, MLFlowRunHook, SparkHooks  # noqa: E402

from .settings_utils.resolvers import set_resolvers

# Hooks are executed in a Last-In-First-Out (LIFO) order.
HOOKS = (SparkHooks(), ConfEnvironHooks(), MLFlowRunHook())

# Installed plugins for which to disable hook auto-registration.
# DISABLE_HOOKS_FOR_PLUGINS = ("kedro-viz",)

from pathlib import Path  # noqa: E402

from kedro_viz.integrations.kedro.sqlite_store import SQLiteStore  # noqa: E402

# Class that manages storing KedroSession data.

SESSION_STORE_CLASS = SQLiteStore
# Keyword arguments to pass to the `SESSION_STORE_CLASS` constructor.
SESSION_STORE_ARGS = {"path": str(Path(__file__).parents[2])}

# Directory that holds configuration.
# CONF_SOURCE = "conf"

# Class that manages how configuration is loaded.
from kedro.config import OmegaConfigLoader  # noqa: E402

CONFIG_LOADER_CLASS = OmegaConfigLoader
# Keyword arguments to pass to the `CONFIG_LOADER_CLASS` constructor.
CONFIG_LOADER_ARGS = {
    "base_env": "base",
    "default_run_env": "local",
    "config_patterns": {
        "spark": ["spark*", "spark*/**"],
    },
}

set_resolvers()
resolvers.py is the following:
import os

import mlflow
from omegaconf import OmegaConf


def get_model_id():
    """
    Get the model ID from the MODEL_ID environment variable
    """

    model_id = os.getenv("MODEL_ID")

    return model_id


def get_run_id(tracking_uri="databricks"):
    """
    Logic to get the run_id from running environment
    """
    run_id = os.getenv("RUN_ID")
    if not run_id:
        model_id = get_model_id()
        if model_id:
            if os.getenv("MODEL_VERSION"):
                run_id = (
                    mlflow.MlflowClient(tracking_uri=tracking_uri)
                    .get_model_version(
                        name=model_id, version=os.getenv("MODEL_VERSION")
                    )
                    .run_id
                )
            else:
                run_id = (
                    mlflow.MlflowClient(tracking_uri=tracking_uri)
                    .get_latest_versions(
                        name=model_id, stages=[os.getenv("STAGE", "Production")]
                    )[0]
                    .run_id
                )

    return run_id


def get_run_folder(tracking_uri="databricks"):
    """
    Set run_folder
    """
    from uuid import uuid4

    run_folder = None
    run_id = get_run_id(tracking_uri)
    active_run = mlflow.active_run()

    if run_id:
        run_folder = (
            mlflow.MlflowClient(tracking_uri=tracking_uri)
            .get_run(run_id)
            .data.params.get("run_folder")
        )
    elif active_run:
        run_folder = active_run.data.params.get("run_folder")
    if run_folder is None:
        run_folder = os.getenv("RUN_FOLDER")
    if run_folder is None:
        run_folder = str(uuid4())

    return run_folder


def shift_current_month(months: int) -> str:
    """Generate string for the current month shifted 'monhts' month

    Args:
        months (int): Number of months to go backwards

    Returns:
        str: Shifted date
    """
    from datetime import datetime

    from dateutil.relativedelta import relativedelta

    date = (datetime.now().replace(day=1) - relativedelta(months=months)).strftime(
        "%Y-%m-%d"
    )
    return date


def get_packege_version(pkg="") -> str:
    """Get package version"""
    from pip._internal.commands.show import search_packages_info

    version = next(search_packages_info([pkg])).version
    return version


# %%
def set_resolvers():
    """
    Set the resolvers for OmegaConf
    """
    if not OmegaConf.has_resolver("env"):
        OmegaConf.register_new_resolver(
            "env",
            lambda key, default=None: os.getenv(key, default),
        )
    if not OmegaConf.has_resolver("shift_current_month"):
        OmegaConf.register_new_resolver(
            "shift_current_month",
            shift_current_month,
        )
    if not OmegaConf.has_resolver("package_version"):
        OmegaConf.register_new_resolver(
            "package_version",
            get_packege_version,
        )
👀 1
r
thank you, we will have to investigate this further. I will create a github issue on this and we can look at it soon.
👍 1
thankyou 1
j
was an issue ever opened @Rashida Kanchwala?
r
Hey, I created an issue now: https://github.com/kedro-org/kedro-viz/issues/2142. Thanks for checking in. @Lívia Pimentel, we will look at it this week; we might need to jump on a quick call if we are not able to reproduce the error at our end. Will reach out 🙂
l
Ok! Thank you, @Rashida Kanchwala !
Giving an update: the issue was solved by running kedro viz run --include-hooks
Thank you everyone and @Ankita Katiyar and @Sajid Alam 🙂 🙌
❤️ 2