Is it possible to modify which hooks are active ba...
# questions
h
Is it possible to change which hooks are active based on the env? I want my hook to be active only when I run
kedro run --env=aws_batch
but not on other envs. I'm looking at implementing a mechanism for this in settings.py, but I don't know how to modify the active hooks from the config loader class; the registered hooks don't seem to be accessible there.
d
so I actually think this happens too late in the lifecycle, but what you could do is set the env through the KEDRO_ENV environment variable and pick that up in the hook?
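A minimal sketch of that idea, assuming the env is exported as KEDRO_ENV before the run (as far as I know, passing --env on the command line does not export the variable). The class name ErrorAnalysisHook and the `reported` list are hypothetical stand-ins; only `hook_impl` and the `on_pipeline_error` hook come from Kedro.

```python
import os

try:
    from kedro.framework.hooks import hook_impl
except ImportError:  # fallback so the sketch also runs without Kedro installed
    def hook_impl(func):
        return func


class ErrorAnalysisHook:
    """Hypothetical hook that only acts when KEDRO_ENV matches a target env."""

    def __init__(self, active_env="aws_batch"):
        self.active_env = active_env
        self.reported = []  # stand-in for "analyse the error and post to Slack"

    def _is_active(self):
        # Read the variable at call time so it reflects the current process.
        return os.environ.get("KEDRO_ENV") == self.active_env

    @hook_impl
    def on_pipeline_error(self, error):
        if not self._is_active():
            return  # no-op outside the target environment
        self.reported.append(repr(error))
```

The hook is still registered in every env; it simply returns early when the variable does not match.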
h
yeah, but then the hook is responsible for deciding when to be active, which would be counterintuitive
basically, I want to log errors in pipelines when running on AWS Batch, send the logs and traceback to ChatGPT, and then send ChatGPT's analysis to a Slack channel
but I don't want that to happen every time I run a pipeline locally, only when I specify it. and there could be different scenarios in which I want to disable the hook. I'd rather just change the HOOKS variable in settings.py than accommodate every scenario in the hook
also, because I use AWS Batch for deployment, I don't use a .env file or env variables; everything is set in the kedro run command
d
I guess a dirty solution here is that AWS Batch will have environment variables that are only present when the job runs there
and you could use that as your disabling logic
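A rough sketch of that check, assuming the job runs in an AWS Batch container where the AWS_BATCH_JOB_ID variable is set. The helper name running_on_aws_batch is made up for illustration.

```python
import os


def running_on_aws_batch(environ=None):
    """Return True when the AWS Batch job-id variable is present and non-empty."""
    environ = os.environ if environ is None else environ
    return bool(environ.get("AWS_BATCH_JOB_ID"))


# Possible use in settings.py (ErrorAnalysisHook being the project's own hook class):
# HOOKS = (ErrorAnalysisHook(),) if running_on_aws_batch() else ()
```

Passing the mapping in as an argument keeps the check easy to test without touching the real process environment.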
h
thanks! there are indeed a bunch of workarounds. I saw there is a DISABLE_HOOKS_FOR_PLUGINS setting, which made me think there might be some hook activation logic I can hook into.
for example a list of hooks in the config loader, the KedroSession, or the CLI
d
that wouldn't work unless your hook is included in a plug-in
but you could look at how kedro-telemetry works as a super simple plugin which can be disabled that way
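For reference, a sketch of what that setting looks like in settings.py, assuming the hook ships in an installed plugin. It takes a tuple of plugin names; kedro-telemetry is used here only because it is the example mentioned above.

```python
# settings.py
# Hooks registered by these installed plugins are skipped for this project.
DISABLE_HOOKS_FOR_PLUGINS = ("kedro-telemetry",)
```

This only affects hooks that are auto-registered from plugins, not hooks listed in HOOKS.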
h
but do you know where the hooks are “stored”?
my guess would be the KedroSession, so I'll start looking there
d
h
okay, after a little bit of digging: the hooks are registered in the KedroSession object, which is created in the run method of the CLI. We can access the registered hooks on the session object and unregister hooks there. The env and other run params are also available there.
def run(
    tag,
    env,
    runner,
    is_async,
    node_names,
    to_nodes,
    from_nodes,
    from_inputs,
    to_outputs,
    load_version,
    pipeline,
    config,
    conf_source,
    params,
):

    ....
    with KedroSession.create(env=env, extra_params=params) as session:
        context = session.load_context()
        runner_instance = _instantiate_runner(runner, is_async, context)

        session._hook_manager.unregister()  # <-- hooks can be unregistered here; pluggy needs plugin= or name=

        session.run(
            tags=tag,
            runner=runner_instance,
            node_names=node_names,
            from_nodes=from_nodes,
            to_nodes=to_nodes,
            from_inputs=from_inputs,
            to_outputs=to_outputs,
            load_versions=load_version,
            pipeline_name=PIPELINE_NAME,
        )
So I'm thinking the least dirty solution is to keep the conditions detailing when to unregister which hooks in the settings.py file, and have a very minimal function in CLI.py that executes this.
I can register this logic with the config_loader_class and execute it in the run method of the CLI. That would be the most kedro-nic implementation, right?
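One way to picture that split, as a sketch only: the rules are plain data that could live in settings.py, and a small function evaluates them against the run parameters. HOOK_DISABLE_RULES and hooks_to_disable are hypothetical names, not part of Kedro.

```python
# Hypothetical rules: hook class name -> predicate that receives the run params
# and returns True when the hook should be disabled for this run.
HOOK_DISABLE_RULES = {
    "ErrorAnalysisHook": lambda params: params.get("env") != "aws_batch",
}


def hooks_to_disable(run_params, rules=HOOK_DISABLE_RULES):
    """Return the sorted names of hooks whose rule matches these run params."""
    return sorted(
        name for name, should_disable in rules.items() if should_disable(run_params)
    )
```

The CLI side then only needs to loop over the returned names and unregister each one, so all the decision logic stays in one place.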
d
Yes. _hook_manager is a private attribute, so it will work, and I'd like to flag this to the developers, BUT the one caveat with private APIs is that if we change this in the future it won't be counted as a breaking change.
h
okay, I think I'd like to make the hook plugin public at some point if it proves to be useful for my clients. but the mechanism for enabling and disabling it conditionally is something I'll maintain myself (until Kedro implements a mechanism like a hooks.yaml that can be edited in the config/settings)
d
Yeah, I've asked the team the question. We're still a bit thin on resources coming back from the holiday, but I think it's a really good point
if you have time, a GitHub issue explaining your use case would be invaluable
h
okay, I'll finish this implementation first and circle back. for now I'm going with a method on the config loader that returns a list of disabled hooks given the params passed to the run method
n
if you don't want to touch the private attribute, could you just make the hook a no-op conditional on the run env? (effectively unregistering the hook)
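A sketch of that no-op approach, assuming (as I understand the hook specs) that Kedro passes a run_params dict with an "env" key to on_pipeline_error. The class name, ACTIVE_ENVS and the `handled` list are hypothetical.

```python
try:
    from kedro.framework.hooks import hook_impl
except ImportError:  # fallback so the sketch also runs without Kedro installed
    def hook_impl(func):
        return func


class ErrorAnalysisHook:
    """Hypothetical hook that stays registered but does nothing outside ACTIVE_ENVS."""

    ACTIVE_ENVS = {"aws_batch"}

    def __init__(self):
        self.handled = []  # stand-in for the real error analysis

    @hook_impl
    def on_pipeline_error(self, error, run_params):
        if run_params.get("env") not in self.ACTIVE_ENVS:
            return  # behaves as if the hook were not registered
        self.handled.append(type(error).__name__)
```

This keeps everything on public API, at the cost of putting the activation rule inside the hook itself.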
h
my solution now looks like this: in the run function in CLI.py, inside the with KedroSession.create(env=env, extra_params=params) as session: block, once the context has been loaded:
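        # Collect the arguments that were passed to run().
        run_args = extract_function_params(run, locals())

        # disable_hooks() returns hook classes; unregister each one by class name.
        for hook in context.config_loader.disable_hooks(run_args):
            session._hook_manager.unregister(name=hook.__name__)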
where
import copy


def extract_function_params(func, local_vars):
    """
    Extracts the parameters of a click command's callback based on its signature.
    Returns deep copies so that later changes do not affect the caller's values.
    """
    param_names = func.callback.__code__.co_varnames[
        : func.callback.__code__.co_argcount
    ]
    return {
        param: copy.deepcopy(local_vars[param])
        for param in param_names
        if param in local_vars
    }
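A self-contained variant for trying this out without click or Kedro, as a sketch: it uses inspect.signature instead of reading __code__ directly, which also picks up keyword-only parameters. The SimpleNamespace object is only a stand-in for a click command exposing a .callback attribute, and _run is a hypothetical three-argument version of the run function.

```python
import copy
import inspect
from types import SimpleNamespace


def extract_function_params(func, local_vars):
    """Deep-copy the entries of local_vars that match the callback's parameter names."""
    param_names = inspect.signature(func.callback).parameters
    return {
        name: copy.deepcopy(local_vars[name])
        for name in param_names
        if name in local_vars
    }


def _run(tag, env, params):
    # locals() here holds exactly the arguments passed to this call.
    return extract_function_params(run, locals())


run = SimpleNamespace(callback=_run)  # stand-in for a click.Command wrapping _run
```

Calling run.callback(("a",), "aws_batch", {"x": 1}) returns a dict with the keys tag, env and params, each holding a deep copy of the argument.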
and in settings.py
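from kedro.config import OmegaConfigLoader  # base class; the subclass below reuses the name

class OmegaConfigLoader(OmegaConfigLoader):
    def __init__(self, *args, **kwargs):
        # Make sure "runtime_params" is always passed explicitly (None if absent).
        kwargs["runtime_params"] = kwargs.get("runtime_params")
        super().__init__(*args, **kwargs)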

    def disable_hooks(self, run_params: dict) -> list:
        """run_params are:
        tag,
        env,
        runner,
        is_async,
        node_names,
        to_nodes,
        from_nodes,
        from_inputs,
        to_outputs,
        load_version,
        pipeline,
        config,
        conf_source,
        params
        """
        disabled_hooks = []

        if run_params["env"] != "aws_batch":
            disabled_hooks.append(ErrorAnalysisHook)

        return disabled_hooks
one very obvious downside to this approach is that the hook will still be initialised when the context gets created. so if the hook requires configs that are only accessible in the scenario you want it to run in, you'll have to add some try/except error handling, which makes it a bit brittle (because a genuine misconfiguration can then go unnoticed)
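One possible way to soften that downside, sketched under the assumption that the fragile part is creating a client from env-specific config: keep __init__ trivial and build the client lazily on first use. The client_factory argument and the send method are hypothetical.

```python
class ErrorAnalysisHook:
    """Hypothetical hook whose fragile setup is deferred until it is needed."""

    def __init__(self, client_factory):
        # Store only the factory; no credentials or config are read here,
        # so creating the hook is safe in every environment.
        self._client_factory = client_factory
        self._client = None

    @property
    def client(self):
        # Build the client on first access and reuse it afterwards.
        if self._client is None:
            self._client = self._client_factory()
        return self._client

    # In a Kedro project this method would carry the @hook_impl decorator.
    def on_pipeline_error(self, error):
        self.client.send(f"Pipeline failed: {error!r}")
```

With this shape, a hook that gets unregistered before the run never touches the env-specific config, so no try/except is needed around its construction.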
d
thanks for posting your update
it’s a very good and sophisticated solution, but I’d like to make this easier in the future
thanks for being a Kedroid, @Hugo Evers!