# questions
h
could we have some additional documentation on hooks and how to migrate them from 0.18 to 0.19? for example, i have a hook that uses chatgpt to summarise errors in pipelines and sends that alongside the logs to a slack channel. in order to access the logs i need to know where the log file is. this used to be accessible via the context in an after_context_created hook; however, now the config_loader has not actually loaded the logging config, so i can't retrieve the location of the logs. This is not documented over at https://docs.kedro.org/en/latest/api/kedro.framework.hooks.specs.KedroContextSpecs.html#kedro.framework.hooks.specs.[…]xtSpecs.after_context_created.

I would probably love to find this over at https://docs.kedro.org/en/latest/hooks/introduction.html#hook-specifications, where each hook specification could be a hyperlink to a page documenting what each hook actually has access to (like linking directly to the specs, e.g. https://docs.kedro.org/en/latest/api/kedro.framework.hooks.specs.KedroContextSpecs.html#kedro.framework.hooks.specs.[…]xtSpecs.after_context_created, but with the parameters more fully documented; for example, if conf_creds actually had a pydantic type, instead of being type-annotated as dict[str, Any], one could easily see what it will contain).

now, in order to figure this out, i use the specs page https://docs.kedro.org/en/latest/api/kedro.framework.hooks.specs.html#module-kedro.framework.hooks.specs to find the signature, then make a hook and enter debug mode to see what each kwarg actually contains. and so now i see that after_catalog_created, for example, has the actual values of the params in the feed_dict, while one might expect to find such entries in the conf_catalog (or just a params kwarg). i am now on a similar hunt as to where the configuration variables have migrated to. my hunch is the conf_catalog.
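(For context, a minimal sketch of the kind of 0.18-era hook being described; the class name and the error-reporting step are hypothetical, and the `config_loader["logging"]` access is exactly what no longer works in 0.19.)

```python
# Sketch of the 0.18-era pattern described above (hypothetical names);
# in 0.19 the config loader no longer exposes a "logging" key.
from kedro.framework.context import KedroContext
from kedro.framework.hooks import hook_impl


class SlackErrorReportingHooks:
    def __init__(self):
        self.log_file_path = None

    @hook_impl
    def after_context_created(self, context: KedroContext) -> None:
        # Worked in 0.18.x because logging.yml was loaded by the config loader.
        self.log_file_path = context.config_loader["logging"]["handlers"]["file"][
            "filename"
        ]

    @hook_impl
    def on_pipeline_error(self, error: Exception) -> None:
        # Hypothetical: summarise `error` plus the tail of the log file
        # and post both to a Slack channel.
        ...
```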
n
Thanks for the docs suggestion, would you prefer to open a PR?
h
yeah, just one question, is the logging that is set in the conf no longer getting loaded?
n
On the other hand, I am not so sure what is breaking in hooks, can you explain a bit more?
I see
h
i see the configurations for mlflow and params and such
but not logging
(screenshot attached)
i used to be able to find the path where the logs are stored in:
```python
self.log_file_path = context.config_loader["logging"]["handlers"]["file"][
    "filename"
]
```
n
Logging needs to be loaded separately. From the release notes: OmegaConfigLoader is now the default config loader; ConfigLoader and TemplatedConfigLoader were removed. logging is removed from ConfigLoader in favour of the environment variable KEDRO_LOGGING_CONFIG.
Sorry can't format it nicely on phone
Check the migration section
h
okay, so i see there is an environment variable, but regardless of whether that has been set, where can i find, in the context of kedro, what logging configuration is currently set? so if there is file-based logging, i want to know which file kedro is logging to, and if the default configuration is being used, there is no file so i won't send the logs (unless i want to route the logging to a temporary file, but that's another matter)
n
I will update the notes later today. I think we have it in our heads but we didn't communicate it well in the migration guide, and it doesn't cover 100% of what the RELEASE.md has. So file-based logging is opt-in now; by default it won't write to files because that causes issues on deployment, and many don't use it.
The default setting is in `kedro/framework/project/default_logging.yml`:
```yaml
version: 1

disable_existing_loggers: False

handlers:
  rich:
    class: kedro.logging.RichHandler
    rich_tracebacks: True
    # Advance options for customisation.
    # See https://docs.kedro.org/en/stable/logging/logging.html#project-side-logging-configuration
    # tracebacks_show_locals: False

loggers:
  kedro:
    level: INFO

root:
  handlers: [rich]
```
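(A hedged aside: if the goal is only to find out whether a file handler is active, one option is to skip the YAML entirely and inspect the handlers the standard-library `logging` module actually has configured by the time a hook runs. A minimal sketch, assuming logging has already been set up, which it is before `after_context_created` fires.)

```python
# Sketch: look at the live handlers instead of re-reading any YAML.
# With the default config above (RichHandler only) this returns an empty list.
import logging


def find_log_file_paths() -> list:
    paths = []
    for logger in (logging.getLogger(), logging.getLogger("kedro")):
        for handler in logger.handlers:
            if isinstance(handler, logging.FileHandler):
                if handler.baseFilename not in paths:
                    paths.append(handler.baseFilename)
    return paths
```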
h
so i just added this to the CONFIG_LOADER_ARGS: `"config_patterns": {"logging": ["logging*", "logging*/**", "**/logging*"]}`. However, logging doesn't show up as a variable on the context, but for example mlflow does
i can see that it is added to the config loader's config_patterns
but it still is not getting loaded
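(For reference, the `settings.py` change being described presumably looks something like the sketch below; the pattern does get registered on the config loader, but in 0.19 nothing in the framework reads a `logging` key from it, which is why it never shows up on the context.)

```python
# settings.py -- sketch of the attempted change. The "logging" pattern is
# registered, but 0.19 no longer loads logging configuration through the
# config loader, so this alone has no effect on logging.
CONFIG_LOADER_ARGS = {
    "config_patterns": {
        "logging": ["logging*", "logging*/**", "**/logging*"],
    }
}
```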
n
https://kedro--3544.org.readthedocs.build/en/3544/resources/migration.html#logging
`logging.yml` is now independent of Kedro's run environment and is used only if `KEDRO_LOGGING_CONFIG` is set to point to it. The documentation on logging describes in detail how logging works in Kedro and how it can be customised.
You will need to explicitly point the variable to the path. Logging is not included in `config_loader` anymore; part of the reason is that we need logging configured earlier than the config loader, and almost no one has an environment-specific `logging.yml`. The default location is `conf/logging.yml`.
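(One hedged way to recover the active configuration from inside a project is simply to re-read the file the environment variable points to. A minimal sketch; the `conf/logging.yml` fallback is an assumption based on the default location mentioned above, and if neither exists, Kedro's built-in `default_logging.yml` is what is in effect.)

```python
# Sketch: re-read the logging config from the file KEDRO_LOGGING_CONFIG
# points to, falling back to conf/logging.yml. Returns None if no file-based
# config is in use (i.e. Kedro's built-in default applies).
import os
from pathlib import Path
from typing import Optional

import yaml


def load_logging_config() -> Optional[dict]:
    path = Path(os.environ.get("KEDRO_LOGGING_CONFIG", "conf/logging.yml"))
    if path.is_file():
        return yaml.safe_load(path.read_text())
    return None
```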
h
hmm, i think there might be a miscommunication on my part then. what i am trying to do is access the logging configuration in a hook. i understand i can set the logging configuration using an environment variable, but regardless of how it has been set, i need to know in the hook what it has been set to. so if kedro logs to a file, i need to know its path. after_context_created is the earliest hook, so if logging is set up earlier that's fine, but i want to know how i can access kedro's logging configuration inside of a hook to do error reporting.
n
In that sense it doesn't exist in the hook, but it can be read in a few ways:
1. Just read the file again from `KEDRO_LOGGING_CONFIG` with a yaml.load call.
2. It should be stored in the `logging` module.
3. Use `LOGGING`, which can be imported from `kedro.framework.project`; it should be a dict-like object, so you can read from it.
Actually I think I prefer 3., but I haven't tested this myself.
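(A minimal sketch of option 3, untested as noted above, assuming `LOGGING` behaves like a dict mirroring the logging config Kedro applied; the hook class name is hypothetical.)

```python
# Sketch of option 3 (untested, per the message above): treat LOGGING as a
# dict-like view of the logging config Kedro applied at startup.
from kedro.framework.hooks import hook_impl
from kedro.framework.project import LOGGING


class LogLocationHooks:
    @hook_impl
    def after_context_created(self, context) -> None:
        handlers = LOGGING.get("handlers", {})
        # "file" only exists if a file handler was configured, e.g. via a
        # custom logging.yml pointed to by KEDRO_LOGGING_CONFIG.
        self.log_file_path = handlers.get("file", {}).get("filename")
```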
h
3. sounds like the best option, ill try
thanks!
n
let me know if it works! and thanks for the question, it's a good shout. I don't think this is documented anywhere because we didn't anticipate users needing this, but I think it is a valid use case
h
yes, that works, although i do have to say that the environment variable is quite inconvenient when deploying, so i think i'll look for a solution where i capture stdout to a tempfile or something like that if i were to publish this as an actual hook.
n
> although i do have to say that the environment variable is quite inconvenient when deploying
Can you elaborate on this?
h
when i set the conf and env for deployment, that sets everything at the same time. but now i have to set that one environment variable somewhere, either in a build step or somewhere in python using os. the tricky part is that i try to maintain dev-prod parity as much as possible so i can debug prod locally. this is actually quite simple using vscode's debug configurations. but if i use the environment variable in the build step, or some .env file locally, that config no longer tracks with which environment i'm emulating (or i need to also modify it in a sh script that then calls the kedro run). the other solution is that i modify the variable using os in python, but since i use a modified runner for sending kedro nodes as jobs, i can't do it there, because the deployed job uses the default runner. so i'd need to create a new runner that activates when the env changes and then sets the os variable (and that only works if it hooks in early enough)
but i understand it is a little bit niche, the default logging is fine, and i'll just capture stdout to a tempfile
and make that part of the hook to manage
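(A hedged sketch of that idea, though capturing log records via a handler rather than stdout itself; the class name and the reporting step are hypothetical.)

```python
# Sketch (hypothetical hook): route log records to a temp file from inside
# the hook, so an error report always has a log file to attach, regardless
# of how KEDRO_LOGGING_CONFIG is set. Captures log records, not raw stdout.
import logging
import tempfile

from kedro.framework.hooks import hook_impl


class TempFileLogCaptureHooks:
    def __init__(self):
        self._log_path = None

    @hook_impl
    def after_context_created(self, context) -> None:
        tmp = tempfile.NamedTemporaryFile(
            prefix="kedro_run_", suffix=".log", delete=False
        )
        self._log_path = tmp.name
        handler = logging.FileHandler(self._log_path)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
        )
        # Attach to the root logger; kedro loggers propagate to it by default.
        logging.getLogger().addHandler(handler)

    @hook_impl
    def on_pipeline_error(self, error: Exception) -> None:
        with open(self._log_path) as f:
            tail = f.readlines()[-50:]
        # Hypothetical: summarise `error` + tail and send it to Slack here.
        ...
```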
n
That's fair. if it would automatically load `conf/logging.yml` (or CONF_BASE/logging), would that be sufficient? So the environment variable would still be possible, just that if you keep the file in the default location you don't have to specify it.
note that stdout will not capture the Kedro run log
https://stackoverflow.com/questions/58971197/why-python-logging-writes-to-stderr-by-default I discussed this with someone else before. I wish `kedro run | grep <x>` would work, but by default Python logs go to stderr, not stdout
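(A quick plain-Python illustration of that default, nothing Kedro-specific.)

```python
# logging.StreamHandler writes to stderr unless given an explicit stream,
# which is why piping stdout misses the run log.
import logging
import sys

handler = logging.StreamHandler()    # no stream argument
print(handler.stream is sys.stderr)  # True
print(handler.stream is sys.stdout)  # False
```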
h
i do this: `screen -S "$SESSION_NAME" -X stuff "script -q -c '$KEDRO_RUN_COMMAND' '$LOGS_DIRECTORY/$SESSION_NAME.log' "`
but those files can get huge
i think the default should be that it reads from base if it is there, and merges/overrides with env/logging.yml if it is there
and then you could have an environment variable if you like? but that seems a little redundant maybe?
i only know my use case of course
n
https://github.com/kedro-org/kedro/issues/3446 If you have any opinion, I'd appreciate some comments on the issue. I am interested in improving this, but it isn't the current priority yet.
> i think the default should be that it reads from base if it is there, and merges/overrides with env/logging.yml if it is there
It would most likely be an override rather than a merge, which is more consistent if you migrate from 0.18 and with the other configuration that is read from `config_loader`; plus users don't really get to see `default_logging_config.yml`.
It may seem redundant, but IIRC there were some conflicting features, so this is what we came up with at the time:
1. logging isn't part of the config loader anymore, because we want it to be loaded earlier
2. configuration can be a "zip" file read by the config loader (`kedro package` ships the config as a compressed config + python project)
So in case 2, if the logging file is inside the zip, you cannot read it without initialising the config loader. It's likely a niche case, and thus #3446 should be the default and make the 95% case work. The environment variable is only needed if you need more control.
h
hmm, okay, in that case i would say an environment variable introduces another thing to manage. when i use kedro i specifically steer away from using environment variables for anything kedro-related; instead i make it part of the config loader (for example using dynaconf). so for logging, if the zip case is very niche and you want to support use cases for hooks, then i would expose it in python via the config loader. but environment variables that point to file paths are a source of error for me, and something i try to avoid as much as possible