# questions
m
Hello, I have a question about the right way to log pipeline errors to log files. For example, in `spaceflights-pandas` we have `info_file_handler` defined, which logs to the `info.log` file, but when a DatasetError is raised (for example, the dataset CSV file is missing), it is not logged in `info.log` (the traceback and error are visible only in the console). How can I make it be logged in the log file as well? I can always define a hook like this:
import logging
from kedro.framework.hooks import hook_impl

logger = logging.getLogger(__name__)
class ErrorCatchLoggingHook:
    @hook_impl
    def on_pipeline_error(self, error: Exception):
        logger.exception(f"Pipeline failed due to an error: {str(error)}")
but then the error log in the console is duplicated.
r
Hey, I tried this myself and you're right: the console was showing two error tracebacks, one from the hook and one from the `RichHandler` in the root logger. I was able to fix it by removing `rich` from `root.handlers` in `logging.yml` and keeping only `info_file_handler`.
n
The info file handler serves a different purpose than the `rich` or `console` handler. The traceback is normally not part of "logging": what you see on the console comes from the system exception hook (`sys.excepthook`), which rich modifies if you are using it.
If you absolutely don't want to see the traceback twice, you can:
1. Create your own logger and make sure it doesn't propagate to `root`; by default it does (I forgot the exact configuration, but you can google this).
2. Register the `info_file_handler` on that logger.
3. Use something like the hook you created, but with the custom logger.
m
okay thank you!