#questions

Marc Gris

08/21/2023, 7:09 AM
Hi everyone, I hope you’ve all had a nice weekend 🙂 I’m sure this is doable, but I haven’t found many “leads” in the documentation. How could I have a “*per-run log file*” that would be saved as an artifact of the pipeline in the data dir? Thx in advance. Regards, Marc
🤔 1

marrrcin

08/21/2023, 7:30 AM
That’s a nice question!
I would like to see the answer 🤔

Marc Gris

08/21/2023, 8:50 AM
Oh… I must confess that I’m flattered that you find the question interesting 🙂 Let’s see if someone has a straightforward suggestion / solution !

datajoely

08/21/2023, 8:52 AM
I think this is easiest with an `after_pipeline_run` hook, you have access to the `session_id`

Marc Gris

08/21/2023, 3:01 PM
Thx @datajoely for your answer and suggestion. But I remain a little bit confused. If I remember correctly, `session_id` is a datetime-like `str`, correct? If so… what am I supposed to do with it to access the run’s log? Thx in advance for your time, advice and patience 🙏🏼 🙂
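One possible reading of the hook suggestion, as a minimal sketch rather than a confirmed Kedro recipe: attach a file handler named after the `session_id` when the run starts and detach it when the run ends. The class name `PerRunLogHooks` and the `log_dir` default are hypothetical, and the sketch assumes `run_params` carries a `session_id` key.

```python
import logging
from pathlib import Path

try:
    from kedro.framework.hooks import hook_impl
except ImportError:
    # Fallback so the sketch also runs where Kedro is not installed.
    def hook_impl(func):
        return func


class PerRunLogHooks:
    """Hypothetical hook class: one log file per run, named after session_id."""

    def __init__(self, log_dir="data/08_reporting/logs"):
        self._log_dir = Path(log_dir)  # assumed location, adjust as needed
        self._handler = None

    @hook_impl
    def before_pipeline_run(self, run_params):
        # Assumption: run_params contains a "session_id" entry.
        # Colons are replaced defensively because they are invalid on Windows.
        session_id = str(run_params["session_id"]).replace(":", ".")
        self._log_dir.mkdir(parents=True, exist_ok=True)
        self._handler = logging.FileHandler(
            self._log_dir / f"{session_id}.log", encoding="utf-8"
        )
        self._handler.setFormatter(
            logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
        )
        # Attach to the root logger so propagated records reach the file.
        logging.getLogger().addHandler(self._handler)

    @hook_impl
    def after_pipeline_run(self, run_params):
        self._detach()

    @hook_impl
    def on_pipeline_error(self, run_params):
        self._detach()

    def _detach(self):
        if self._handler is not None:
            logging.getLogger().removeHandler(self._handler)
            self._handler.close()
            self._handler = None
```

In a Kedro project this would be registered in `settings.py` with `HOOKS = (PerRunLogHooks(),)`. One caveat: only records emitted after `before_pipeline_run` fires end up in the file, so session start-up messages are missed.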

datajoely

08/21/2023, 3:02 PM
oh we’re talking about the logging config 🤔

Marc Gris

08/21/2023, 5:00 PM
@datajoely can it be done via logging config? 🙂 I’ve looked into the docs but haven’t found anything. And I confess that I didn’t have time to dive into rich’s docs… I was hoping that someone more advanced would have a “quick & easy” solution 😜

datajoely

08/21/2023, 5:02 PM
so I was wondering about this then got sucked into something else
I wonder if you can set up the logging formatter to do this
but from a lifecycle point of view I think we initialise the logging config before the session is initialised
I really don’t know here
it’s a thin team today will see if others have any ideas tomorrow

Marc Gris

08/22/2023, 8:20 AM
🙏🏼 🙂
Hi @datajoely 🙂 I was going through my old conversation / threads and saw that this one was still “pending”. Any updates / insights ? Thx in advance, M

datajoely

11/03/2023, 9:11 AM
Good question - let me raise this

Juan Luis

11/03/2023, 9:19 AM
this technically is similar to log rotation right? https://docs.python.org/3/library/logging.handlers.html#rotatingfilehandler but not quite, because we'd want to have 1 file per run, whereas log rollover happens when the size of the log goes over a certain threshold...
one can force manual rollover https://stackoverflow.com/a/44636428/554319 but I don't think this is possible just tweaking the logging configuration 🤔
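For reference, the manual rollover idea from that Stack Overflow answer can be sketched with the standard library alone. The helper name `make_per_run_handler` is made up for illustration, and this lives in Python code rather than in the logging configuration, which is exactly the limitation mentioned above.

```python
import logging
import os
from logging.handlers import RotatingFileHandler


def make_per_run_handler(filename, backup_count=5):
    """Return a handler that starts a fresh file each time it is created."""
    # Check before constructing the handler, because the constructor
    # creates the file if it does not exist yet.
    had_previous_log = os.path.isfile(filename)
    # maxBytes=0 disables size-based rollover, so files only rotate here.
    handler = RotatingFileHandler(
        filename, maxBytes=0, backupCount=backup_count, encoding="utf-8"
    )
    if had_previous_log:
        # Renames filename -> filename.1 (shifting older backups up)
        # and reopens an empty filename for the current run.
        handler.doRollover()
    return handler
```

With this, the current run always writes to `filename`, the previous run sits in `filename.1`, and anything older than `backup_count` runs is dropped.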

datajoely

11/03/2023, 9:24 AM
@Juan Luis are `OmegaConf` resolvers available in `logging.yml`? Could you do something like
info_file_handler:
    class: logging.handlers.RotatingFileHandler
    level: INFO
    formatter: simple
    filename: info_${oc.env:KEDRO_SESSION_ID}.log
💡 1

Juan Luis

11/03/2023, 9:26 AM
this is a clever trick - but then if the filename uses the session id, there's no need to do manual rotation right? and also, `$KEDRO_SESSION_ID` wouldn't normally be set, or am I missing something?

datajoely

11/03/2023, 9:32 AM
yeah you’d have to set it in a `before_pipeline_run` hook
and I’m not sure if that would even work tbh

Juan Luis

11/03/2023, 9:33 AM
continuing with the resolver idea, could there be a `${kedro:session_id}` resolver that got it from the session?

datajoely

11/03/2023, 9:33 AM
this would be very very useful
especially if we also make the session_id configurable
man I’m so excited by what OmegaConf is able to do

Marc Gris

11/03/2023, 9:37 AM
🍿 Thanks guys for considering this question / request 🙂

Juan Luis

11/03/2023, 9:49 AM
how would such a resolver be defined though? resolvers are added to omegaconf in `settings.py`, it's unclear to me how to fetch the global Kedro session or context 🤔

datajoely

11/03/2023, 10:04 AM
omegaconf could fetch an ENV var that doesn’t exist yet

Nok Lam Chan

11/03/2023, 10:14 AM
Not sure if I missed anything - the standard library `logging` should support this and you just need to define the filepath to save in `data` instead of the root directory.

datajoely

11/03/2023, 10:17 AM
how do you define the filepath dynamically for each run?

Nok Lam Chan

11/03/2023, 10:28 AM
So you also need to match `session_id`? Otherwise there are easy ways to rotate the files (pretty sure it was asked a few months ago). Using `OmegaConfigLoader` may not be ideal because logging is decoupled from the config loader in 0.19, so it will be just a simple `yaml.load`

Ivan Danov

11/03/2023, 10:42 AM
@Marc Gris could you please share why you want to do this? And what log do you want to keep, the entire log from the logger or certain logs? What is the use case you are solving?
👍 1

Marc Gris

11/03/2023, 4:15 PM
Hi @Ivan Danov Thanks for your message. Actually, I had posted the question 2 months ago. Therefore, I hope you won’t mind if I can’t remember the exact use case… 😅 However, I can remember the “spirit”: striving to apply MLOps best practices, it seemed relevant to me to keep track of as many things as possible regarding my experiments etc… The log of a run being information-rich, it naturally seemed to me worth persisting & versioning “individually”… Thanks again & have a nice weekend, M.

datajoely

11/03/2023, 4:16 PM
if we added the session id to the actual log lines and switched to JSON logging, perhaps that’s a good way of doing this? It would work nicely in ELK for example
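A rough sketch of what "session id in every log line plus JSON logging" could look like with only the standard library. `SessionIdFilter` and `JsonFormatter` are hypothetical names, not existing Kedro classes, and how the session id reaches the filter is left open here.

```python
import json
import logging


class SessionIdFilter(logging.Filter):
    """Stamp every record passing through the handler with a session id."""

    def __init__(self, session_id):
        super().__init__()
        self.session_id = session_id

    def filter(self, record):
        record.session_id = self.session_id
        return True  # never drop the record, only annotate it


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record):
        payload = {
            "time": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "session_id": getattr(record, "session_id", None),
            "message": record.getMessage(),
        }
        return json.dumps(payload)
```

Attaching both to a single handler (`handler.addFilter(SessionIdFilter(...))` and `handler.setFormatter(JsonFormatter())`) keeps all runs in one file while still letting a log store filter by `session_id`.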

Juan Luis

11/03/2023, 9:16 PM
yeah I think it's pretty reasonable to desire 1 log file per run 🤔 and so far seems like the best solution in Kedro is `kedro run > log_$(date).txt`

Ivan Danov

11/06/2023, 9:53 AM
https://stackoverflow.com/questions/44635896/how-to-create-a-new-log-file-every-time-the-application-runs One can achieve that by doing what is suggested in this stackoverflow post ☝️. Best option is creating a custom logging handler, extending the built-in `RotatingFileHandler`, which will always create a new file. The path and the name of that file can be configured through `logging.yaml`. Making that file have the same name as the session id is not that easy indeed, but doable with a few hacks. I think the work on this will make it much easier once it's done: https://github.com/kedro-org/kedro/issues/1731
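A minimal sketch of such a custom handler, assuming a start-up timestamp in the file name is an acceptable stand-in for the session id. `NewFilePerRunHandler` and `configure_logging` are hypothetical names, and the dict passed to `dictConfig` mirrors what a logging YAML file would contain.

```python
import logging
import logging.config
import os
from datetime import datetime
from logging.handlers import RotatingFileHandler


class NewFilePerRunHandler(RotatingFileHandler):
    """Hypothetical handler: puts a start-up timestamp into the file name."""

    def __init__(self, filename, **kwargs):
        root, ext = os.path.splitext(filename)
        stamp = datetime.now().strftime("%Y-%m-%dT%H.%M.%S.%f")
        os.makedirs(os.path.dirname(root) or ".", exist_ok=True)
        # e.g. logs/info.log -> logs/info_2023-11-06T09.53.00.000000.log
        super().__init__(f"{root}_{stamp}{ext}", **kwargs)


def configure_logging(filename):
    logging.config.dictConfig({
        "version": 1,
        "disable_existing_loggers": False,
        "formatters": {
            "simple": {
                "format": "%(asctime)s - %(name)s - %(levelname)s - %(message)s"
            }
        },
        "handlers": {
            "info_file_handler": {
                # "()" takes a factory; the remaining keys become its kwargs.
                "()": NewFilePerRunHandler,
                "level": "INFO",
                "formatter": "simple",
                "filename": filename,
                "encoding": "utf-8",
            }
        },
        "root": {"level": "INFO", "handlers": ["info_file_handler"]},
    })
```

In a YAML logging config the handler would instead be referenced by its dotted import path under `class:` (for example a module inside your own project package), with `filename` pointing somewhere under the data directory.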
🙏🏼 2