# questions
m
Hi everyone, I hope you’ve all had a nice weekend 🙂 I’m sure this is doable, but I haven’t found many leads in the documentation. How could I have a “*per-run log file*” that would be saved as an artifact of the pipeline in the data dir? Thx in advance. Regards, Marc
m
That’s a nice question!
I would like to see the answer 🤔
m
Oh… I must confess that I’m flattered that you find the question interesting 🙂 Let’s see if someone has a straightforward suggestion / solution!
d
I think this is easiest with an `after_pipeline_run` hook: you have access to the `session_id`
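A minimal sketch of that suggestion, for the record. The source log filename (`info.log`) and the destination folder (`data/08_reporting`) are assumptions to adapt to your project’s `logging.yml`; the hook only needs the `session_id` from `run_params`:

```python
# Hedged sketch: copy the run's log file into the data directory after each
# run, named by session_id. "info.log" and "data/08_reporting" are assumptions.
import shutil
from pathlib import Path

try:
    from kedro.framework.hooks import hook_impl
except ImportError:  # let the sketch run without Kedro installed
    def hook_impl(func):
        return func

class PerRunLogHook:
    @hook_impl
    def after_pipeline_run(self, run_params):
        session_id = run_params["session_id"]  # a datetime-like string
        src = Path("info.log")  # wherever your file handler writes (assumed)
        dst = Path("data/08_reporting") / f"run_{session_id}.log"
        if src.exists():
            dst.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy(src, dst)
```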
m
Thx @datajoely for your answer and suggestion. But I remain a little bit confused. If I remember correctly, `session_id` is a datetime-like `str`, correct? If so… what am I supposed to do with it to access the run’s log? Thx in advance for your time, advice and patience 🙏🏼 🙂
d
oh we’re talking about the logging config 🤔
m
@datajoely can it be done via logging config? 🙂 I’ve looked into the docs but haven’t found anything. And I confess that I didn’t have time to dive into Rich’s docs… I was hoping that someone more advanced would have a “quick & easy” solution 😜
d
So I was wondering about this, then got sucked into something else.
I wonder if you can set up the logging formatter to do this.
But from a lifecycle point of view, I think we initialise the logging config before the session is initialised.
I really don’t know here.
It’s a thin team today; I’ll see if others have any ideas tomorrow.
m
🙏🏼 🙂
Hi @datajoely 🙂 I was going through my old conversations / threads and saw that this one was still “pending”. Any updates / insights? Thx in advance, M
d
Good question - let me raise this
j
this is technically similar to log rotation, right? https://docs.python.org/3/library/logging.handlers.html#rotatingfilehandler but not quite, because we'd want to have 1 file per run, whereas rollover happens when the size of the log goes over a certain threshold...
one can force a manual rollover (https://stackoverflow.com/a/44636428/554319), but I don't think this is possible just by tweaking the logging configuration 🤔
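For reference, the manual-rollover trick from that answer looks roughly like this; the filename and `backupCount` here are arbitrary choices, not anything Kedro-specific:

```python
# Hedged sketch: rotating on startup means each invocation begins a fresh log
# file, with previous runs kept as numbered backups ("app.log" is arbitrary).
import logging
import logging.handlers

handler = logging.handlers.RotatingFileHandler("app.log", backupCount=5)
handler.doRollover()  # force rotation now: app.log -> app.log.1, then reopen

logger = logging.getLogger("demo")
logger.addHandler(handler)
logger.warning("this run starts with an empty app.log")
```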
d
@Juan Luis are `OmegaConf` resolvers available in `logging.yml`? Could you do something like
```yaml
info_file_handler:
    class: logging.handlers.RotatingFileHandler
    level: INFO
    formatter: simple
    filename: info_${oc.env:KEDRO_SESSION_ID}.log
```
j
this is a clever trick - but then if the filename uses the session id, there's no need to do manual rotation, right? and also, `$KEDRO_SESSION_ID` wouldn't normally be set, or am I missing something?
d
yeah, you’d have to set it in a `before_pipeline_run` hook
and I’m not sure if that would even work tbh
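For illustration, such a hook could look like this; whether `logging.yml` would pick the variable up in time is exactly the open question above, since logging is configured before the session starts:

```python
# Hedged sketch: export the session id as an environment variable so that
# logging.yml could read it via ${oc.env:KEDRO_SESSION_ID}. Caveat from the
# thread applies: this may run too late relative to logging setup.
import os

try:
    from kedro.framework.hooks import hook_impl
except ImportError:  # let the sketch run without Kedro installed
    def hook_impl(func):
        return func

class ExportSessionIdHook:
    @hook_impl
    def before_pipeline_run(self, run_params):
        os.environ["KEDRO_SESSION_ID"] = run_params["session_id"]
```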
j
continuing with the resolver idea, could there be a `${kedro:session_id}` resolver that got it from the session?
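A purely hypothetical sketch of what registering such a resolver in `settings.py` could look like. `${kedro:session_id}` does not exist in Kedro, and the stand-in `_session_info` dict papers over the open question in this thread of how to reach the session from there:

```python
# Hypothetical only: a "kedro" OmegaConf resolver. The _session_info dict is a
# stand-in for whatever mechanism would actually expose the running session.
_session_info = {"session_id": "2024-01-01T00.00.00.000Z"}  # stand-in value

def kedro_resolver(key: str) -> str:
    """Look up a session attribute, e.g. ${kedro:session_id}."""
    return _session_info[key]

try:
    from omegaconf import OmegaConf
    OmegaConf.register_new_resolver("kedro", kedro_resolver)
except ImportError:
    pass  # OmegaConf not installed; the resolver function alone is shown
```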
d
this would be very very useful
especially if we also make the session_id configurable
man I’m so excited by what OmegaConf is able to do
m
🍿 Thanks guys for considering this question / request 🙂
j
how would such a resolver be defined though? resolvers are added to OmegaConf in `settings.py`; it's unclear to me how to fetch the global Kedro session or context 🤔
d
OmegaConf could fetch an env var that doesn’t exist yet
n
Not sure if I missed anything: the standard library `logging` should support this, and you just need to define the filepath to save in `data` instead of the root directory.
d
how do you define the filepath dynamically for each run?
n
So you also need to match the `session_id`? Otherwise there are easy ways to rotate the files (pretty sure it was asked a few months ago). Using `OmegaConfigLoader` may not be ideal because it is decoupled from the config loader in 0.19; it will be just a simple `yaml.load`
i
@Marc Gris could you please share why you want to do this? And what log do you want to keep, the entire log from the logger or certain logs? What is the use case you are solving?
m
Hi @Ivan Danov Thanks for your message. Actually, I had posted the question 2 months ago. Therefore, I hope you won’t mind if I can’t remember the exact use case… 😅 However, I can remember the “spirit”: striving to apply MLOps best practices, it seemed to me relevant to keep track of as many things as possible regarding my experiments etc… The log of a run being information-rich, it naturally seemed to me worth persisting & versioning “individually”… Thanks again & have a nice weekend, M.
d
if we added the session id to the actual log lines and switched to JSON logging, perhaps that's a good way of doing this? It would work nicely in ELK, for example
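A rough sketch of that idea: a `logging.Formatter` that emits JSON with a `session_id` field stamped on every record. The field names are made up, not an established Kedro format:

```python
# Hedged sketch: stamp every record with the session id so log lines from one
# run can be grouped in ELK or similar. Field names are assumptions.
import json
import logging

class SessionJsonFormatter(logging.Formatter):
    def __init__(self, session_id: str):
        super().__init__()
        self.session_id = session_id

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "ts": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "session_id": self.session_id,
        })
```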
j
yeah, I think it's pretty reasonable to want 1 log file per run 🤔 and so far it seems like the best solution in Kedro is `kedro run > log_$(date).txt`
i
https://stackoverflow.com/questions/44635896/how-to-create-a-new-log-file-every-time-the-application-runs One can achieve that by doing what is suggested in this Stack Overflow post ☝️. The best option is creating a custom logging handler, extending the built-in `RotatingFileHandler`, which will always create a new file. The path and the name of that file can be configured through `logging.yaml`. Making that file have the same name as the session id is not that easy indeed, but doable with a few hacks. I think the work on this will make it much easier once it's done: https://github.com/kedro-org/kedro/issues/1731