# questions
m
I remember asking myself this almost 2 years ago 🙂 : is it possible to store the parameters of every run somewhere (versioned) - basically persist the in-memory dataset that holds the parameters file? I couldn’t find anything here: https://docs.kedro.org/en/stable/configuration/parameters.html#parameters
d
This is basically what kedro-mlflow does
But you can use hooks to do it yourself
m
I see - is that not a super common use case if I run pipelines (locally)?
d
It is super common, many people use mlflow for this
👍 1
But you can also use a before_pipeline_run hook to generate a JSON file or something
👍 1
m
I will try to implement this, but I get the feeling that this could be so much easier if it was just a named dataset in the DataCatalog. I strictly do not want to use mlflow as I am building a minimal example
d
It gets complicated because as soon as you’re in a multi-person team this stuff needs to be shared and synchronized
A Kedro hook is your best bet I think
Just write the session id and associated parameters to a file that you append to
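A minimal sketch of the append-to-a-file idea, using only the standard library. The function name `append_run_record`, the record layout, and the JSON Lines format are assumptions made for illustration and are not part of Kedro; a hook would call it with the session id and the loaded parameters.

```python
import json
from datetime import datetime, timezone
from pathlib import Path
from typing import Any


def append_run_record(
    path: Path, session_id: str, parameters: dict[str, Any]
) -> dict[str, Any]:
    """Append one JSON line describing a run and return the record written."""
    record = {
        "session_id": session_id,
        "logged_at": datetime.now(timezone.utc).isoformat(),
        "parameters": parameters,
    }
    # Create the target folder on first use so the append never fails on a fresh checkout.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        # default=str stringifies values json cannot encode (dates, paths, ...).
        fh.write(json.dumps(record, default=str) + "\n")
    return record
```

One line per run keeps the file easy to diff and to load later, for example with `pandas.read_json(path, lines=True)`.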
m
ok understood
Would I need to save it myself, or can I interact with the catalog?
d
so you can do either
but you'd have to use the catalog Python API to do this
m
Ah this really helps!
Thanks
Alternative solution from grand maestro Antony is also good:
from typing import Any

def save_parameters(parameters: dict[str, Any]) -> dict[str, Any]:
    return parameters
Just define a node that goes first
d
yup that works!
the only thing that's missing is the session_id
so it's not linked to a run
but that works well
you could also log the git hash
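For the git hash, one possible approach is to shell out to `git rev-parse HEAD`. The helper name `current_git_hash` is hypothetical, and returning `None` when git or a repository is unavailable is a design choice made for this sketch so that a missing repository never aborts a run.

```python
import subprocess
from typing import Optional


def current_git_hash(repo_dir: str = ".") -> Optional[str]:
    """Return the HEAD commit hash, or None if git or a repository is unavailable."""
    try:
        result = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        # OSError: git is not installed or repo_dir does not exist.
        # CalledProcessError: repo_dir is not inside a git repository.
        return None
    return result.stdout.strip()
```

The returned value can be stored next to the session id and parameters, so each logged run points back to the code that produced it.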
n
Would also recommend kedro-mlflow, it doesn't need to be deployed and can easily be run locally. Alternatively, if you just want to save this as JSON or something like that, use a Kedro hook
m
Hm I see - tbh I really liked the node solution because it seems so natural
👍🏼 1
I have actually gone back to the hook solution. I can get the session id easily, but how do I access the parameters? run_params seems to be different
n
run_params is what you provided to override
you will just get it from catalog["parameters"], I think
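Putting the pieces together, a hook along these lines might work. This is a sketch under assumptions: the class name `ParameterLoggingHooks` and the default log path are made up, it uses `catalog.load("parameters")` rather than item access because `load` is the long-standing catalog method, and the key holding the run identifier in `run_params` is assumed to be `session_id` with `run_id` as a fallback for other Kedro versions. The `try/except` around the import only exists so the snippet can be imported without Kedro installed.

```python
import json
from pathlib import Path
from typing import Any

try:
    from kedro.framework.hooks import hook_impl
except ImportError:
    def hook_impl(func):  # no-op stand-in so the sketch imports without Kedro
        return func


class ParameterLoggingHooks:
    """Append the resolved parameters of each run to a JSON Lines file."""

    def __init__(self, log_path: str = "data/08_reporting/run_params.jsonl") -> None:
        self._log_path = Path(log_path)

    @hook_impl
    def before_pipeline_run(self, run_params: dict[str, Any], catalog: Any) -> None:
        record = {
            # Assumption: the identifier lives under "session_id" (or "run_id").
            "session_id": run_params.get("session_id") or run_params.get("run_id"),
            # "parameters" is the catalog entry holding the merged parameters.
            "parameters": catalog.load("parameters"),
        }
        self._log_path.parent.mkdir(parents=True, exist_ok=True)
        with self._log_path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps(record, default=str) + "\n")
```

To activate it, the hook would be registered in the project's `settings.py`, for example `HOOKS = (ParameterLoggingHooks(),)`.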
m
@Merel @datajoely: wrote up my experience on this task in the following issue, hopefully this is a good datapoint: https://github.com/kedro-org/kedro/issues/4571
👍🏼 1