Hey Team QQ regarding versioning I think I am clear regardin Kedro #questions


Max S

12/09/2022, 10:26 AM

Hey Team, QQ regarding versioning. I think I am clear regarding versioned datasets. Searching the docs, I could not find anything regarding versioned parameters. Given that I trigger a pipeline run, I create versioned datasets (if I choose to do so), but can I also create a versioned save of the used parameters (from one or more yaml files)? Or am I thinking about this the wrong way and there is a good reason that this is not possible? Thanks!

Deepyaman Datta

12/09/2022, 6:53 PM

There's no automated way to do this, but parameters is an automatically-created MemoryDataSet, and you can persist the data to some physical versioned data set. Alternatively, you can use something like mlflow to log parameters for runs.
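For the mlflow route, a minimal sketch might look like the following. It assumes mlflow is installed and that the parameters arrive as a (possibly nested) dict; the helper names flatten_parameters and log_parameters_to_mlflow are hypothetical and not part of Kedro or mlflow. Nested keys are flattened to dotted names because mlflow stores parameters as flat key/value pairs.

```python
from typing import Any, Dict


def flatten_parameters(parameters: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten a nested parameters dict into dotted keys, e.g. {"model.lr": 0.1}."""
    flat: Dict[str, Any] = {}
    for key, value in parameters.items():
        name = f"{prefix}.{key}" if prefix else str(key)
        if isinstance(value, dict):
            # Recurse into nested sections of the parameters file.
            flat.update(flatten_parameters(value, name))
        else:
            flat[name] = value
    return flat


def log_parameters_to_mlflow(parameters: Dict[str, Any]) -> None:
    """Hypothetical helper: log the flattened parameters to the active mlflow run."""
    # Imported lazily so flatten_parameters stays usable without mlflow installed.
    import mlflow

    mlflow.log_params(flatten_parameters(parameters))
```

Such a helper could be called from a node that takes parameters as its input, or from a hook; which one fits better depends on the project.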

Max S

12/12/2022, 10:30 AM

Thanks for the advice! Say I want to go for the first option: how do I do that? I seem to be unable to add it to the catalogue with a filepath:


DataSetError: 
__init__() got an unexpected keyword argument 'filepath'.
DataSet 'parameters' must only contain arguments valid for the constructor of 'kedro.io.memory_dataset.MemoryDataSet'.

Deepyaman Datta

12/12/2022, 4:06 PM

You would have to create a node to write it to a dataset (e.g. with catalog entry name saved_parameters); you cannot control the parameters catalog entry itself.
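A rough sketch of that node, assuming a Kedro 0.18-style project: the function name save_parameters, the node name, and the file path in the commented catalog entry are illustrative assumptions, not something prescribed in this thread.

```python
from copy import deepcopy
from typing import Any, Dict

# Assumed catalog.yml entry for the output (dataset type and path are illustrative):
#
# saved_parameters:
#   type: yaml.YAMLDataSet
#   filepath: data/08_reporting/saved_parameters.yml
#   versioned: true


def save_parameters(parameters: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the run's parameters so Kedro can write it to a dataset."""
    return deepcopy(parameters)


def create_pipeline():
    """Wire the node: read the built-in 'parameters' entry, write 'saved_parameters'."""
    # Imported lazily so save_parameters stays usable without Kedro installed.
    from kedro.pipeline import Pipeline, node

    return Pipeline(
        [
            node(
                func=save_parameters,
                inputs="parameters",
                outputs="saved_parameters",
                name="save_parameters_node",
            )
        ]
    )
```

With versioned: true on the output entry, each run would write its own timestamped copy of the parameters alongside the other versioned datasets.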

Deepyaman Datta

12/12/2022, 4:07 PM

(well, I guess you can, by modifying the catalog Kedro creates and creating a CachedDataSet out of it or something, but unless you have a strong requirement for doing that, let's do it the simple way)
