#questions

Ricardo Araújo

03/02/2023, 6:26 PM
Hi y'all! Say I have a very standard pipeline like this: `get-data -> train-model -> evaluate-model`. Now, the model can be any of sklearn's models, all with the same interface. What I'd like to do is, from a list of models specified in `parameters`, run many instances of this pipeline, each with one model from the list (of course, I'd like the pipelines to run in parallel). I can use modular pipelines to instantiate the pipeline many times, but I'm not sure how to use the model list in the parameters file. Any ideas?
Data is the same for all models.

Vassilis Kalofolias

03/03/2023, 8:24 AM
I think this is what you need: https://github.com/datajoely/modular-spaceflights/tree/main/src/modular_spaceflights/pipelines/modelling Check also the function `new_modeling_pipeline` here.
In this example, the list of model types is hardcoded in `pipeline_registry`. If you want this to be read from the parameters, I assume that you need to use a hook such as `after_catalog_created` to make sure that parameters are already parsed when `pipeline_registry` runs (I haven't done this, though).

Ricardo Araújo

03/03/2023, 11:52 PM
Thanks @Vassilis Kalofolias! That was helpful. I managed to read from the parameters by programmatically importing it using `ConfigLoader`.