# plugins-integrations
g
Hi Team! Anyone ever played with hyperparameter tuning frameworks within Kedro? I have found several scattered pieces of info related to this topic, but no complete solutions. Ultimately, what I would like to set up is a way to have multiple nodes running at the same time, all contributing to the same tuning experiment. I would prefer using Optuna, and this is how I would go about it based on what I have found online:
1. Create a node that creates an Optuna study.
2. Create N nodes that each run hyperparameter tuning in parallel. Each of them loads the Optuna study, and if using kedro-mlflow, each hyperparameter trial can be logged into its own nested run.
3. Create a final node that processes the results of all tuning nodes.
Does this sound reasonable to you? Has anyone produced such a Kedro workflow already? I would love to see what it looks like. I am also wondering:
• I am thinking of creating an OptunaStudyDataset for the Optuna study. Has anyone attempted this already?
• For creating the N tuning nodes, I am thinking of using the approach presented in the GetInData blog post on dynamic pipelines. Would this be the recommended approach?
Thanks!
h
Someone will reply to you shortly. In the meantime, this might help:
j
for now the semi-official approach is the blog post you mentioned - how was that process by the way? any pros and cons you saw?
I think some folks have tried to use Optuna w/ Kedro in the past
🥳 1
g
Do you mean it is semi-official because there's not yet an official approach? Is there any discussion I could follow?
I have not tried implementing it yet, for now it seems reasonable to me but I am asking because I am trying to understand the pros and cons. Once I get to it, happy to give some feedback (and maybe even some simple code example).
h
Hey, I created a setup for this some time ago, where I use an Optuna study dataset and a YAML configuration loader, so you can set all the trial parameters in your conf. If you'd like, we can discuss?
g
Hi @Hugo Evers! Yes, that would be super nice, thank you!
@Juan Luis I just tried the dynamic pipeline setup. It's actually very similar to what I have been doing so far, except I use native YAML inheritance instead of the OmegaConfLoader merge resolver with the custom _overrides keys. (BTW, those keys show up when you do kedro catalog list.) I feel it looks much neater. Is there any drawback to doing it that way? Let me give you an example. Blog post parameter file:
study_params:
  study_name: test
  load_if_exists: true
  direction: maximize
  n_trials_per_process: 10

price_predictor:
  _overrides:
    study_name: price_predictor_base
  study_params: ${merge:${study_params},${._overrides}}

  base:
    study_params: ${..study_params}

  candidate1:
    _overrides:
      study_name: price_predictor_candidate1
    study_params: ${merge:${..study_params},${._overrides}}

  candidate2:
    _overrides:
      study_name: price_predictor_candidate2
    study_params: ${merge:${..study_params},${._overrides}}

  candidate3:
    _overrides:
      study_name: price_predictor_candidate3
    study_params: ${merge:${..study_params},${._overrides}}

reviews_predictor:
  _overrides:
    study_name: reviews_predictor_base
  study_params: ${merge:${study_params},${._overrides}}

  base:
    study_params: ${..study_params}

  test1:
    _overrides:
      study_name: reviews_predictor_test1
    study_params: ${merge:${..study_params},${._overrides}}
Using the native YAML inheritance:
study_params: &base_study_params
  study_name: test
  load_if_exists: true
  direction: maximize
  n_trials_per_process: 10

price_predictor: 
  base: 
    study_params: &price_predictor_base_study_params
      <<: *base_study_params
      study_name: price_predictor_base

  candidate1:
    study_params:
      <<: *price_predictor_base_study_params
      study_name: price_predictor_candidate1

  candidate2:
    study_params:
      <<: *price_predictor_base_study_params
      study_name: price_predictor_candidate2

  candidate3:
    study_params:
      <<: *price_predictor_base_study_params
      study_name: price_predictor_candidate3

reviews_predictor:
  base: 
    study_params: &reviews_predictor_base_study_params
      <<: *base_study_params
      study_name: reviews_predictor_base

  test1:
    study_params:
      <<: *reviews_predictor_base_study_params
      study_name: reviews_predictor_test1
Happy to hear your thoughts on this!
j
It's actually very similar to what I have been doing so far except I use native YAML inheritance instead of the OmegaConfLoader merge resolver with the custom _overrides.
I do prefer the YAML merge keys version actually 😄 @marrrcin any thoughts?
m
It will not work if you want to override some parts, no?
g
@marrrcin I am not sure I understand. 😅 Do you have an example?
m
So if you have:
study_params: &base_study_params
  study_name: test
  load_if_exists: true
  direction: maximize
  n_trials_per_process: 10
and then
reviews_predictor:
  base: 
    study_params: &reviews_predictor_base_study_params
      <<: *base_study_params
      study_name: reviews_predictor_base
Is study_name correctly overwritten? If so, then it's super cool!
g
I can confirm it is:
yaml.safe_load(open("./test.yml"))
Out[5]: 
{'study_params': {'study_name': 'test',
  'load_if_exists': True,
  'direction': 'maximize',
  'n_trials_per_process': 10},
 'reviews_predictor': {'base': {'study_params': {'study_name': 'reviews_predictor_base',
    'load_if_exists': True,
    'direction': 'maximize',
    'n_trials_per_process': 10}}}}
Here is my test.yml:
study_params: &base_study_params
  study_name: test
  load_if_exists: true
  direction: maximize
  n_trials_per_process: 10

reviews_predictor:
  base: 
    study_params:
      <<: *base_study_params
      study_name: reviews_predictor_base
I also think it's quite cool, and I like the fact that it's native YAML, so it's completely Kedro-independent.
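One drawback worth keeping in mind, though (a general YAML fact rather than anything Kedro-specific): merge keys do a shallow merge, so a nested mapping in the child replaces the base mapping wholesale, whereas OmegaConf's merge is recursive. A quick check with PyYAML, using made-up keys:

```python
import yaml

doc = """
base: &base
  params:
    a: 1
    b: 2
  direction: maximize
child:
  <<: *base
  params:
    a: 10
"""

data = yaml.safe_load(doc)
# direction is inherited from the anchor, but the nested params mapping is
# replaced entirely: child.params has no "b" key, because <<: is shallow.
print(data["child"])
```

As long as the overridable blocks stay flat (like study_params here), this never bites; with deeper nesting the OmegaConf merge resolver keeps the untouched leaves while merge keys drop them.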
this 2