
Implementation of Bayesian Optimization in Hyperparameter Tuning #374

Open
malihamashkoor123 opened this issue Mar 3, 2023 · 2 comments

Comments

@malihamashkoor123

Is there a way to implement the Bayesian Optimization method of hyperparameter tuning in the (R-based) PLP pipeline?

@egillax
Collaborator

egillax commented Mar 10, 2023

Hi @malihamashkoor123,

Can you tell us more about your use case, and why you think Bayesian Optimization (BO) would help there over grid search?

This is something @jreps and I have been discussing, mostly for deep learning models, where searching through all combinations is too computationally expensive to be feasible.

This is something I'm definitely interested in, but adding it could be nontrivial. Currently the enumeration of the hyperparameter search space is done upfront in each modelSettings function, using either grid search or random search. That wouldn't work for Bayesian optimization, since each new candidate depends on the results observed so far, so the main training loop would need to be refactored to accommodate a sequential search.
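To illustrate the refactoring involved, here is a minimal Python sketch contrasting the two control flows. The objective function and the "suggest near the best point so far" rule are toy stand-ins (a real BO implementation would fit a surrogate model and optimize an acquisition function), but the loop structure is the part that matters:

```python
import random

def objective(lr):
    # Toy stand-in for training and evaluating a PLP model at one
    # hyperparameter setting; minimum at lr = 0.07.
    return (lr - 0.07) ** 2

# Current style: the whole search space is enumerated upfront,
# so every candidate is known before any model is trained.
grid = [0.001, 0.01, 0.1, 1.0]
grid_results = {lr: objective(lr) for lr in grid}

# Sequential style required by BO: each candidate is chosen only
# after seeing the results of the previous evaluations.
random.seed(0)
history = [(0.5, objective(0.5))]
for _ in range(20):
    best_lr, _ = min(history, key=lambda p: p[1])
    # Placeholder suggestion rule: perturb the best point so far.
    # Real BO would maximize an acquisition function here.
    candidate = max(1e-4, best_lr + random.gauss(0, 0.1))
    history.append((candidate, objective(candidate)))

best_lr, best_val = min(history, key=lambda p: p[1])
```

The key structural difference is that in the second loop the model-fitting call sits *inside* the candidate-selection loop, whereas the current grid/random search only needs it after enumeration is complete.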

There is also the question of which specific algorithm to use: is there a clear state-of-the-art choice? Are accessible implementations available through R packages? Implementing such an algorithm from scratch is also non-trivial.
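For what it's worth, one piece of most BO algorithms is simple to implement: the widely used expected-improvement acquisition function has a closed form given a surrogate's predictive mean and standard deviation. A minimal Python sketch (minimization convention; the surrogate model itself is assumed, not shown):

```python
import math

def expected_improvement(mu, sigma, f_best):
    """Expected improvement of a candidate over the best observed value
    f_best, given the surrogate's predictive mean mu and standard
    deviation sigma at that candidate (lower objective is better)."""
    if sigma <= 0:
        # No predictive uncertainty: improvement is just the gap, if any.
        return max(f_best - mu, 0.0)
    z = (f_best - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2)))         # standard normal CDF
    return (f_best - mu) * cdf + sigma * pdf

# A candidate predicted to be much better than f_best has EI close to
# the predicted gap; one predicted much worse has EI near zero.
```

The hard part is the surrogate (typically a Gaussian process) rather than the acquisition function; in R, CRAN packages such as mlrBO/mlrMBO, ParBayesianOptimization, and rBayesianOptimization provide complete implementations that could be evaluated before writing anything from scratch.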

If you have any experience in using BO I'd be interested in hearing about it.

@malihamashkoor123
Author

Hi @egillax,

Thank you for your reply.

I don't have any experience with BO. However, since BO is an informed search method and is generally preferred over grid search, I was mainly curious how it could be implemented together with the other models (e.g. Random Forest) in the PLP pipeline, and whether using BO would make any difference to the models' performance.

Additionally, considering the PLP pipeline and the complexity of BO (compared to grid and random search), I also believe that implementing BO from scratch in the PLP pipeline would be a better option than using the available R packages.
