Bayesian Optimization - Objective Function Model Plot Explained

Katarina Vuckovic on 12 Apr 2023
Commented: the cyclist on 13 Apr 2023
Hello,
Can someone help me interpret the Bayesian Optimization plot? What are all the different things plotted here? Specifically, what do the items in the legend mean (observed points, next point, etc.)?

Answers (1)

the cyclist on 12 Apr 2023
Edited: the cyclist on 12 Apr 2023
Step 1: Understand the purpose of the objective function
Your ultimate goal, if you are using this plot, is presumably to build the most accurate mathematical model possible (meaning the one that will perform best when you apply it to a new set of data).
The objective function measures model performance. Specifically, the model with the smallest value of the objective function is your best guess at the best model.
So, from lots and lots of possible models, we are trying to minimize the objective function.
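As a rough illustration of what such an objective function can look like, here is a minimal sketch. It assumes Statistics and Machine Learning Toolbox and its bundled ionosphere data set, and it uses a cross-validated SVM purely as an example model. The names objFcn, box, and sigma are illustrative choices (box and sigma match the hyperparameter names in the question's figure), not something taken from the original poster's code.

```matlab
% Sketch only: one possible objective function for model tuning.
% Assumes Statistics and Machine Learning Toolbox and the bundled
% ionosphere data; the SVM setup is an illustrative assumption.
load ionosphere                 % X: predictors, Y: class labels
rng default                     % reproducible cross-validation folds
c = cvpartition(Y,'KFold',5);   % fixed folds shared by every candidate

% z is a one-row table with variables box and sigma. The function
% returns the cross-validated misclassification rate (lower is better).
objFcn = @(z) kfoldLoss(fitcsvm(X,Y,'CVPartition',c, ...
    'KernelFunction','rbf', ...
    'BoxConstraint',z.box,'KernelScale',z.sigma));

% Score one candidate model by hand.
candidate = table(1,1,'VariableNames',{'box','sigma'});
loss = objFcn(candidate)
```

Each call to objFcn trains and scores one model, so comparing the returned values is what "smaller objective means better model" refers to.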
Step 2: Understand hyperparameters of a model
When you have an ensemble of models (i.e. different models with different parameters to fit), you not only have to fit each model, but there are also hyperparameters that have to do with the machine learning process itself. An exhaustive definition of hyperparameters is probably not that useful here. A good explanation can be found on the Wikipedia page for hyperparameters.
The point is that to find the best possible model, we need to "tune", or optimize, these hyperparameters. In your first figure above, box and sigma are the hyperparameters.
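In MATLAB, hyperparameters to be tuned are declared with optimizableVariable. The sketch below assumes that box and sigma are positive and searched on a log scale over the range shown; the actual ranges behind the figure in the question are not known.

```matlab
% Sketch only: declare the two hyperparameters to tune.
% The ranges and the log transform are assumptions for illustration.
box   = optimizableVariable('box',  [1e-5,1e5],'Transform','log');
sigma = optimizableVariable('sigma',[1e-5,1e5],'Transform','log');
vars  = [box,sigma];   % this array is what bayesopt searches over
```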
Step 3: Understand the optimization process
In that same figure, each blue circle represents one run of a model, that occurs at a chosen value of the hyperparameters. That one model run gives a value of the objective function (i.e. an estimate of the model performance), for that combination of (box,sigma).
MATLAB runs models, one after another, at different values of the hyperparameters (each time calculating the objective function, and therefore an estimate of model performance). Each blue circle is a model result, at one (box,sigma) location.
As MATLAB runs more and more models, it builds the red surface, which is its best estimate of the objective function at any pair of (box,sigma) values, including pairs it has not tried yet.
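For reference, a run that produces this kind of plot might look like the sketch below. It assumes Statistics and Machine Learning Toolbox, the bundled ionosphere data, and an RBF-kernel SVM as the model being tuned; the data set, ranges, and number of evaluations are illustrative assumptions rather than the original poster's setup.

```matlab
% Sketch only: run Bayesian optimization and show the objective model plot.
load ionosphere                 % X: predictors, Y: class labels
rng default                     % reproducible folds and search
c = cvpartition(Y,'KFold',5);

box   = optimizableVariable('box',  [1e-5,1e5],'Transform','log');
sigma = optimizableVariable('sigma',[1e-5,1e5],'Transform','log');

% Objective: cross-validated loss of one SVM at one (box,sigma) point.
objFcn = @(z) kfoldLoss(fitcsvm(X,Y,'CVPartition',c, ...
    'KernelFunction','rbf', ...
    'BoxConstraint',z.box,'KernelScale',z.sigma));

% Each evaluation adds one observed point (a blue circle) to the plot.
results = bayesopt(objFcn,[box,sigma], ...
    'MaxObjectiveEvaluations',30, ...
    'PlotFcn',{@plotObjectiveModel,@plotMinObjective});

% One row per evaluated point: the hyperparameters and the objective value.
observed = [results.XTrace, ...
    table(results.ObjectiveTrace,'VariableNames',{'Objective'})];
```

The observed table holds the same information as the blue circles: where each model was run and what objective value it returned.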
Step 4: Find the (estimated) minimum value of the objective function
After MATLAB has run many models, at many (box,sigma) values, you can use the lowest point on the red surface as an estimate of the minimum value of the objective function (i.e. the best model), and also know which values of the hyperparameters will give that best model.
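The estimated minimum can also be read from the results object instead of the plot. The sketch below uses a cheap analytic stand-in for the objective (not a real model) so that it runs quickly; the stand-in, the ranges, and the choice of the 'min-mean' criterion as the counterpart of "lowest point on the red surface" are assumptions.

```matlab
% Sketch only: read the estimated and observed minima from the results.
rng default
box   = optimizableVariable('box',  [1e-5,1e5],'Transform','log');
sigma = optimizableVariable('sigma',[1e-5,1e5],'Transform','log');

% Hypothetical stand-in objective with its minimum at box = 10, sigma = 1.
standIn = @(z) (log10(z.box) - 1)^2 + log10(z.sigma)^2;

results = bayesopt(standIn,[box,sigma], ...
    'IsObjectiveDeterministic',true, ...
    'MaxObjectiveEvaluations',20,'Verbose',0,'PlotFcn',[]);

% Lowest point of the fitted model mean (the surface).
[xMean,fMean] = bestPoint(results,'Criterion','min-mean')

% Best point that was actually evaluated (lowest observed value).
xObs = results.XAtMinObjective
fObs = results.MinObjective
```

Comparing fMean with fObs gives a feel for how much the model's estimate differs from the best value actually observed.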
Addenda:
  • The contour plot at the bottom of the figure is just a different way of displaying the same info.
  • If the models you are building have only one hyperparameter, MATLAB will not display this 3-D plot.
  • Similarly, if your model has more than two hyperparameters, it will not show this plot (unless you select two to display, I believe).
  • The black circle in the plot (while MATLAB is actually calculating, and updating the plot), is the model it is running at the moment (which will turn blue when it is done and has become "observed")
  • The red asterisk indicates the minimum I mentioned in Step 4, the estimated best feasible model
  2 Comments
Katarina Vuckovic on 13 Apr 2023
Edited: Katarina Vuckovic on 13 Apr 2023
Thank you. Follow-up question: how do you interpret the "model mean"?
the cyclist on 13 Apr 2023
I can't think of a reason why one would use the "model mean" surface directly. I suppose it might be useful for seeing whether the minimum of that surface is very shallow (i.e. a wide range of hyperparameters gives nearly equal model performance), or whether there seem to be multiple local minima.
In practice, I've only ever used the estimated "best so far" objective function.
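If you do want to inspect the model mean numerically, for example to check how shallow the minimum is, predictObjective evaluates it at any points you supply. This sketch again uses a cheap analytic stand-in objective rather than a real model; the grid size and the 0.05 tolerance are arbitrary assumptions.

```matlab
% Sketch only: evaluate the model mean on a grid and check shallowness.
rng default
box   = optimizableVariable('box',  [1e-5,1e5],'Transform','log');
sigma = optimizableVariable('sigma',[1e-5,1e5],'Transform','log');

% Hypothetical stand-in objective with its minimum at box = 10, sigma = 1.
standIn = @(z) (log10(z.box) - 1)^2 + log10(z.sigma)^2;

results = bayesopt(standIn,[box,sigma], ...
    'IsObjectiveDeterministic',true, ...
    'MaxObjectiveEvaluations',20,'Verbose',0,'PlotFcn',[]);

% 25-by-25 grid of (box,sigma) points, evenly spaced in log space.
[b,s] = meshgrid(logspace(-5,5,25));
pts = table(b(:),s(:),'VariableNames',{'box','sigma'});

% Model mean and its standard deviation at every grid point.
[meanVals,sd] = predictObjective(results,pts);

% Fraction of grid points within 0.05 of the lowest model-mean value.
% A large fraction suggests a shallow, forgiving minimum.
nearMin = mean(meanVals <= min(meanVals) + 0.05)
```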
