Bayesian Optimization

Short Definition

Bayesian optimization uses probabilistic models to guide hyperparameter search.

Definition

Bayesian optimization is a sequential hyperparameter optimization technique that builds a probabilistic model of the relationship between hyperparameters and model performance. It uses this model to select promising configurations to evaluate next.

Unlike grid or random search, Bayesian optimization balances exploration of untested regions of the search space against exploitation of configurations already known to perform well.
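
This trade-off is usually made explicit by the acquisition function. As a minimal sketch, the upper confidence bound (UCB) criterion scores a candidate configuration by adding a multiple of the surrogate's predictive uncertainty to its predicted performance; kappa is an assumed, tunable exploration weight:

def ucb(mean, std, kappa=2.0):
    # A high predicted mean favors exploitation; a high predictive
    # standard deviation favors exploration. kappa sets the balance.
    return mean + kappa * std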

Why It Matters

Training models can be expensive. Bayesian optimization reduces the number of required evaluations by focusing on areas of the search space likely to yield improvements.

It is especially useful for large models and costly training runs.

How It Works (Conceptually)

  • Train a surrogate model (e.g., Gaussian process)
  • Estimate uncertainty over performance
  • Select the next configuration using an acquisition function
  • Update the surrogate model with new results

Each iteration refines the surrogate model, so later proposals are better informed than earlier ones.
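
A from-scratch sketch of this loop, assuming a one-dimensional search space, a toy objective standing in for an expensive training run, and scikit-learn's Gaussian process regressor as the surrogate:

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

rng = np.random.default_rng(0)

def objective(x):
    # Toy stand-in for an expensive training run; returns a noisy
    # validation score to be maximized.
    return -(x - 0.3) ** 2 + 0.05 * rng.standard_normal()

# Seed the surrogate with a few random evaluations.
X = rng.uniform(0, 1, size=(3, 1))
y = np.array([objective(x[0]) for x in X])

# alpha models observation noise so the surrogate does not
# interpolate random fluctuations exactly.
gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-2, normalize_y=True)

candidates = np.linspace(0, 1, 200).reshape(-1, 1)
for _ in range(10):
    gp.fit(X, y)                                  # update the surrogate
    mean, std = gp.predict(candidates, return_std=True)
    score = mean + 2.0 * std                      # UCB acquisition
    x_next = candidates[np.argmax(score)]         # most promising candidate
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next[0]))

print("best x:", X[np.argmax(y)][0])

Each pass through the loop is one round of the four steps above.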

Minimal Python Example

# Hypothetical helper: returns the configuration that maximizes the acquisition function.
next_params = propose_based_on_posterior()
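
For something actually runnable, the scikit-optimize library wraps the whole loop in a single call; a sketch, assuming a toy objective in place of a real training run:

from skopt import gp_minimize

def objective(params):
    # Toy objective standing in for an expensive training run:
    # maps a learning rate to a validation loss to minimize.
    lr = params[0]
    return (lr - 0.01) ** 2

result = gp_minimize(objective, dimensions=[(1e-4, 1e-1)], n_calls=20)
print(result.x, result.fun)  # best configuration and best loss found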

Common Pitfalls

  • High overhead for simple problems
  • Misconfigured surrogate models
  • Assuming guaranteed optimal results
  • Ignoring noise in evaluation metrics (see the sketch below)
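
The last pitfall has a concrete handle in most Gaussian process libraries. In scikit-learn, for instance, observation noise can be modeled through the alpha parameter or an explicit WhiteKernel, so the surrogate treats metric fluctuations as noise rather than signal; a minimal sketch:

from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern, WhiteKernel

# Model observation noise explicitly so the surrogate does not
# chase random fluctuations in the validation metric.
kernel = Matern(nu=2.5) + WhiteKernel(noise_level=1e-2)
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)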

Related Concepts

  • Hyperparameter Optimization
  • Random Search
  • Uncertainty Estimation
  • Optimization
  • Training Cost