Short Definition
Bayesian optimization uses probabilistic models to guide hyperparameter search.
Definition
Bayesian optimization is a sequential hyperparameter optimization technique that builds a probabilistic surrogate model of the relationship between hyperparameters and model performance, then uses that model to select promising configurations to evaluate next.
Unlike grid or random search, which pick configurations independently of past results, Bayesian optimization uses every previous evaluation to balance exploring uncertain regions of the search space against exploiting regions already known to perform well.
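As a toy illustration of that balance, a lower-confidence-bound rule scores each candidate by its predicted mean minus a multiple of its predicted uncertainty. A minimal sketch, where the candidate values and the kappa weight are made up for illustration:

import numpy as np

# Surrogate predictions for three candidate configurations (illustrative):
# predicted validation loss (mean) and predictive uncertainty (std).
mean = np.array([0.30, 0.28, 0.35])
std = np.array([0.01, 0.02, 0.15])

# Lower confidence bound for minimization: a low mean rewards exploitation,
# a high std rewards exploration; kappa trades the two off.
kappa = 2.0
lcb = mean - kappa * std

best = int(np.argmin(lcb))
print(f"Pick candidate {best} (LCB = {lcb[best]:.3f})")
# With kappa = 2.0 the third, most uncertain candidate wins even though
# its predicted mean is the worst: that is exploration at work.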
Why It Matters
Training models can be expensive. Bayesian optimization reduces the number of required evaluations by focusing on areas of the search space likely to yield improvements.
It is especially useful for large models and costly training runs.
How It Works (Conceptually)
- Fit a surrogate model (e.g., a Gaussian process) to all evaluations so far
- Use the surrogate to predict performance, with uncertainty, for unseen configurations
- Select the next configuration by maximizing an acquisition function
- Evaluate it and update the surrogate with the new result
Each iteration adds an observation, so the surrogate grows more accurate and later proposals become better targeted.
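To make the acquisition step concrete, here is a sketch of expected improvement (EI) for minimization, computed from the surrogate's predicted mean and standard deviation. The function name and the xi exploration margin are our own illustrative choices, not taken from a particular library:

import numpy as np
from scipy.stats import norm

def expected_improvement(mean, std, best_so_far, xi=0.01):
    # EI for minimization: the expected gain over the best result so far,
    # given the surrogate's mean and std for each candidate.
    std = np.maximum(std, 1e-12)           # avoid division by zero
    improvement = best_so_far - mean - xi  # predicted gain over the incumbent
    z = improvement / std
    return improvement * norm.cdf(z) + std * norm.pdf(z)

# Two candidates with the same predicted mean but different uncertainty:
ei = expected_improvement(mean=np.array([0.30, 0.30]),
                          std=np.array([0.01, 0.10]),
                          best_so_far=0.29)
print(ei)  # the more uncertain candidate scores higher EI

The second candidate wins because its larger uncertainty leaves more room for a pleasant surprise, which is the exploration term doing its job.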
Minimal Python Example
# Core step (pseudocode): pick the candidate that maximizes the acquisition function
next_params = max(candidates, key=acquisition)
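The stub above compresses the whole loop into one call. Below is a fuller, self-contained sketch that uses scikit-learn's GaussianProcessRegressor as the surrogate; the toy objective, the candidate grid, and the repeated expected_improvement helper are illustrative choices, not a canonical implementation:

import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def objective(x):
    # Toy "validation loss" standing in for an expensive training run.
    return np.sin(3 * x) + 0.1 * x ** 2

def expected_improvement(mean, std, best_so_far, xi=0.01):
    std = np.maximum(std, 1e-12)
    improvement = best_so_far - mean - xi
    z = improvement / std
    return improvement * norm.cdf(z) + std * norm.pdf(z)

rng = np.random.default_rng(0)
candidates = np.linspace(-2.0, 2.0, 200).reshape(-1, 1)

# A few random evaluations to seed the surrogate.
X = rng.uniform(-2.0, 2.0, size=(3, 1))
y = objective(X).ravel()

for _ in range(10):
    # 1. Fit the surrogate to every evaluation so far
    #    (small alpha adds jitter for numerical stability).
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5),
                                  alpha=1e-6, normalize_y=True)
    gp.fit(X, y)
    # 2. Score all candidates with the acquisition function.
    mean, std = gp.predict(candidates, return_std=True)
    ei = expected_improvement(mean, std, best_so_far=y.min())
    # 3. Evaluate the most promising one and grow the data set.
    x_next = candidates[np.argmax(ei)].reshape(1, -1)
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next).ravel())

print(f"best x = {X[y.argmin(), 0]:.3f}, best loss = {y.min():.3f}")

On a real problem the objective would train and validate a model and the candidates would span the actual hyperparameter space, but the fit / score / evaluate / update cycle is the same.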
Common Pitfalls
- High overhead for simple problems
- Misconfigured surrogate models
- Assuming guaranteed optimal results
- Ignoring noise in evaluation metrics (see the snippet after this list)
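On the last pitfall: if the validation metric is noisy, the surrogate should model that noise rather than chase it. With scikit-learn's GaussianProcessRegressor, one minimal sketch is to add a WhiteKernel so the GP fits a noise level from the data; the kernel choice and bounds below are illustrative:

from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern, WhiteKernel

# WhiteKernel lets the GP learn a noise level instead of treating every
# fluctuation in the metric as real signal.
kernel = Matern(nu=2.5) + WhiteKernel(noise_level=1e-2,
                                      noise_level_bounds=(1e-6, 1e1))
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
# After gp.fit(X, y), the fitted noise level appears in gp.kernel_.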
Related Concepts
- Hyperparameter Optimization
- Random Search
- Uncertainty Estimation
- Optimization
- Training Cost