Short Definition
Gradient descent minimizes loss by iteratively updating parameters.
Definition
Gradient descent is an optimization method that adjusts a model's weights and biases in the direction that reduces loss. The gradient of the loss with respect to each parameter measures how a small change in that parameter affects the loss, so stepping opposite the gradient lowers it.
By repeatedly applying small updates, the model gradually improves its predictions.
Why It Matters
Gradient descent is the engine that drives learning in neural networks: backpropagation supplies the gradients, and gradient descent uses them to update every parameter.
How It Works (Conceptually)
- Compute gradients
- Move parameters opposite the gradient
- Repeat until convergence
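The three steps above can be sketched in plain Python. This is a minimal illustration, not a neural-network implementation: it assumes a hypothetical one-dimensional loss f(w) = (w - 3)^2, chosen so the gradient can be written by hand.

```python
# Minimize f(w) = (w - 3)^2 with gradient descent.
# The gradient is f'(w) = 2 * (w - 3), so the minimum is at w = 3.

def gradient(w):
    return 2 * (w - 3)

w = 0.0              # starting parameter value
learning_rate = 0.1

for step in range(100):
    g = gradient(w)           # 1. compute the gradient
    w -= learning_rate * g    # 2. move opposite the gradient
                              # 3. repeat until convergence

print(round(w, 4))  # → 3.0
```

Each step shrinks the distance to the minimum by a constant factor (here 0.8), so the parameter converges geometrically toward w = 3.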
Minimal Python Example
Python
# One update step: move the weight opposite its gradient.
weight -= learning_rate * gradient
Common Pitfalls
- Learning rate too high (updates overshoot and can diverge) or too low (training crawls)
- Expecting immediate convergence; loss typically falls gradually over many steps
- Ignoring noisy gradients, which can make the loss fluctuate between steps
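The first pitfall is easy to demonstrate on the same hypothetical one-dimensional loss f(w) = (w - 3)^2. For this quadratic, each update multiplies the error (w - 3) by (1 - 2 * learning_rate), so any rate above 1.0 makes every step overshoot and grow the error:

```python
# Toy loss f(w) = (w - 3)^2; its gradient is 2 * (w - 3).
def run(learning_rate, steps=20):
    w = 0.0
    for _ in range(steps):
        w -= learning_rate * 2 * (w - 3)
    return abs(w - 3)  # distance from the minimum

print(run(0.1))   # small rate: error shrinks toward 0
print(run(1.1))   # too-high rate: error blows up
```

Real losses are not quadratic, but the same qualitative behavior holds: the safe learning-rate range depends on the curvature of the loss.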
Related Concepts
- Learning Rate
- Loss Function
- Optimization