Gradient descent

Short Definition

Gradient descent minimizes loss by iteratively updating parameters.

Definition

Gradient descent is an optimization method that adjusts a model's weights and biases in the direction that reduces the loss. It uses the gradient of the loss with respect to each parameter to determine both the direction and the relative size of each change: parameters move a small step opposite the gradient, scaled by the learning rate.

By repeatedly applying small updates, the model gradually improves its predictions.
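A single update can be seen on a toy one-dimensional loss. This is a minimal sketch, assuming the loss L(w) = (w - 3)^2, whose gradient is 2(w - 3); the loss function, learning rate, and starting point are all illustrative choices, not part of any particular framework.

```python
# Single gradient-descent step on a toy loss L(w) = (w - 3)^2.
# The gradient dL/dw = 2 * (w - 3) points uphill, so we subtract it.

def loss(w):
    return (w - 3.0) ** 2

def grad(w):
    return 2.0 * (w - 3.0)

w = 0.0                     # initial parameter
lr = 0.1                    # learning rate (step size)
w_new = w - lr * grad(w)    # one update step

print(loss(w), loss(w_new))  # loss drops: 9.0 -> 5.76
```

One step does not reach the minimum at w = 3; it only moves the parameter closer, which is why the updates must be repeated.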

Why It Matters

Gradient descent is the engine that drives learning in neural networks: backpropagation computes the gradients, and gradient descent uses them to update every weight.

How It Works (Conceptually)

  • Compute the gradient of the loss with respect to each parameter
  • Move each parameter a small step opposite its gradient, scaled by the learning rate
  • Repeat until the loss stops improving
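The steps above can be sketched as a loop. This is an illustrative sketch, again assuming the toy loss L(w) = (w - 3)^2; the convergence tolerance and step budget are arbitrary choices.

```python
def loss(w):
    return (w - 3.0) ** 2

def grad(w):
    return 2.0 * (w - 3.0)

w = 0.0      # initial parameter
lr = 0.1     # learning rate

for step in range(200):
    g = grad(w)
    if abs(g) < 1e-6:   # converged: gradient is effectively zero
        break
    w -= lr * g         # move opposite the gradient

# w ends up very close to the minimizer w* = 3
```

The loop stops when the gradient is near zero, i.e. when no direction of small movement reduces the loss further.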

Minimal Python Example

Python
# One update step: move the parameter against the gradient, scaled by the learning rate
weight -= learning_rate * gradient
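In practice the weights form a vector (or tensor), and the same one-line update applies elementwise. A sketch with NumPy, using made-up parameter and gradient values purely for illustration:

```python
import numpy as np

learning_rate = 0.01
weights = np.array([0.5, -1.2, 2.0])    # current parameters (illustrative values)
gradients = np.array([0.4, -0.8, 1.6])  # hypothetical gradients of the loss

weights -= learning_rate * gradients    # elementwise update

print(weights)  # [ 0.496 -1.192  1.984]
```

Each component moves independently: entries with positive gradients decrease, entries with negative gradients increase.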

Common Pitfalls

  • Learning rate too high (overshooting or divergence) or too low (very slow progress)
  • Expecting immediate convergence; the loss usually falls gradually and unevenly
  • Ignoring noisy gradients, e.g. from small mini-batches
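The first pitfall is easy to demonstrate. This sketch reuses the toy loss L(w) = (w - 3)^2 (an assumption for illustration): a moderate learning rate converges, while one that is too large makes each step overshoot the minimum by more than it gains, so the iterates diverge.

```python
def grad(w):                 # gradient of the toy loss L(w) = (w - 3)^2
    return 2.0 * (w - 3.0)

def run(lr, steps=50):
    """Run gradient descent from w = 0 and return the final parameter."""
    w = 0.0
    for _ in range(steps):
        w -= lr * grad(w)
    return w

print(run(0.1))   # ends near the minimizer w* = 3
print(run(1.1))   # overshoots further each step and blows up
```

For this quadratic, each step multiplies the error (w - 3) by (1 - 2 * lr), so any lr above 1.0 grows the error instead of shrinking it.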

Related Concepts

  • Learning Rate
  • Loss Function
  • Optimization