Batch Normalization

Short Definition

Batch normalization standardizes a layer's activations using the mean and variance of the current mini-batch.

Definition

Batch normalization standardizes each layer's activations using the mean and variance computed over the current mini-batch. This reduces internal covariate shift (the change in the distribution of a layer's inputs as earlier layers update during training) and stabilizes gradient flow.

In addition to normalization, batch normalization introduces learnable scale (gamma) and shift (beta) parameters, so the network can rescale, or even undo, the normalization where that helps, retaining its expressive power.

Why It Matters

Batch normalization allows higher learning rates, speeds up convergence, and makes training more stable and less sensitive to weight initialization.

How It Works (Conceptually)

  • Compute the mean and variance of each feature over the mini-batch
  • Normalize activations to zero mean and unit variance
  • Apply a learned scale and shift to the normalized values
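The steps above can be sketched in NumPy. This is a minimal illustration of the training-time forward pass; the function name `batch_norm_train` and the shapes are illustrative, not a library API:

```python
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-5):
    """Standardize a batch of activations, then apply learned scale/shift.

    x: (batch_size, num_features); gamma, beta: (num_features,).
    """
    mean = x.mean(axis=0)                    # per-feature batch mean
    var = x.var(axis=0)                      # per-feature batch variance
    x_hat = (x - mean) / np.sqrt(var + eps)  # zero mean, unit variance
    return gamma * x_hat + beta              # learned scale and shift

rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(32, 4))
out = batch_norm_train(x, gamma=np.ones(4), beta=np.zeros(4))
print(out.mean(axis=0))  # per-feature mean is ~0, std is ~1
print(out.std(axis=0))
```

With gamma = 1 and beta = 0 this reduces to pure standardization; in a real layer, gamma and beta are trained by gradient descent along with the other weights.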

Minimal Python Example

eps = 1e-5  # small constant to avoid division by zero
normalized = (x - batch_mean) / (batch_std + eps)

Common Pitfalls

  • Using very small batch sizes, which make the batch statistics noisy
  • Using batch statistics instead of running statistics at inference
  • Confusing it with input (data) normalization, which is applied once to the dataset
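The inference pitfall has a standard remedy: track running estimates of the mean and variance during training and use those, not the current batch's statistics, at test time. A minimal sketch, assuming a momentum-style exponential moving average (the class name `SimpleBatchNorm` and the momentum value are illustrative, not a library API):

```python
import numpy as np

class SimpleBatchNorm:
    """Toy 1-D batch norm that keeps running statistics for inference."""

    def __init__(self, num_features, eps=1e-5, momentum=0.1):
        self.gamma = np.ones(num_features)         # learned scale
        self.beta = np.zeros(num_features)         # learned shift
        self.running_mean = np.zeros(num_features)
        self.running_var = np.ones(num_features)
        self.eps = eps
        self.momentum = momentum

    def __call__(self, x, training):
        if training:
            mean, var = x.mean(axis=0), x.var(axis=0)
            # exponential moving average of the batch statistics
            m = self.momentum
            self.running_mean = (1 - m) * self.running_mean + m * mean
            self.running_var = (1 - m) * self.running_var + m * var
        else:
            # inference: use accumulated statistics, not the batch's
            mean, var = self.running_mean, self.running_var
        x_hat = (x - mean) / np.sqrt(var + self.eps)
        return self.gamma * x_hat + self.beta

bn = SimpleBatchNorm(4)
rng = np.random.default_rng(0)
for _ in range(100):  # "training" passes update the running stats
    bn(rng.normal(2.0, 3.0, size=(64, 4)), training=True)
single = rng.normal(2.0, 3.0, size=(1, 4))
out = bn(single, training=False)  # well-defined even for batch size 1
```

Because inference uses the stored statistics, the output no longer depends on which other examples happen to share the batch, and single-example prediction works.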

Related Concepts

  • Normalization
  • Training Stability
  • Deep Networks