Pruning

Short Definition

Pruning removes unimportant parameters from a neural network.

Definition

Pruning reduces model size by eliminating weights or neurons that contribute little to performance. This simplifies the model while preserving most of its predictive power.

Pruning can be applied during training or as a post-training optimization step.
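When pruning is applied during training, the target sparsity is often raised gradually rather than all at once. The sketch below shows one possible schedule, a cubic ramp from an initial to a final sparsity. The function name `sparsity_at_step` and all of its parameters are illustrative assumptions, not part of any specific library.

```python
def sparsity_at_step(step, start_step, end_step, final_sparsity, initial_sparsity=0.0):
    """Return the target fraction of pruned weights at a given training step.

    Assumed schedule: a cubic ramp that prunes quickly at first and
    more slowly as the final sparsity is approached.
    """
    if end_step <= start_step:
        raise ValueError("end_step must be greater than start_step")
    if step <= start_step:
        return initial_sparsity
    if step >= end_step:
        return final_sparsity
    # Fraction of the pruning window that has elapsed, in (0, 1).
    progress = (step - start_step) / (end_step - start_step)
    return final_sparsity + (initial_sparsity - final_sparsity) * (1.0 - progress) ** 3


# Example: ramp from 0% to 80% sparsity over 1000 steps.
for step in (0, 250, 500, 1000):
    print(step, round(sparsity_at_step(step, 0, 1000, 0.8), 4))
```

Post-training pruning corresponds to the degenerate case of jumping straight to the final sparsity in a single step.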

Why It Matters

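Smaller models need less memory and storage, which makes them cheaper and easier to deploy. They run faster only when the hardware or runtime can exploit the removed parameters; weights that are merely zeroed out still cost the same in a dense computation.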

How It Works (Conceptually)

  • Identify low-importance parameters
  • Remove or zero them out
  • Optionally retrain to recover performance
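The first two steps above can be sketched in plain Python. This is a minimal illustration that assumes importance is measured by absolute value (magnitude pruning) and that weights are stored in a flat list; the helper name `magnitude_prune` and the `sparsity` parameter are hypothetical.

```python
def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest |w|.

    Returns (pruned_weights, mask), where mask[i] is 1 if weight i is kept.
    """
    if not 0.0 <= sparsity <= 1.0:
        raise ValueError("sparsity must be in [0, 1]")
    n_prune = int(len(weights) * sparsity)
    # Step 1: rank indices by importance (assumed here: absolute value).
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_idx = set(order[:n_prune])
    # Step 2: zero out the lowest-ranked weights and record which ones survive.
    mask = [0 if i in pruned_idx else 1 for i in range(len(weights))]
    pruned = [w if m else 0.0 for w, m in zip(weights, mask)]
    return pruned, mask


weights = [0.8, -0.02, 0.3, 0.01, -0.6, 0.05]
pruned, mask = magnitude_prune(weights, sparsity=0.5)
print(pruned)  # [0.8, 0.0, 0.3, 0.0, -0.6, 0.0]
print(mask)    # [1, 0, 1, 0, 1, 0]
```

For the optional third step, the mask would typically be reapplied after every weight update during retraining so that pruned weights stay at zero.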

Minimal Python Example

weight, threshold = 0.03, 0.05  # example values
if abs(weight) < threshold:
    weight = 0.0  # magnitude below threshold: prune this weight

Common Pitfalls

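  • Pruning too aggressively, so that accuracy drops further than retraining can recover
  • Skipping retraining after pruning
  • Assuming a pruned model keeps or improves its accuracy without measuring it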
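One way to guard against these pitfalls is to measure accuracy at several sparsity levels and keep the highest level that stays within an acceptable drop. The sketch below assumes a caller-supplied `evaluate(sparsity)` function that prunes, retrains, and returns validation accuracy. Both `pick_sparsity` and `toy_accuracy` are hypothetical names, and `toy_accuracy` is a synthetic stand-in curve, not a measurement.

```python
def pick_sparsity(evaluate, levels, max_drop):
    """Return the highest sparsity whose accuracy is within `max_drop` of the dense model.

    `evaluate(sparsity)` is assumed to prune to that sparsity, retrain,
    and return accuracy. Falls back to 0.0 (no pruning) if no level qualifies.
    """
    baseline = evaluate(0.0)
    acceptable = [s for s in levels if baseline - evaluate(s) <= max_drop]
    return max(acceptable, default=0.0)


def toy_accuracy(sparsity):
    # Synthetic stand-in: accuracy degrades slowly at first, then sharply.
    return 0.90 - 0.5 * sparsity ** 4


levels = [0.3, 0.5, 0.7, 0.9]
print(pick_sparsity(toy_accuracy, levels, max_drop=0.01))  # 0.3
print(pick_sparsity(toy_accuracy, levels, max_drop=0.05))  # 0.5
```

In practice each call to `evaluate` is expensive, so the list of levels is usually kept short.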

Related Concepts

  • Sparse Neural Networks
  • Model Compression
  • Efficiency