Importance Sampling

Short Definition

Importance sampling is a technique that weights samples drawn from one distribution so that estimates reflect another, correcting mismatches or emphasizing important regions of the data.

Definition

Importance sampling is a statistical method in which samples are drawn from one distribution but weighted to reflect another target distribution. In machine learning, it is used to focus learning or estimation on more informative, rare, or high-impact samples while maintaining unbiased estimates under certain conditions.

Importance sampling changes how much each sample matters, not which samples exist.

Why It Matters

Uniform sampling treats all data points equally, even when some contribute more to learning or evaluation objectives. Importance sampling can improve efficiency, reduce variance, and correct sampling mismatches—especially under class imbalance or distribution shift.

When applied correctly, it accelerates learning without changing the underlying objective.

How Importance Sampling Works

Each sample is assigned a weight equal to the ratio p(x) / q(x) between the target density p and the sampling (proposal) density q. These weights rescale each sample's contribution during training or estimation, and the resulting estimate remains unbiased provided q(x) > 0 wherever p(x) > 0.

Weights compensate for non-uniform sampling.
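The weighting described above can be sketched as a small Monte Carlo estimate. The distributions here (a standard normal target, a wider normal proposal) and the function being averaged are illustrative assumptions, not part of any particular library:

```python
import math
import random

# Estimate E_p[f(x)] for a target distribution p while sampling from a
# proposal q. Here p is N(0, 1) and q is N(0, 2^2); f(x) = x^2, so the
# true value is Var_p(x) = 1.

def normal_pdf(x, mu, sigma):
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def importance_estimate(f, n=100_000, seed=0):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(0, 2)                             # sample from proposal q
        w = normal_pdf(x, 0, 1) / normal_pdf(x, 0, 2)   # weight = p(x) / q(x)
        total += w * f(x)
    return total / n

print(importance_estimate(lambda x: x * x))  # close to 1.0
```

Because the proposal is wider than the target, the weights stay bounded and the estimator is well behaved; the reverse choice (a narrow proposal, wide target) is where the variance problems discussed below appear.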

Minimal Conceptual Example

# conceptual importance weighting: weight = p(sample) / q(sample)
loss = weight * compute_loss(sample)

Common Use Cases

Importance sampling is commonly applied in:

  • handling class imbalance
  • rare event detection
  • off-policy evaluation and reinforcement learning
  • Monte Carlo estimation
  • correcting covariate shift
  • accelerating convergence during training

Its flexibility makes it widely applicable.
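As one illustration, the off-policy evaluation case from the list above can be sketched with toy policies and rewards; every name and value here is an assumption for illustration:

```python
import random

# Off-policy evaluation: a behavior policy logs actions, and we estimate
# the expected reward of a different target policy by weighting each
# logged reward by target_prob(action) / behavior_prob(action).

behavior = {"a": 0.5, "b": 0.5}   # policy that generated the data
target = {"a": 0.9, "b": 0.1}     # policy we want to evaluate
reward = {"a": 1.0, "b": 0.0}     # deterministic toy rewards

rng = random.Random(0)
logged = rng.choices(["a", "b"], weights=[behavior["a"], behavior["b"]], k=10_000)

est = sum(target[a] / behavior[a] * reward[a] for a in logged) / len(logged)
print(est)  # close to 0.9, the target policy's true expected reward
```

No new data is collected under the target policy; the weights alone shift the estimate from the behavior distribution to the target one.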

Importance Sampling vs Resampling

  • Importance sampling: reweights samples without duplication
  • Resampling: changes sample frequency via duplication or removal

Both alter the effective data distribution but in different ways.
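A minimal sketch of the contrast, assuming an imbalanced binary dataset where we want both classes to count equally:

```python
# 90% class 0, 10% class 1.
data = [0] * 90 + [1] * 10

# Importance sampling: keep every sample once, give class 1 a 9x weight.
weights = [9.0 if y == 1 else 1.0 for y in data]
effective_pos = sum(w for y, w in zip(data, weights) if y == 1) / sum(weights)
print(effective_pos)  # 0.5: classes now contribute equally

# Resampling: duplicate class-1 samples until the counts match.
resampled = [y for y in data if y == 0] + [y for y in data if y == 1] * 9
print(sum(resampled) / len(resampled))  # 0.5: same effective balance
```

Both routes reach the same effective distribution, but reweighting keeps the dataset size fixed, while resampling changes which rows the model actually sees.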

Benefits and Trade-offs

Benefits include:

  • improved sample efficiency
  • better focus on informative regions
  • unbiased estimation under correct weighting

Trade-offs include:

  • increased variance from extreme weights
  • sensitivity to weight estimation errors
  • numerical instability if weights are poorly controlled

Weight management is critical.
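Two common management strategies, weight clipping and self-normalization, can be sketched as follows; these are generic techniques written from scratch, not any particular library's API:

```python
def clip_weights(weights, max_weight=10.0):
    """Truncate extreme weights to bound variance (at the cost of some bias)."""
    return [min(w, max_weight) for w in weights]

def self_normalize(weights):
    """Divide by the total so the estimator becomes a weighted average."""
    total = sum(weights)
    return [w / total for w in weights]

raw = [0.5, 1.2, 80.0, 0.9]    # one extreme weight dominates
clipped = clip_weights(raw)
print(clipped)                  # [0.5, 1.2, 10.0, 0.9]
print(self_normalize(clipped))  # normalized weights summing to 1
```

Clipping trades a small, controlled bias for a large variance reduction; self-normalization keeps any single weight from overwhelming the estimate even when the densities are only known up to a constant.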

Common Pitfalls

  • using unbounded or highly skewed weights
  • applying importance sampling without validating assumptions
  • confusing importance weights with class weights
  • ignoring variance inflation effects

Importance sampling amplifies both signal and noise.
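One standard check for the variance inflation mentioned above is the effective sample size, ESS = (Σw)² / Σw²: uniform weights give ESS = n, while a few dominant weights drive ESS toward 1. A sketch:

```python
def effective_sample_size(weights):
    """ESS = (sum of weights)^2 / (sum of squared weights)."""
    return sum(weights) ** 2 / sum(w * w for w in weights)

print(effective_sample_size([1.0, 1.0, 1.0, 1.0]))    # 4.0: no inflation
print(effective_sample_size([100.0, 1.0, 1.0, 1.0]))  # near 1: one weight dominates
```

An ESS far below the nominal sample count is a signal that the weights are too skewed for the estimate to be trusted.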

Relationship to Generalization

Importance sampling can improve generalization by aligning training data with deployment distributions. However, if target distributions are misspecified, it can degrade performance and introduce bias.

Relationship to Training Dynamics

By emphasizing certain samples, importance sampling influences gradient magnitudes and optimization behavior. This can accelerate convergence but may destabilize training if not carefully tuned.
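The effect on gradient magnitudes can be seen in a toy one-parameter model with squared loss; the model and values are illustrative assumptions:

```python
def grad(theta, x, y, weight):
    """Gradient of weight * (theta * x - y)^2 with respect to theta."""
    return weight * 2 * (theta * x - y) * x

print(grad(0.0, 1.0, 1.0, 1.0))  # -2.0: unweighted gradient
print(grad(0.0, 1.0, 1.0, 5.0))  # -10.0: a 5x weight means a 5x step
```

A heavily weighted sample takes a proportionally larger optimization step, which is exactly why unbounded weights can destabilize training.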

Related Concepts