Short Definition
Self-paced learning is a training strategy in which the model adaptively selects easier examples first and gradually incorporates harder ones.
Definition
Self-paced learning is a curriculum-based training approach in which the model itself determines the order in which training examples are introduced, based on its current competence. Examples that incur lower loss or higher confidence are prioritized early, while more difficult examples are incorporated as the model improves.
The curriculum is model-driven, not predefined.
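In the classic formulation, the curriculum emerges from a joint objective over the model weights w and binary selection variables v_i, with a pace parameter λ (symbols follow the standard presentation, not this document):

```latex
\min_{w,\; v \in \{0,1\}^n} \;\sum_{i=1}^{n} v_i \, L\bigl(y_i, f(x_i; w)\bigr) \;-\; \lambda \sum_{i=1}^{n} v_i
```

For fixed w, the optimum sets v_i = 1 exactly when the sample's loss is below λ, so only sufficiently easy examples are selected; raising λ over training admits harder ones.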
Why It Matters
Manually defining difficulty can be brittle or domain-specific. Self-paced learning removes the need for explicit difficulty heuristics by letting the model’s learning state guide data exposure. This can stabilize early training and reduce sensitivity to noisy or hard examples.
Self-paced learning adapts the curriculum dynamically.
How Self-Paced Learning Works
A typical self-paced learning loop:
- Evaluate the model on all training samples
- Measure per-sample loss or confidence
- Select samples below a difficulty threshold
- Train on the selected subset
- Raise the threshold as training progresses so harder examples are gradually included
The selection criterion evolves with the model.
Minimal Conceptual Example
```
# conceptual self-paced selection
selected = samples[loss(samples) < threshold]
train(model, selected)
threshold = update(threshold)
```
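Expanding the sketch above into a runnable toy example with NumPy (the linear model, threshold values, and schedule are illustrative assumptions, not prescribed by any particular method):

```python
import numpy as np

rng = np.random.default_rng(0)

# toy regression data: y = 2x + noise, with a few corrupted labels
X = rng.normal(size=(100, 1))
y = 2.0 * X[:, 0] + rng.normal(scale=0.1, size=100)
y[:5] += 10.0  # inject label noise

w = np.zeros(1)   # weight of a linear model y ≈ w * x
threshold = 0.5   # initial difficulty threshold (assumed value)
lr = 0.2          # learning rate (assumed value)

for epoch in range(30):
    losses = (X @ w - y) ** 2      # per-sample squared loss
    mask = losses < threshold      # self-paced selection: keep easy samples
    if mask.any():
        residual = X[mask] @ w - y[mask]
        grad = 2 * X[mask].T @ residual / mask.sum()
        w -= lr * grad             # gradient step on the selected subset
    # admit harder samples over time; the cap keeps the corrupted
    # labels (loss ≈ 100) excluded throughout (illustrative choice)
    threshold = min(threshold * 1.5, 5.0)

print(f"learned weight: {w[0]:.2f}")  # close to the true slope 2.0
```

Because the corrupted labels incur large losses, the selection mask never admits them under the capped schedule, which is the noise-robustness effect described above.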
Self-Paced Learning vs Curriculum Learning
- Curriculum learning: difficulty defined externally
- Self-paced learning: difficulty inferred from the model
Self-paced learning is a form of adaptive curriculum learning.
Self-Paced Learning vs Hard Example Mining
- Self-paced learning: emphasizes easy examples first
- Hard example mining: emphasizes hard examples
They represent opposite sampling pressures.
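The opposite sampling pressures can be made concrete with loss-based masks (the loss values here are made up for illustration):

```python
import numpy as np

losses = np.array([0.1, 0.4, 0.9, 2.5, 5.0])  # per-sample losses (made up)
threshold = 1.0

easy_first = losses < threshold    # self-paced learning: select easy samples
hard_first = losses >= threshold   # hard example mining: select hard samples

print(easy_first.tolist())  # [True, True, True, False, False]
print(hard_first.tolist())  # [False, False, False, True, True]
```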
Benefits
Potential benefits include:
- improved training stability
- reduced impact of noisy labels
- faster early convergence
- automatic difficulty estimation
- reduced need for domain heuristics
Benefits depend on loss behavior and noise structure.
Risks and Limitations
Self-paced learning can:
- reinforce early model biases
- delay exposure to rare but important cases
- under-train on complex decision boundaries
- fail if loss does not reflect true difficulty
Adaptive does not mean unbiased.
Relationship to Optimization
Self-paced learning can reduce gradient variance early in training by excluding high-loss samples. This may smooth optimization but can slow later-stage learning if thresholds are poorly managed.
Threshold schedules are critical.
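One simple schedule, sketched below, grows the threshold multiplicatively each epoch with an optional cap; the function name, growth factor, and cap are illustrative choices, not a standard API:

```python
def update_threshold(threshold, growth=1.5, cap=None):
    """Multiplicative self-paced threshold schedule (illustrative)."""
    new = threshold * growth
    return min(new, cap) if cap is not None else new

# example: threshold over five epochs, starting at 0.5
t = 0.5
schedule = []
for _ in range(5):
    t = update_threshold(t, growth=2.0, cap=4.0)
    schedule.append(t)

print(schedule)  # [1.0, 2.0, 4.0, 4.0, 4.0]
```

The cap bounds how hard an admitted sample may be; without it, the schedule eventually admits every sample, including mislabeled ones.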
Relationship to Generalization
Self-paced learning may improve robustness to label noise but does not guarantee better generalization. Evaluation must still be conducted on unbiased, representative test sets.
Common Pitfalls
- using loss-based difficulty with miscalibrated models
- freezing difficulty thresholds too early
- combining with aggressive regularization
- evaluating on self-selected data
- omitting selection rules in reporting
Transparency is essential.