Exposure Bias (Deep Dive)

Short Definition

Exposure bias is the mismatch between training and inference conditions in sequence models, caused by using ground-truth inputs during training but model-generated outputs during inference.

Definition

Exposure bias arises in autoregressive sequence models when teacher forcing is used during training. At training time, the model conditions on the true previous token; at inference time, it conditions on its own predicted token. This discrepancy causes the model to be unprepared for its own mistakes, leading to error accumulation over long sequences.

The model never learns to recover from its own mistakes.

Why It Matters

Exposure bias is a fundamental issue in:

  • language modeling
  • machine translation
  • text generation
  • dialogue systems
  • time-series forecasting

Small early errors can compound, degrading output quality over time.

Autoregressive systems amplify minor mistakes.

Core Mechanism

During Training (Teacher Forcing)

Input_t = GroundTruth_{t-1}

During Inference

Input_t = ModelPrediction_{t-1}

This creates a distribution mismatch:

P_train(context) ≠ P_inference(context)

Training and deployment operate under different data distributions.
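The two regimes above can be sketched in a few lines. `model` here is a hypothetical stand-in: any function mapping a previous token to a predicted next token (a toy lookup table below, a neural network in practice).

```python
def train_step(model, true_seq):
    """Teacher forcing: the input at step t is the ground-truth token t-1."""
    inputs = true_seq[:-1]          # ground-truth prefixes
    targets = true_seq[1:]
    return [(model(x), y) for x, y in zip(inputs, targets)]

def generate(model, start_token, length):
    """Inference: the input at step t is the model's own prediction at t-1."""
    seq = [start_token]
    for _ in range(length):
        seq.append(model(seq[-1]))  # feeds its own output back in
    return seq

# Toy "model": a deterministic next-token lookup.
toy = {"A": "B", "B": "C", "C": "D", "D": "A"}
model = toy.get
print(generate(model, "A", 3))  # ['A', 'B', 'C', 'D']
```

Note that `train_step` only ever sees ground-truth contexts, while `generate` only ever sees model-produced contexts: the mismatch is structural, not a bug in either function.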

Minimal Conceptual Illustration

Training:
A → B → C → D (true sequence)
Inference:
A → B̂ → Ĉ → ?

One incorrect token alters the entire future trajectory.

Error Accumulation

If probability of a correct token at each step is p:

  • Probability of perfect sequence of length T = p^T

Even high per-token accuracy can produce unstable long outputs.

Sequence errors compound exponentially.
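A quick calculation makes the exponential decay concrete (assuming, for illustration, that per-step correctness is independent with probability p):

```python
# Probability of an entirely correct length-T rollout under independent
# per-step correctness p is p ** T.
for p in (0.99, 0.95, 0.90):
    for T in (10, 100, 1000):
        print(f"p={p:.2f}, T={T:4d}: {p ** T:.4f}")
```

Even at 99% per-token accuracy, a 100-token rollout is fully correct only about 37% of the time.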

Relationship to Teacher Forcing

Teacher forcing:

  • accelerates training
  • stabilizes gradients
  • but creates exposure bias

It solves one problem while introducing another.

Mitigation Strategies

1. Scheduled Sampling

Gradually replace ground-truth inputs with model predictions during training.
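A minimal sketch of the sampling decision and a decay schedule. The inverse-sigmoid decay is one of the schedules proposed in the original scheduled sampling work; the constant `k` is a tunable hyperparameter.

```python
import math
import random

def scheduled_input(true_prev, model_prev, epsilon):
    """With probability epsilon feed the ground-truth previous token
    (teacher forcing); otherwise feed the model's own prediction."""
    return true_prev if random.random() < epsilon else model_prev

def inverse_sigmoid_decay(step, k=1000.0):
    """Epsilon starts near 1 (mostly teacher forcing) and decays toward 0
    (mostly free-running) as training progresses."""
    return k / (k + math.exp(step / k))
```

Early in training `epsilon` is close to 1, so the model still gets stable teacher-forced gradients; later it is increasingly exposed to its own predictions.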

2. Professor Forcing

Align training and inference hidden state dynamics.

3. Reinforcement Learning Fine-Tuning

Optimize sequence-level objectives rather than token-level likelihood.

4. Data Augmentation

Train on partially corrupted sequences.
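One simple corruption scheme is token-level noising: randomly replacing input tokens so the model learns to condition on imperfect contexts. This is a sketch, with `noise_prob` as an assumed hyperparameter; real pipelines may also drop or shuffle tokens.

```python
import random

def corrupt(seq, vocab, noise_prob=0.1):
    """Replace each token with a random vocabulary token with
    probability noise_prob, leaving it unchanged otherwise."""
    return [random.choice(vocab) if random.random() < noise_prob else tok
            for tok in seq]
```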

The goal: train the model under realistic rollout conditions.

Exposure Bias vs Label Bias

Concept     Exposure Bias                Label Bias
Occurs in   Autoregressive training      Structured prediction
Cause       Training–inference mismatch  Local normalization
Effect      Error compounding            Biased transitions

Different mechanisms, similar instability.

Impact on Evaluation

Models evaluated under teacher forcing:

  • may appear strong
  • hide inference-time instability
  • misrepresent real-world performance

Always evaluate with full autoregressive rollout.
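The gap between the two evaluation modes can be demonstrated with a toy next-token lookup (hypothetical) that makes one systematic mistake:

```python
true_seq = list("ABCDE")
toy = {"A": "B", "B": "X", "X": "X", "C": "D", "D": "E"}  # errs after "B"
model = toy.get

# Teacher-forced accuracy: every input is the ground-truth previous token.
tf_preds = [model(t) for t in true_seq[:-1]]
tf_acc = sum(p == y for p, y in zip(tf_preds, true_seq[1:])) / len(tf_preds)

# Free-running accuracy: every input is the model's previous prediction.
seq = [true_seq[0]]
for _ in range(len(true_seq) - 1):
    seq.append(model(seq[-1]))
fr_acc = sum(p == y for p, y in zip(seq[1:], true_seq[1:])) / len(tf_preds)

print(tf_acc, fr_acc)  # 0.75 0.25
```

Under teacher forcing the single error stays local (75% accuracy); under free-running rollout it derails every subsequent step (25%).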

Practical Warning Signs

  • High training accuracy but poor generation quality
  • Increasing incoherence in long outputs
  • Sensitivity to initial tokens
  • Drifting or repetitive sequences

Instability often reveals exposure bias.

Modern Perspective

Large language models still face exposure bias, but:

  • scale improves robustness
  • attention reduces compounding effects
  • fine-tuning with RLHF mitigates instability

Scale helps but does not eliminate the issue.

Common Pitfalls

  • evaluating only with teacher forcing
  • ignoring long-sequence performance
  • assuming exposure bias disappears in Transformers
  • conflating exposure bias with overfitting

Distribution mismatch is subtle.

Summary Characteristics

Aspect             Exposure Bias
Root cause         Training–inference mismatch
Primary domain     Autoregressive models
Symptom            Error compounding
Common mitigation  Scheduled sampling
Detection          Autoregressive evaluation

Related Concepts