Short Definition
Variational Autoencoders (VAEs) are generative neural networks that learn a probabilistic latent representation of data, letting the model generate new samples by drawing from the learned latent distribution.
They combine neural networks with variational inference to learn structured latent spaces suitable for generative modeling.
Definition
A Variational Autoencoder extends the standard autoencoder by learning a probability distribution over latent variables instead of a single deterministic representation.
The encoder maps an input (x) to the parameters of a latent distribution:
[
(\mu, \sigma) = f_\theta(x)
]
Instead of producing a fixed latent vector, the model samples:
[
z \sim \mathcal{N}(\mu, \sigma^2)
]
The decoder then reconstructs the input:
[
\hat{x} = g_\phi(z)
]
Training optimizes the Evidence Lower Bound (ELBO):
[
\mathcal{L} = \mathbb{E}_{q(z|x)}[\log p(x|z)] - D_{KL}(q(z|x) \,||\, p(z))
]
Where:
- (q(z|x)) = encoder distribution
- (p(x|z)) = decoder likelihood
- (D_{KL}) = Kullback–Leibler divergence
- (p(z)) = prior latent distribution (usually Gaussian)
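The ELBO above can be sketched numerically. The snippet below is an illustrative stand-in, not any library's API: it approximates the reconstruction term with a mean squared error (a common surrogate for the Gaussian log-likelihood) and uses the closed-form KL divergence between a diagonal Gaussian and a standard normal prior, averaged over dimensions.

```python
import numpy as np

def elbo_loss(x, x_hat, mu, log_var):
    """Negative ELBO sketch: MSE reconstruction + closed-form KL.

    Assumes a diagonal Gaussian encoder q(z|x) = N(mu, sigma^2)
    and a standard normal prior p(z) = N(0, 1).
    """
    # Reconstruction term: E_q[log p(x|z)] approximated by -MSE
    recon = np.mean((x - x_hat) ** 2)
    # Closed-form KL divergence between N(mu, sigma^2) and N(0, 1)
    kl = -0.5 * np.mean(1 + log_var - mu ** 2 - np.exp(log_var))
    return recon + kl

x = np.array([0.5, 0.2])
x_hat = np.array([0.4, 0.3])
mu = np.zeros(2)
log_var = np.zeros(2)        # sigma = 1, so q already matches the prior
loss = elbo_loss(x, x_hat, mu, log_var)  # KL term is 0 when q = p
```

When `mu = 0` and `log_var = 0`, the KL term vanishes and the loss reduces to the reconstruction error alone.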
Core Idea
Instead of compressing inputs into fixed latent vectors, VAEs learn a continuous latent distribution.
Conceptually:
Input Data
↓
Encoder
↓
Latent Distribution (μ, σ)
↓
Sample z
↓
Decoder
↓
Generated / Reconstructed Output
This allows the model to generate new data by sampling from the latent space.
Minimal Conceptual Illustration
Example with images:
Image
↓
Encoder
↓
Latent distribution (μ, σ)
↓
Sample latent vector z
↓
Decoder
↓
Reconstructed image
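The pipeline above can be sketched in a few lines. The encoder and decoder here are hypothetical toy linear maps standing in for the trained neural networks of a real VAE; the point is only the shape of the data flow, encode → sample → decode.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(x):
    """Toy encoder: return (mu, log_var) of the latent Gaussian."""
    mu = 0.5 * x
    log_var = np.zeros_like(x)   # fixed unit variance for simplicity
    return mu, log_var

def decode(z):
    """Toy decoder: map a latent sample back to data space."""
    return 2.0 * z

x = np.array([1.0, -2.0, 0.5])       # a flattened "image"
mu, log_var = encode(x)
eps = rng.standard_normal(mu.shape)
z = mu + np.exp(0.5 * log_var) * eps  # sample z ~ N(mu, sigma^2)
x_hat = decode(z)                     # reconstruction
```

Because the toy decoder inverts the toy encoder's mean, decoding `mu` directly recovers `x` exactly; the sampled `z` adds noise around it.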
New images can be created by sampling new latent vectors:
z ~ N(0,1)
↓
Decoder
↓
Generated image
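Generation follows the same idea: sample a latent vector from the prior and decode it. The decoder below is a hypothetical fixed linear map from a 2-D latent space to a 4-D data space, standing in for a trained network.

```python
import numpy as np

rng = np.random.default_rng(1)

def decode(z):
    """Hypothetical decoder: fixed linear map, 2-D latent -> 4-D data."""
    W = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0],
                  [1.0, -1.0]])
    return W @ z

# Generate: sample from the standard normal prior, then decode
z = rng.standard_normal(2)
sample = decode(z)
```

No input image is involved at all: every distinct `z` drawn from the prior decodes to a distinct generated output.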
Why the KL Divergence Term Exists
The KL divergence forces the latent distribution to remain close to a standard normal prior:
[
p(z) = \mathcal{N}(0,1)
]
This ensures the latent space is:
- continuous
- smooth
- sampleable
Without this constraint, the encoder could place latent codes arbitrarily far apart, leaving gaps in the latent space that decode to meaningless outputs, and the model would behave like a regular autoencoder.
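For a one-dimensional Gaussian encoder and a standard normal prior, this KL term has a well-known closed form (stated here for reference; it is applied independently to each latent dimension):
[
D_{KL}(\mathcal{N}(\mu, \sigma^2) \,||\, \mathcal{N}(0, 1)) = \frac{1}{2}\left(\mu^2 + \sigma^2 - \ln \sigma^2 - 1\right)
]
The expression is zero exactly when \mu = 0 and \sigma = 1, i.e. when the encoder distribution matches the prior.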
Reparameterization Trick
Sampling from a distribution is not a differentiable operation, so it normally breaks gradient-based learning.
VAEs solve this using the reparameterization trick:
[
z = \mu + \sigma \cdot \epsilon
]
where:
[
\epsilon \sim \mathcal{N}(0,1)
]
This makes the sampling operation differentiable.
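The trick is straightforward to express in code. This is an illustrative sketch rather than framework code: the randomness lives entirely in `eps`, so gradients can flow through `mu` and `sigma` in an autodiff framework.

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1).

    mu and log_var are differentiable network outputs; only eps
    is random, so the sampling step no longer blocks gradients.
    """
    eps = rng.standard_normal(mu.shape)
    sigma = np.exp(0.5 * log_var)
    return mu + sigma * eps

mu = np.array([1.0, -1.0])
log_var = np.zeros(2)        # sigma = 1
z = reparameterize(mu, log_var, rng)
```

As `sigma` approaches zero the sample collapses onto `mu`, recovering the deterministic encoding of a standard autoencoder.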
Latent Space Properties
One of the major strengths of VAEs is the structure of the latent space.
Good VAE latent spaces allow:
- smooth interpolation
- controllable generation
- semantic structure
Example interpolation:
Latent vector A (dog image)
↓
interpolate
↓
Latent vector B (cat image)
Decoding each intermediate latent vector produces a smooth transition between the two outputs.
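Interpolation itself is just a line in latent space. A minimal sketch, assuming two latent vectors have already been obtained from the encoder (the decoder, not shown, would be applied to each intermediate point):

```python
import numpy as np

def interpolate(z_a, z_b, steps=5):
    """Linear interpolation between two latent vectors.

    Returns a list of latent points from z_a to z_b inclusive;
    decoding each point yields a smooth visual transition.
    """
    ts = np.linspace(0.0, 1.0, steps)
    return [(1 - t) * z_a + t * z_b for t in ts]

z_dog = np.array([0.0, 0.0])   # hypothetical latent code A
z_cat = np.array([1.0, 2.0])   # hypothetical latent code B
path = interpolate(z_dog, z_cat, steps=3)
```

This works precisely because the KL regularization keeps the latent space continuous: points between two valid codes also decode to plausible outputs.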
Applications
Variational Autoencoders are used in many areas of machine learning.
Generative Modeling
Generate new images, text, or data samples.
Representation Learning
Learn compact latent spaces useful for downstream tasks.
Anomaly Detection
Unusual samples often reconstruct poorly.
Data Imputation
VAEs can estimate missing values in datasets.
Molecular Design
Used to generate new chemical structures.
VAE vs Standard Autoencoder
| Property | Autoencoder | VAE |
|---|---|---|
| Latent representation | Deterministic | Probabilistic |
| Generative capability | Limited | Strong |
| Latent space structure | Unconstrained | Regularized |
| Sampling ability | Difficult | Easy |
VAEs are explicitly designed for generative tasks.
Limitations
VAEs also have challenges.
Blurry Outputs
Reconstruction loss often produces blurred images.
Posterior Collapse
The decoder may learn to ignore the latent variables entirely, with the KL term driving q(z|x) to collapse onto the prior.
Limited Sample Sharpness
Generative adversarial networks (GANs) often produce sharper images.
Importance in Deep Learning
VAEs were among the first deep generative models capable of learning structured latent spaces. They influenced many later generative architectures and remain widely used in probabilistic modeling and representation learning.
Summary
Variational Autoencoders are probabilistic generative models that learn structured latent spaces by combining neural networks with variational inference. By regularizing the latent distribution and enabling efficient sampling, VAEs allow models to generate new data and learn meaningful representations.
Related Concepts
- Autoencoders
- Latent Representations
- Representation Learning
- Generative Models
- KL Divergence
- Reparameterization Trick
- Self-Supervised Learning