Stride and Padding

Short Definition

Stride controls how far a convolutional or pooling window moves across the input, while padding controls how input boundaries are handled during the operation.

Definition

Stride specifies the step size with which a kernel is applied across an input tensor, determining the degree of downsampling.
Padding specifies whether—and how—additional values are added around the input borders to control output size and boundary effects.

Stride sets resolution; padding sets coverage.

Why It Matters

Stride and padding jointly determine:

output spatial dimensions
information retention vs downsampling
effective receptive field growth
boundary bias and edge behavior
computational cost

Small choices have large architectural consequences.

Stride

What Stride Does

Stride defines how many input units the kernel skips between applications.

stride = 1: dense coverage
stride > 1: downsampling

Stride compresses space by skipping positions.

Effects of Increasing Stride

Increasing stride:

reduces output resolution
increases effective receptive field faster
lowers computation
risks losing fine-grained information

Downsampling trades detail for efficiency.

Padding

What Padding Does

Padding adds values (typically zeros) around the input border before applying the operation. It controls how kernels interact with edges.

Padding protects boundaries.

Common Padding Types

Valid padding: no padding; output shrinks
Same padding: output spatial size preserved
Zero padding: pads with zeros (most common)
Reflect / replicate padding: mirrors or repeats edge values

Padding encodes boundary assumptions.

Minimal Conceptual Illustration

Input + Padding → Sliding Kernel (Stride s) → Output

Output Size Intuition

For a 1D illustration:

Output size ≈ (Input + 2×Padding − Kernel) / Stride + 1

Stride and padding jointly define geometry.

Relationship to Receptive Fields

Larger stride increases receptive field growth per layer
Padding preserves alignment and coverage at edges
Excessive stride can create blind spots

Context grows faster with stride—but less precisely.

Stride vs Pooling

Stride and pooling both downsample, but:

stride is learnable when used in convolutions
pooling is fixed and non-learnable

Modern architectures often prefer strided convolutions.

Boundary Effects and Bias

Without padding:

edge pixels contribute less
border information is underrepresented

Padding mitigates edge bias at the cost of artificial values.

Stride and Padding in Modern Architectures

Modern CNNs and hybrids:

use stride early for efficient downsampling
apply padding to preserve alignment
rely on careful scheduling of resolution changes
combine with normalization and residual connections

Geometry is designed, not incidental.

Common Pitfalls

excessive early downsampling
ignoring boundary bias
mixing stride and pooling redundantly
assuming padding is neutral
misaligning feature maps across skip connections

Spatial design errors compound.

Summary Characteristics

Aspect	Stride	Padding
Primary role	Downsampling	Boundary handling
Learnable	Indirectly	No
Affects resolution	Yes	Indirectly
Risk	Information loss	Boundary artifacts

Related Concepts

Architecture & Representation
Convolution Operation
Pooling Layers
Receptive Fields
Feature Maps
Dilated Convolutions
Residual Networks (ResNet)
Vision Architectures