
Short Definition
A receptive field is the region of the input that influences the activation of a particular neuron or feature in a neural network.
Definition
In neural networks—especially convolutional architectures—the receptive field of a unit refers to the subset of input space (e.g., pixels, time steps) that can affect that unit’s output. Receptive fields grow as convolution, pooling, striding, and dilation compose across a network’s depth.
What a neuron can “see” defines what it can learn.
Why It Matters
Receptive fields determine the scale of patterns a network can capture. Small receptive fields capture local details; large receptive fields capture global context. Many model failures stem from mismatches between receptive field size and task requirements.
Context depends on reach.
Local vs Effective Receptive Fields
- Local receptive field: the immediate region of the input covered by a single convolution kernel
- Effective receptive field: the region of the input that meaningfully influences a unit’s output once all layers are composed
Effective receptive fields grow with depth—but not always as much as expected.
Minimal Conceptual Illustration
Input → [small RF] → [larger RF] → [global RF]
How Receptive Fields Grow
Receptive fields expand through:
- stacking convolutional layers
- pooling or strided operations
- dilation (atrous convolutions)
- increased kernel sizes
Depth compounds locality.
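This compounding can be sketched with the standard recurrence for stacked stride-1 convolutions, where each layer adds `kernel_size - 1` input positions to the receptive field. The helper name `stacked_conv_rf` is illustrative, not from any library:

```python
# Theoretical receptive field of n stacked stride-1 convolutions.
# Each layer adds (kernel_size - 1) positions, so depth compounds
# locality linearly: 3, 5, 7, ... for 3x3 kernels.

def stacked_conv_rf(num_layers, kernel_size=3):
    """Receptive field after stacking stride-1 convolutions."""
    rf = 1  # a single input position sees only itself
    for _ in range(num_layers):
        rf += kernel_size - 1
    return rf

for depth in (1, 2, 3, 8):
    print(depth, stacked_conv_rf(depth))
# e.g., eight stacked 3x3 convolutions reach a 17-position receptive field
```

Note the growth is only linear in depth, which is why pooling, stride, and dilation are used to accelerate it.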
Factors Affecting Receptive Field Size
Key architectural choices include:
- kernel size
- stride
- padding
- pooling window and stride
- dilation rate
- number of layers
Design choices accumulate.
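How these choices accumulate can be sketched with the standard receptive-field recurrence: with jump `j` (the input distance between adjacent output positions) and receptive field `r`, each layer applies `r += (k - 1) * d * j` and `j *= s`. The function `receptive_field` below is an illustrative helper, and a pooling layer is treated as a layer whose kernel is its window:

```python
# Receptive field calculator covering kernel size, stride, and dilation.
# Recurrence per layer (kernel k, stride s, dilation d):
#   r <- r + (k - 1) * d * j    (receptive field grows)
#   j <- j * s                  (jump between output positions grows)

def receptive_field(layers):
    """layers: iterable of (kernel, stride, dilation) tuples, in order."""
    rf, jump = 1, 1
    for kernel, stride, dilation in layers:
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf

# Two 3x3 convs, a 2x2 stride-2 pool, then a dilated 3x3 conv (rate 2):
net = [(3, 1, 1), (3, 1, 1), (2, 2, 1), (3, 1, 2)]
print(receptive_field(net))  # downsampling and dilation amplify later layers
```

Notice how the stride-2 pool doubles the contribution of every layer after it, which is exactly why early downsampling is such a consequential design choice.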
Receptive Fields in CNNs
CNNs rely on progressively increasing receptive fields to:
- detect edges and textures early
- combine features into motifs
- recognize objects or scenes at deeper layers
Hierarchy builds meaning.
Receptive Fields vs Global Context
While deeper layers have larger receptive fields, they may still fail to capture true global context due to:
- diminishing influence of distant inputs
- bias toward central regions
- limited effective coverage
Large does not mean global.
Effective Receptive Field Phenomenon
Empirically, effective receptive fields tend to be:
- smaller than the theoretical maximum
- Gaussian-shaped around the center
- biased toward local information
Theory overestimates influence.
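A minimal sketch of why this happens: the influence of each input position on one output unit is the repeated convolution of the layer kernels, and composing many kernels produces a bell-shaped profile concentrated at the center (a central-limit-style effect). The uniform 1D kernels here are a simplifying assumption, not learned weights:

```python
import numpy as np

# Influence of input positions on a single output unit, modeled as the
# repeated convolution of uniform width-3 kernels. The result is
# approximately Gaussian: central positions dominate, edges contribute
# almost nothing even though they lie inside the theoretical RF.

def influence_profile(num_layers, width=3):
    kernel = np.ones(width) / width
    profile = np.array([1.0])  # delta at the output unit
    for _ in range(num_layers):
        profile = np.convolve(profile, kernel)
    return profile

p = influence_profile(10)        # theoretical RF = 21 positions
print(len(p), p[len(p) // 2] / p[0])  # central influence dwarfs the edge
```

The edge positions are technically inside the receptive field, yet their influence is orders of magnitude below the center—the gap between theoretical and effective receptive fields.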
Relationship to Pooling and Stride
Pooling and stride accelerate receptive field growth but introduce information loss. Excessive downsampling can harm tasks requiring fine localization.
Speed trades off precision.
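The precision cost can be shown in a few lines: a stride-2 max pool maps inputs that differ only by a one-position shift within a pooling window to the same output, so exact location is lost. The helper `max_pool_1d` is a toy sketch, not a library function:

```python
# Downsampling discards position information: two activations at
# different positions inside the same pooling window become
# indistinguishable after a stride-2 max pool.

def max_pool_1d(x, window=2, stride=2):
    return [max(x[i:i + window]) for i in range(0, len(x) - window + 1, stride)]

a = [1, 0, 0, 0]  # activation at position 0
b = [0, 1, 0, 0]  # activation at position 1
print(max_pool_1d(a), max_pool_1d(b))  # identical outputs: [1, 0] and [1, 0]
```

This is why dense-prediction tasks such as segmentation often replace aggressive pooling with dilated convolutions, which enlarge the receptive field without reducing resolution.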
Receptive Fields and Task Alignment
Different tasks require different receptive field scales:
- edge detection: small
- object recognition: medium
- scene understanding: large
- long-range dependency tasks: global
Architecture must match task.
Receptive Fields vs Attention
Attention mechanisms explicitly model global interactions, complementing or replacing large receptive fields achieved through depth alone.
Attention bypasses locality constraints.
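A toy single-head self-attention sketch makes this concrete: every output position is a weighted mixture of all input positions, so the "receptive field" is global at depth one. The random projections and shapes here are illustrative assumptions, not a full transformer layer:

```python
import numpy as np

# Minimal self-attention: softmax attention weights are strictly
# positive, so every output position receives nonzero influence from
# every input position in a single layer.

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(x, wq, wk, wv):
    q, k, v = x @ wq, x @ wk, x @ wv
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return weights @ v, weights

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4))                    # 6 positions, 4 features
wq, wk, wv = (rng.normal(size=(4, 4)) for _ in range(3))
out, weights = self_attention(x, wq, wk, wv)
print((weights > 0).all())                     # every output attends to every input
```

Contrast this with a convolution, which would need enough depth for its receptive field to span all six positions before any such interaction is possible.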
Common Pitfalls
- assuming deeper always means better context
- excessive downsampling early in the network
- ignoring effective vs theoretical receptive fields
- mismatching receptive field size to task scale
- relying on CNNs where global reasoning dominates
Seeing is not understanding.
Summary Characteristics
| Aspect | Receptive Fields |
|---|---|
| Defines | Input influence region |
| Grows with | Depth and downsampling |
| Limitation | Often smaller than expected |
| Task sensitivity | High |
| Design importance | Critical |
Related Concepts
- Architecture & Representation
- Convolution Operation
- Pooling Layers
- Convolutional Neural Network (CNN)
- Feature Maps
- Stride and Padding
- Dilated Convolutions
- Attention Mechanisms