
Short Definition
A receptive field is the region of the input that influences the activation of a particular neuron or feature in a neural network.
Definition
In neural networks—especially convolutional architectures—the receptive field of a unit refers to the subset of input space (e.g., pixels, time steps) that can affect that unit’s output. Receptive fields grow as convolution, pooling, striding, and dilation compose across a network’s depth.
What a neuron can “see” defines what it can learn.
Why It Matters
Receptive fields determine the scale of patterns a network can capture. Small receptive fields capture local details; large receptive fields capture global context. Many model failures stem from mismatches between receptive field size and task requirements.
Context depends on reach.
Local vs Effective Receptive Fields
- Local receptive field: the immediate region of the input covered by a single convolution kernel
- Effective receptive field: the region of the input that meaningfully influences a unit’s output once all layers are composed
Effective receptive fields grow with depth—but not always as much as expected.
Minimal Conceptual Illustration
Input → [small RF] → [larger RF] → [global RF]
How Receptive Fields Grow
Receptive fields expand through:
- stacking convolutional layers
- pooling or strided operations
- dilation (atrous convolutions)
- increased kernel sizes
Depth compounds locality.
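This compounding can be sketched with the standard recurrence for stacked stride-1 convolutions, where each layer adds `kernel_size - 1` input positions to the receptive field. The helper name `stacked_conv_rf` is illustrative, not from any library:

```python
# Theoretical receptive field of n stacked stride-1 convolutions.
# Each layer adds (kernel_size - 1) positions, so depth compounds
# locality linearly: 3, 5, 7, ... for 3x3 kernels.

def stacked_conv_rf(num_layers, kernel_size=3):
    """Receptive field after stacking stride-1 convolutions."""
    rf = 1  # a single input position sees only itself
    for _ in range(num_layers):
        rf += kernel_size - 1
    return rf

for depth in (1, 2, 3, 8):
    print(depth, stacked_conv_rf(depth))
# e.g., eight stacked 3x3 convolutions reach a 17-position receptive field
```

Note the growth is only linear in depth, which is why pooling, stride, and dilation are used to accelerate it.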
Factors Affecting Receptive Field Size
Key architectural choices include:
- kernel size
- stride
- padding
- pooling window and stride
- dilation rate
- number of layers
Design choices accumulate.
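How these choices accumulate can be sketched with the standard receptive-field recurrence: with jump `j` (the input distance between adjacent output positions) and receptive field `r`, each layer applies `r += (k - 1) * d * j` and `j *= s`. The function `receptive_field` below is an illustrative helper, and a pooling layer is treated as a layer whose kernel is its window:

```python
# Receptive field calculator covering kernel size, stride, and dilation.
# Recurrence per layer (kernel k, stride s, dilation d):
#   r <- r + (k - 1) * d * j    (receptive field grows)
#   j <- j * s                  (jump between output positions grows)

def receptive_field(layers):
    """layers: iterable of (kernel, stride, dilation) tuples, in order."""
    rf, jump = 1, 1
    for kernel, stride, dilation in layers:
        rf += (kernel - 1) * dilation * jump
        jump *= stride
    return rf

# Two 3x3 convs, a 2x2 stride-2 pool, then a dilated 3x3 conv (rate 2):
net = [(3, 1, 1), (3, 1, 1), (2, 2, 1), (3, 1, 2)]
print(receptive_field(net))  # downsampling and dilation amplify later layers
```

Notice how the stride-2 pool doubles the contribution of every layer after it, which is exactly why early downsampling is such a consequential design choice.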
Receptive Fields in CNNs
CNNs rely on progressively increasing receptive fields to:
- detect edges and textures early
- combine features into motifs
- recognize objects or scenes at deeper layers
Hierarchy builds meaning.
Receptive Fields vs Global Context
While deeper layers have larger receptive fields, they may still fail to capture true global context due to:
- diminishing influence of distant inputs
- bias toward central regions
- limited effective coverage
Large does not mean global.
Effective Receptive Field Phenomenon
Empirically, effective receptive fields tend to be:
- smaller than the theoretical maximum
- Gaussian-shaped around the center
- biased toward local information
Theory overestimates influence.
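A minimal sketch of why this happens: the influence of each input position on one output unit is the repeated convolution of the layer kernels, and composing many kernels produces a bell-shaped profile concentrated at the center (a central-limit-style effect). The uniform 1D kernels here are a simplifying assumption, not learned weights:

```python
import numpy as np

# Influence of input positions on a single output unit, modeled as the
# repeated convolution of uniform width-3 kernels. The result is
# approximately Gaussian: central positions dominate, edges contribute
# almost nothing even though they lie inside the theoretical RF.

def influence_profile(num_layers, width=3):
    kernel = np.ones(width) / width
    profile = np.array([1.0])  # delta at the output unit
    for _ in range(num_layers):
        profile = np.convolve(profile, kernel)
    return profile

p = influence_profile(10)        # theoretical RF = 21 positions
print(len(p), p[len(p) // 2] / p[0])  # central influence dwarfs the edge
```

The edge positions are technically inside the receptive field, yet their influence is orders of magnitude below the center—the gap between theoretical and effective receptive fields.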
Relationship to Pooling and Stride
Pooling and stride accelerate receptive field growth but introduce information loss. Excessive downsampling can harm tasks requiring fine localization.
Speed trades off precision.
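The precision cost can be shown in a few lines: a stride-2 max pool maps inputs that differ only by a one-position shift within a pooling window to the same output, so exact location is lost. The helper `max_pool_1d` is a toy sketch, not a library function:

```python
# Downsampling discards position information: two activations at
# different positions inside the same pooling window become
# indistinguishable after a stride-2 max pool.

def max_pool_1d(x, window=2, stride=2):
    return [max(x[i:i + window]) for i in range(0, len(x) - window + 1, stride)]

a = [1, 0, 0, 0]  # activation at position 0
b = [0, 1, 0, 0]  # activation at position 1
print(max_pool_1d(a), max_pool_1d(b))  # identical outputs: [1, 0] and [1, 0]
```

This is why dense-prediction tasks such as segmentation often replace aggressive pooling with dilated convolutions, which enlarge the receptive field without reducing resolution.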
Receptive Fields and Task Alignment
Different tasks require different receptive field scales:
- edge detection: small
- object recognition: medium
- scene understanding: large
- long-range dependency tasks: global
Architecture must match task.
Receptive Fields vs Attention
Attention mechanisms explicitly model global interactions, complementing or replacing large receptive fields achieved through depth alone.
Attention bypasses locality constraints.
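A toy single-head self-attention sketch makes this concrete: every output position is a weighted mixture of all input positions, so the "receptive field" is global at depth one. The random projections and shapes here are illustrative assumptions, not a full transformer layer:

```python
import numpy as np

# Minimal self-attention: softmax attention weights are strictly
# positive, so every output position receives nonzero influence from
# every input position in a single layer.

def softmax(z):
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def self_attention(x, wq, wk, wv):
    q, k, v = x @ wq, x @ wk, x @ wv
    weights = softmax(q @ k.T / np.sqrt(q.shape[-1]))
    return weights @ v, weights

rng = np.random.default_rng(0)
x = rng.normal(size=(6, 4))                    # 6 positions, 4 features
wq, wk, wv = (rng.normal(size=(4, 4)) for _ in range(3))
out, weights = self_attention(x, wq, wk, wv)
print((weights > 0).all())                     # every output attends to every input
```

Contrast this with a convolution, which would need enough depth for its receptive field to span all six positions before any such interaction is possible.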
Common Pitfalls
- assuming deeper always means better context
- excessive downsampling early in the network
- ignoring effective vs theoretical receptive fields
- mismatching receptive field size to task scale
- relying on CNNs where global reasoning dominates
Seeing is not understanding.
Summary Characteristics
| Aspect | Receptive Fields |
|---|---|
| Defines | Input influence region |
| Grows with | Depth and downsampling |
| Limitation | Often smaller than expected |
| Task sensitivity | High |
| Design importance | Critical |
Related Concepts
- Architecture & Representation
- Convolution Operation
- Pooling Layers
- Convolutional Neural Network (CNN)
- Feature Maps
- Stride and Padding
- Dilated Convolutions
- Attention Mechanisms