Short Definition
An activation function introduces non-linearity into a neural network.
Definition
Activation functions transform the raw output of a neuron (the weighted sum of its inputs plus a bias) into a form that lets neural networks model complex patterns. Without them, stacked layers would collapse into a single linear transformation.
By applying non-linear transformations, activation functions allow networks to approximate complex functions and non-linear decision boundaries.
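The "collapse into a single linear transformation" claim can be checked directly. A minimal sketch with scalar weights (chosen here purely for illustration): two stacked linear layers with no activation are exactly equivalent to one linear layer.

```python
# Two stacked linear "layers" with no activation, using scalar
# weights and biases for clarity.
w1, b1 = 2.0, 1.0
w2, b2 = 3.0, -0.5

def two_linear_layers(x):
    return w2 * (w1 * x + b1) + b2

# They collapse into a single linear map: w = w2*w1, b = w2*b1 + b2.
w, b = w2 * w1, w2 * b1 + b2

def one_linear_layer(x):
    return w * x + b

# The two functions agree everywhere, so depth added nothing.
for x in [-1.0, 0.0, 2.5]:
    assert two_linear_layers(x) == one_linear_layer(x)
```

The same algebra holds for matrices: composing linear maps always yields another linear map, which is why a non-linearity between layers is essential.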
Why It Matters
Without activation functions, deep neural networks cannot learn complex relationships, no matter how many layers they have.
How It Works (Conceptually)
- Applied after the weighted sum and bias
- Shapes the output signal
- Controls gradient flow during training
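The three steps above can be sketched as a single neuron; the `relu` activation and the specific weights here are illustrative assumptions, not part of the original text.

```python
def relu(z):
    # A common activation: passes positive signals, zeroes out negatives.
    return max(0.0, z)

def neuron(inputs, weights, bias, activation):
    # 1) Weighted sum of inputs plus bias.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    # 2) Activation shapes the output signal (and, through its
    #    derivative, controls gradient flow during training).
    return activation(z)

out = neuron([0.5, -1.0], [0.8, 0.3], 0.1, relu)  # ≈ 0.2
```

Swapping `relu` for another function (sigmoid, tanh, etc.) changes only step 2, which is why activations are treated as interchangeable components.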
Minimal Python Example
```python
import math

def sigmoid(x):
    # Squashes any real input into the range (0, 1).
    return 1 / (1 + math.exp(-x))
```
Common Pitfalls
- Using sigmoid everywhere
- Ignoring vanishing gradients
- Forgetting activation in hidden layers
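The vanishing-gradient pitfall can be seen numerically. A small sketch (the derivative formula is the standard identity for sigmoid): the gradient peaks at 0.25 and shrinks toward zero for large |x|, so deep stacks of sigmoids can starve early layers of gradient.

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

def sigmoid_grad(x):
    # Standard identity: sigmoid'(x) = sigmoid(x) * (1 - sigmoid(x)).
    s = sigmoid(x)
    return s * (1 - s)

print(sigmoid_grad(0.0))   # 0.25, the maximum possible value
print(sigmoid_grad(10.0))  # tiny: gradients this small vanish when
                           # multiplied across many layers
```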
Related Concepts
- Neuron
- Forward Pass
- Vanishing Gradients