Short Definition
The F1 score balances precision and recall into a single metric.
Definition
The F1 score is the harmonic mean of precision and recall. It provides a single value that reflects both the correctness of positive predictions and the model’s ability to find all positive cases.
The F1 score is commonly used when precision and recall are both important and when class imbalance is present.
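As a concrete sketch, the three quantities can be computed directly from confusion-matrix counts. The counts below are hypothetical, chosen only to illustrate the arithmetic:

```python
# Hypothetical counts from a binary classifier's confusion matrix.
tp, fp, fn = 8, 2, 4  # true positives, false positives, false negatives

precision = tp / (tp + fp)  # correctness of positive predictions: 8/10
recall = tp / (tp + fn)     # coverage of actual positives: 8/12

# Harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(round(precision, 3), round(recall, 3), round(f1, 3))
```

Note that F1 depends only on these three counts; true negatives never enter the formula, which is one reason F1 behaves differently from accuracy.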
Why It Matters
Accuracy can be misleading on imbalanced datasets. The F1 score offers a more informative measure by penalizing extreme imbalances between precision and recall.
A high F1 score indicates that a model achieves a good balance between false positives and false negatives.
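A small illustration of the accuracy pitfall, using a hypothetical test set with 1% positives and a degenerate model that predicts the negative class for every sample:

```python
# Hypothetical imbalanced test set: 990 negatives, 10 positives.
y_true = [0] * 990 + [1] * 10
# Degenerate model: predicts "negative" for everything.
y_pred = [0] * 1000

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))

# Define precision/recall/F1 as 0.0 when their denominators are zero.
precision = tp / (tp + fp) if tp + fp else 0.0
recall = tp / (tp + fn) if tp + fn else 0.0
f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

print(accuracy, f1)  # accuracy looks excellent, F1 exposes the failure
```

Accuracy is 0.99 even though the model finds no positives at all; F1 is 0.0.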
How It Works (Conceptually)
- Precision measures the fraction of positive predictions that are correct
- Recall measures the fraction of actual positives that are found
- The harmonic mean penalizes extreme values: if either metric is near zero, F1 is near zero
- A high F1 therefore requires both metrics to be reasonably high
The F1 score discourages models that optimize only one metric.
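The penalty from the harmonic mean can be seen by comparing it with the arithmetic mean for a hypothetical lopsided model:

```python
precision, recall = 0.95, 0.10  # hypothetical lopsided model

# Arithmetic mean rewards one inflated metric; harmonic mean does not.
arithmetic = (precision + recall) / 2
harmonic = 2 * precision * recall / (precision + recall)

print(round(arithmetic, 3), round(harmonic, 3))
```

The arithmetic mean (0.525) looks moderate, while the harmonic mean (about 0.181) stays close to the weaker metric, which is exactly the behavior that discourages optimizing only one of the two.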
Mathematical Definition
F1 = 2 × (Precision × Recall) / (Precision + Recall)
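Plugging in illustrative values (precision 0.75, recall 0.60, chosen only as an example):

```python
precision, recall = 0.75, 0.60  # illustrative values

# F1 = 2 * (0.75 * 0.60) / (0.75 + 0.60) = 0.9 / 1.35
f1 = 2 * (precision * recall) / (precision + recall)

print(round(f1, 3))  # 0.667
```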
Minimal Python Example
# precision and recall are floats in [0, 1], computed elsewhere.
# Guard against division by zero when both are 0.
denominator = precision + recall
f1 = 2 * (precision * recall) / denominator if denominator > 0 else 0.0
Common Pitfalls
- Using F1 without understanding precision and recall
- Assuming F1 is universally optimal
- Ignoring task-specific cost asymmetry
- Comparing F1 scores across different datasets
Related Concepts
- Precision
- Recall
- Evaluation Metrics
- Confusion Matrix
- Class Imbalance
- Decision Thresholding