Precision–Recall Curve

Short Definition

A Precision–Recall (PR) curve shows the trade-off between precision and recall across decision thresholds.

Definition

The Precision–Recall curve plots precision (the fraction of predicted positives that are correct) against recall (the fraction of actual positives that are found) as the classifier's decision threshold is varied. Unlike the ROC curve, which plots the true positive rate against the false positive rate, the PR curve ignores true negatives and focuses directly on positive-class performance.

PR curves are especially informative when dealing with class imbalance.
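As a concrete sketch of the two quantities involved, the following computes precision and recall at a single threshold from raw counts. The labels, scores, and the 0.5 threshold are illustrative, not from the source:

```python
import numpy as np

# Toy ground-truth labels and model scores (illustrative values)
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_scores = np.array([0.9, 0.8, 0.7, 0.6, 0.4, 0.35, 0.3, 0.1])

threshold = 0.5
y_pred = (y_scores >= threshold).astype(int)

tp = np.sum((y_pred == 1) & (y_true == 1))  # true positives
fp = np.sum((y_pred == 1) & (y_true == 0))  # false positives
fn = np.sum((y_pred == 0) & (y_true == 1))  # false negatives

precision = tp / (tp + fp)  # correct predicted positives / all predicted positives
recall = tp / (tp + fn)     # correct predicted positives / all actual positives
```

Raising the threshold typically trades recall for precision; the PR curve traces exactly this trade-off.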

Why It Matters

In many real-world problems, the positive class is rare, and false positives and false negatives carry different costs. In such settings, ROC curves can look overly optimistic because the false positive rate is computed over a large negative class, while PR curves expose the meaningful performance trade-offs.

PR curves help practitioners choose thresholds that balance precision and recall according to application needs.

How It Works (Conceptually)

  • The model outputs scores or probabilities
  • A threshold is swept across possible values
  • Precision and recall are computed at each threshold
  • Points are plotted to form the PR curve

Better models maintain high precision at high recall.
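The threshold sweep described above can be sketched directly. This is a minimal implementation with illustrative data; production libraries such as scikit-learn handle score ties and curve endpoints more carefully:

```python
import numpy as np

def pr_curve(y_true, y_scores):
    """Sweep each unique score as a threshold and collect (recall, precision) points."""
    y_true = np.asarray(y_true)
    y_scores = np.asarray(y_scores)
    points = []
    for t in np.sort(np.unique(y_scores))[::-1]:  # high to low threshold
        y_pred = y_scores >= t
        tp = np.sum(y_pred & (y_true == 1))
        fp = np.sum(y_pred & (y_true == 0))
        fn = np.sum(~y_pred & (y_true == 1))
        precision = tp / (tp + fp) if (tp + fp) else 1.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        points.append((recall, precision))
    return points

# Illustrative labels and scores, not from the source
points = pr_curve([1, 0, 1, 1, 0], [0.9, 0.8, 0.7, 0.4, 0.2])
```

Each point corresponds to one operating threshold; plotting recall on the x-axis and precision on the y-axis yields the PR curve.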

Interpretation

  • Curves closer to the top-right indicate stronger performance
  • A steep drop in precision suggests many false positives at higher recall
  • The baseline depends on class prevalence

PR curves are sensitive to class imbalance.
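The prevalence-dependent baseline mentioned above can be checked numerically: a classifier that predicts positives at random has expected precision equal to the positive-class prevalence. The 1% prevalence here is an illustrative assumption:

```python
import numpy as np

# Illustrative dataset with 1% positive prevalence
y_true = np.array([1] * 10 + [0] * 990)

# A random classifier's expected precision equals the prevalence
baseline_precision = y_true.mean()
```

This is why a PR curve that looks weak on a balanced dataset can be strong on a heavily imbalanced one: the baseline it must beat is much lower.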

Minimal Python Example

from sklearn.metrics import precision_recall_curve

# One precision/recall pair per candidate threshold derived from y_scores
precision, recall, thresholds = precision_recall_curve(y_true, y_scores)


Common Pitfalls

  • Comparing PR curves across datasets with different class distributions
  • Ignoring the baseline precision
  • Reporting a PR curve without specifying the chosen operating threshold
  • Confusing PR curves with ROC curves

Related Concepts

  • Precision
  • Recall
  • F1 Score
  • ROC Curve
  • Class Imbalance
  • Decision Thresholding