Precision–Recall Curve

Short Definition

A Precision–Recall (PR) curve shows the trade-off between precision and recall across decision thresholds.

Definition

The Precision–Recall curve plots precision (the fraction of predicted positives that are correct) against recall (the fraction of actual positives that are found) as the classifier's decision threshold is varied. Unlike the ROC curve, which plots the true positive rate against the false positive rate, the PR curve ignores true negatives and focuses directly on positive-class performance.

PR curves are especially informative when dealing with class imbalance.
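As a concrete sketch of the two quantities involved, the following computes precision and recall at a single threshold from raw counts. The labels, scores, and the 0.5 threshold are illustrative, not from the source:

```python
import numpy as np

# Toy ground-truth labels and model scores (illustrative values)
y_true = np.array([1, 0, 1, 1, 0, 0, 1, 0])
y_scores = np.array([0.9, 0.8, 0.7, 0.6, 0.4, 0.35, 0.3, 0.1])

threshold = 0.5
y_pred = (y_scores >= threshold).astype(int)

tp = np.sum((y_pred == 1) & (y_true == 1))  # true positives
fp = np.sum((y_pred == 1) & (y_true == 0))  # false positives
fn = np.sum((y_pred == 0) & (y_true == 1))  # false negatives

precision = tp / (tp + fp)  # correct predicted positives / all predicted positives
recall = tp / (tp + fn)     # correct predicted positives / all actual positives
```

Raising the threshold typically trades recall for precision; the PR curve traces exactly this trade-off.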

Why It Matters

In many real-world problems, the positive class is rare, and false positives and false negatives carry different costs. In such settings, ROC curves can look overly optimistic because the false positive rate is computed over a large negative class, while PR curves expose the meaningful performance trade-offs.

PR curves help practitioners choose thresholds that balance precision and recall according to application needs.

How It Works (Conceptually)

  • The model outputs scores or probabilities
  • A threshold is swept across possible values
  • Precision and recall are computed at each threshold
  • Points are plotted to form the PR curve

Better models maintain high precision at high recall.
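The threshold sweep described above can be sketched directly. This is a minimal implementation with illustrative data; production libraries such as scikit-learn handle score ties and curve endpoints more carefully:

```python
import numpy as np

def pr_curve(y_true, y_scores):
    """Sweep each unique score as a threshold and collect (recall, precision) points."""
    y_true = np.asarray(y_true)
    y_scores = np.asarray(y_scores)
    points = []
    for t in np.sort(np.unique(y_scores))[::-1]:  # high to low threshold
        y_pred = y_scores >= t
        tp = np.sum(y_pred & (y_true == 1))
        fp = np.sum(y_pred & (y_true == 0))
        fn = np.sum(~y_pred & (y_true == 1))
        precision = tp / (tp + fp) if (tp + fp) else 1.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        points.append((recall, precision))
    return points

# Illustrative labels and scores, not from the source
points = pr_curve([1, 0, 1, 1, 0], [0.9, 0.8, 0.7, 0.4, 0.2])
```

Each point corresponds to one operating threshold; plotting recall on the x-axis and precision on the y-axis yields the PR curve.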

Interpretation

  • Curves closer to the top-right indicate stronger performance
  • A steep drop in precision suggests many false positives at higher recall
  • The baseline depends on class prevalence

PR curves are sensitive to class imbalance.
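The prevalence-dependent baseline mentioned above can be checked numerically: a classifier that predicts positives at random has expected precision equal to the positive-class prevalence. The 1% prevalence here is an illustrative assumption:

```python
import numpy as np

# Illustrative dataset with 1% positive prevalence
y_true = np.array([1] * 10 + [0] * 990)

# A random classifier's expected precision equals the prevalence
baseline_precision = y_true.mean()
```

This is why a PR curve that looks weak on a balanced dataset can be strong on a heavily imbalanced one: the baseline it must beat is much lower.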

Minimal Python Example

from sklearn.metrics import precision_recall_curve

# One precision/recall pair per candidate threshold derived from y_scores
precision, recall, thresholds = precision_recall_curve(y_true, y_scores)


Common Pitfalls

  • Comparing PR curves across datasets with different class distributions
  • Ignoring the baseline precision
  • Reporting a PR curve without specifying the chosen operating threshold
  • Confusing PR curves with ROC curves

Related Concepts

  • Precision
  • Recall
  • F1 Score
  • ROC Curve
  • Class Imbalance
  • Decision Thresholding