Counterfactual Logging

Short Definition

Counterfactual logging is the practice of recording not only the action taken by a model, but also the alternative actions that could have been taken and their associated contexts.

Definition

Counterfactual logging captures information needed to reason about what would have happened if a different decision had been made. Instead of logging only the chosen prediction or action, systems log additional metadata—such as action probabilities, candidate rankings, or randomized alternatives—enabling causal and off-policy evaluation.

Without counterfactuals, causality cannot be reconstructed.

Why It Matters

Once a model is deployed, its decisions influence which outcomes are observed. Outcomes for actions not taken remain unobserved, creating selection bias. Counterfactual logging provides the data foundation needed to estimate causal effects, evaluate alternative policies, and mitigate feedback-loop bias.

You cannot evaluate what you never observe.

What Is Logged

Depending on the system, counterfactual logs may include:

  • the action taken
  • alternative actions considered
  • action probabilities or propensities
  • model scores or rankings
  • context features at decision time
  • timestamps and policy version

Logs encode decision uncertainty.
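As a minimal sketch, the fields above could be captured in a record like the following (all names here are illustrative, not a standard schema):

```python
from dataclasses import dataclass, field
import time

@dataclass
class DecisionLog:
    """One counterfactual log record (field names are illustrative)."""
    context: dict            # features available at decision time
    candidates: list         # all actions considered, taken and untaken
    propensities: dict       # action -> probability under the logging policy
    action: str              # the action actually taken
    policy_version: str      # identifies the policy that made the decision
    timestamp: float = field(default_factory=time.time)

# hypothetical example record
record = DecisionLog(
    context={"user_segment": "new", "hour": 14},
    candidates=["A", "B", "C"],
    propensities={"A": 0.7, "B": 0.2, "C": 0.1},
    action="A",
    policy_version="ranker-v3",
)
```

Storing the full propensity map, not just the chosen action's probability, keeps later estimators free to reason about any candidate.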

Minimal Conceptual Illustration


Context → {Action A (taken), Action B (not taken), Action C (not taken)}
          probabilities logged for all three actions, not only the one taken

Relationship to Causal Evaluation

Counterfactual logging enables causal evaluation by supporting:

  • inverse propensity scoring
  • off-policy evaluation
  • counterfactual risk estimation
  • unbiased comparison of policies

Causal claims require counterfactual data.
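The first of these, inverse propensity scoring, can be sketched in a few lines. This is a toy illustration under assumed log fields (`context`, `action`, `propensity`, `reward`), not a production estimator:

```python
def ips_estimate(logs, target_policy):
    """Inverse propensity scoring: estimate the average reward the
    target policy would have earned on the logged contexts."""
    total = 0.0
    for entry in logs:
        # reweight by P(action | target) / P(action | logging policy)
        weight = target_policy(entry["context"], entry["action"]) / entry["propensity"]
        total += weight * entry["reward"]
    return total / len(logs)

# toy logs from a uniform logging policy over two actions
logs = [
    {"context": {}, "action": "A", "propensity": 0.5, "reward": 1.0},
    {"context": {}, "action": "B", "propensity": 0.5, "reward": 0.0},
]

# target policy that always picks action "A"
always_a = lambda ctx, a: 1.0 if a == "A" else 0.0

est = ips_estimate(logs, always_a)  # → 1.0
```

Note that the estimator is only usable because the logging policy recorded nonzero propensities for every action it took.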

Relationship to Feedback Loops

Feedback loops censor outcomes for actions not taken. Counterfactual logging helps expose this censoring by preserving information about suppressed alternatives.

Logs break feedback opacity.

Role in Online vs Offline Evaluation

Offline datasets typically lack counterfactual information. Online systems that log propensities or randomized actions can later support offline causal analysis without rerunning experiments.

Logging turns online actions into offline evidence.

Use in Policy Evaluation

Counterfactual logs allow teams to:

  • evaluate new models without deployment
  • simulate alternative thresholds or policies
  • compare ranking strategies
  • audit past decisions retrospectively

Decisions can be re-evaluated safely.
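For instance, simulating alternative thresholds can be done by reweighting logged outcomes with the same inverse-propensity idea. A hedged sketch, assuming each log entry carries a model `score`, the logged `action`, its `propensity`, and an observed `reward`:

```python
def threshold_value(logs, threshold):
    """Estimate, via inverse propensity weighting, the average reward of
    a deterministic policy: act when the logged score exceeds threshold."""
    total = 0.0
    for e in logs:
        intended = "act" if e["score"] > threshold else "skip"
        if intended == e["action"]:           # target agrees with logged action
            total += e["reward"] / e["propensity"]
    return total / len(logs)

# toy logs (values are made up for illustration)
logs = [
    {"score": 0.9, "action": "act",  "propensity": 0.8, "reward": 1.0},
    {"score": 0.4, "action": "act",  "propensity": 0.3, "reward": 0.5},
    {"score": 0.2, "action": "skip", "propensity": 0.7, "reward": 0.2},
]

v_low = threshold_value(logs, 0.3)   # a looser acting threshold
v_high = threshold_value(logs, 0.5)  # a stricter acting threshold
```

The same logged traffic yields value estimates for both thresholds, so the choice between them can be audited without deploying either.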

Requirements and Constraints

Effective counterfactual logging requires:

  • some degree of randomness or exploration
  • stable policy identifiers
  • careful data storage and privacy handling
  • sufficient coverage of alternative actions

Purely deterministic policies assign zero probability to untaken actions, so their logs cannot support unbiased counterfactual estimates.

Risks and Limitations

  • increased system complexity
  • higher logging and storage costs
  • incomplete coverage of action space
  • sensitivity to logging errors
  • misuse without causal expertise

Bad logs create false confidence.

Common Pitfalls

  • logging scores without action probabilities
  • changing policies without version tracking
  • assuming counterfactual validity without exploration
  • ignoring bias introduced by partial logging
  • treating logged alternatives as true outcomes

Counterfactuals are estimates, not facts.

Relationship to Outcome-Aware Evaluation

Outcome-aware evaluation asks whether outcomes improved. Counterfactual logging enables attribution—determining whether improvements were caused by the model or by external factors.

Outcomes need explanations.

Role in Evaluation Governance

Governance should define:

  • when counterfactual logging is required
  • minimum logging standards
  • audit procedures for logged data
  • acceptable use cases and limitations

Causal evidence requires disciplined logging.

Summary Characteristics

Aspect             Counterfactual Logging
Purpose            Enable causal inference
Data captured      Taken + untaken actions
Dependency         Exploration or randomness
Evaluation role    Foundational
Complexity         High

Related Concepts

  • Generalization & Evaluation
  • Causal Evaluation
  • Feedback Loops
  • Outcome-Aware Evaluation
  • Online vs Offline Evaluation
  • Off-Policy Evaluation
  • Exploration vs Exploitation
  • Model Update Policies