Neural Network Lexicon

Feature Availability

Short Definition

Feature availability describes which input features are accessible at the moment a prediction is made.

Definition

Feature availability refers to the set of features that are legitimately known and computable at prediction time. It accounts for temporal constraints, system latency, data dependencies, and operational realities that determine whether a feature can be used without leaking future information.

A feature that exists in the dataset is not necessarily available at inference time.

Why It Matters

Using unavailable features leads to data leakage, inflated offline performance, and brittle models that fail in production. Many real-world failures occur because features were engineered or selected without respecting when and how information becomes available.

Respecting feature availability preserves causal validity: the model sees only information that existed before the prediction was made.

What Determines Feature Availability

Availability is governed by:

event time vs processing time
label latency and verification delays
upstream system dependencies
aggregation windows and update frequency
real-time vs batch computation constraints
access permissions and privacy rules

Availability is a systems property, not just a data property.
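The gap between event time and processing time can be made concrete with a small sketch. The `Event` record, its field names, and the timestamps below are hypothetical and chosen only for illustration; the point is that filtering history by event time can admit records the serving system could not yet have read.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    amount: float
    event_time: datetime      # when the event happened
    processed_time: datetime  # when the record became readable downstream


def visible_events(events, prediction_time):
    """Keep only events the serving system could have read at prediction_time."""
    return [e for e in events if e.processed_time <= prediction_time]


t0 = datetime(2024, 1, 10, 12, 0)  # hypothetical prediction time
events = [
    # happened and landed before t0
    Event(20.0, datetime(2024, 1, 8, 9, 0), datetime(2024, 1, 8, 10, 0)),
    # happened before t0 but landed after it (e.g. a nightly batch load)
    Event(35.0, datetime(2024, 1, 10, 11, 0), datetime(2024, 1, 11, 2, 0)),
    # happened after t0
    Event(50.0, datetime(2024, 1, 12, 9, 0), datetime(2024, 1, 12, 10, 0)),
]

# Filtering on event time alone keeps a record that was not yet readable at t0.
by_event_time = [e for e in events if e.event_time <= t0]
# Filtering on processing time reflects what was actually available.
visible = visible_events(events, t0)
```

Here the second event exists in the historical data and predates the prediction, yet it would not have been available when the prediction was served.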

Feature Availability vs Feature Existence

Feature existence: present in historical data
Feature availability: accessible at prediction time

Conflating the two introduces leakage.

Minimal Conceptual Example

			
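# illustrative daily spend; the prediction is made at the start of day t
spend = [12.0, 8.0, 20.0, 15.0, 30.0]
t = 3
# invalid (unavailable at prediction time): averages spend from day t onward
feature = sum(spend[t:]) / len(spend[t:])
# valid (available at prediction time): averages spend observed before day t
feature = sum(spend[:t]) / len(spend[:t])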

Feature Availability in Training and Evaluation

Training and evaluation must:

restrict features to those available at prediction time
compute features using the same availability rules
exclude backfilled or future-derived values
align feature pipelines across offline and online settings

A mismatch between the offline and online feature rules invalidates the evaluation.
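One way to keep offline and online pipelines aligned is to route both through a single feature function that takes an explicit "as of" timestamp. The following is a minimal sketch under that assumption; `spend_last_7d`, `build_training_rows`, `serve_features`, and the sample data are hypothetical names and values, not part of any particular library.

```python
from datetime import datetime, timedelta


def spend_last_7d(transactions, as_of):
    """Total spend in the 7 days before as_of, using only records before as_of."""
    start = as_of - timedelta(days=7)
    return sum(amount for ts, amount in transactions if start <= ts < as_of)


def build_training_rows(transactions, labelled_predictions):
    """Offline path: one row per historical prediction, computed as of its timestamp."""
    return [
        {
            "as_of": as_of,
            "spend_last_7d": spend_last_7d(transactions, as_of),
            "label": label,
        }
        for as_of, label in labelled_predictions
    ]


def serve_features(transactions, now):
    """Online path: the same function, evaluated at the current time."""
    return {"spend_last_7d": spend_last_7d(transactions, now)}


transactions = [
    (datetime(2024, 5, 1), 10.0),
    (datetime(2024, 5, 4), 25.0),
    (datetime(2024, 5, 9), 40.0),
]
labelled_predictions = [(datetime(2024, 5, 5), 0), (datetime(2024, 5, 10), 1)]

rows = build_training_rows(transactions, labelled_predictions)
online = serve_features(transactions, datetime(2024, 5, 10))
```

Because each training row is computed as of its own prediction timestamp, the first row never sees the transaction from 9 May, and the second row matches what the online path would have produced at the same moment.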

Relationship to Temporal Leakage

Violating feature availability is a primary cause of:

temporal feature leakage
processing-time leakage
validation leakage in time-dependent tasks

Availability errors are often subtle and pervasive.
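For validation leakage in time-dependent tasks, a common safeguard is to split by time rather than at random. The sketch below assumes each row carries a hypothetical `as_of` prediction timestamp; the rows and the cutoff date are illustrative.

```python
from datetime import datetime


def temporal_split(rows, cutoff):
    """Train on rows predicted before cutoff, validate on rows at or after it."""
    train = [r for r in rows if r["as_of"] < cutoff]
    valid = [r for r in rows if r["as_of"] >= cutoff]
    return train, valid


rows = [
    {"as_of": datetime(2024, 1, day), "label": day % 2}
    for day in (3, 10, 17, 24, 31)
]
train, valid = temporal_split(rows, cutoff=datetime(2024, 1, 20))

# Every training row precedes every validation row, mirroring deployment order.
latest_train = max(r["as_of"] for r in train)
earliest_valid = min(r["as_of"] for r in valid)
```

A temporal split only addresses the ordering of rows. Features inside each row still need to be computed as of that row's prediction time, otherwise temporal feature leakage remains.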

Feature Availability and Model Design

Feature availability can influence:

model complexity and latency
achievable performance ceilings
retraining cadence
monitoring and alerting design

Sometimes the “best” feature cannot be used safely.

Common Pitfalls

selecting features based on offline correlation alone
using aggregates that span beyond the prediction cutoff
assuming real-time access to batch-computed features
training with backfilled data without adjusting timelines
failing to document feature availability assumptions

Availability must be explicit and documented.
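Availability assumptions can be documented in a machine-readable form so they are checked rather than remembered. The sketch below is one possible shape for such a record; the `FeatureSpec` fields, the feature names, the source names, and the delay values are all hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class FeatureSpec:
    name: str
    source: str           # upstream system that produces the value
    refresh: str          # "streaming" or "batch"
    max_delay: timedelta  # worst-case lag between an event and a readable value


SPECS = [
    FeatureSpec("spend_last_7d", "payments_stream", "streaming", timedelta(minutes=1)),
    FeatureSpec("avg_basket_size", "nightly_warehouse_job", "batch", timedelta(hours=24)),
    FeatureSpec("chargeback_flag", "disputes_team", "batch", timedelta(days=45)),
]


def usable_features(specs, tolerated_delay):
    """Names of features whose documented worst-case delay fits the use case."""
    return [s.name for s in specs if s.max_delay <= tolerated_delay]


realtime = usable_features(SPECS, timedelta(minutes=5))
daily = usable_features(SPECS, timedelta(days=1))
```

With a registry like this, a feature that is only batch-computed or verified weeks later cannot silently enter a real-time model: it is filtered out by its documented delay.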

Relationship to Generalization

Models that rely on unavailable features appear to generalize well offline but collapse when deployed, because those features are missing, stale, or different at inference time. Respecting feature availability yields more conservative but trustworthy generalization estimates.

Relationship to Evaluation Protocols

Evaluation protocols must enforce feature availability rules. Allowing unavailable features during validation or testing constitutes evaluation leakage.
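An evaluation protocol can enforce this rule with an explicit guard that rejects any row containing a feature observed after the prediction time. The following is a minimal sketch, assuming each feature value is stored together with the timestamp at which it became known; the row layout, `AvailabilityError`, and `check_availability` are hypothetical.

```python
from datetime import datetime


class AvailabilityError(ValueError):
    """Raised when a feature value was observed after its prediction time."""


def check_availability(row):
    """Validate one row; row["features"] maps name -> (value, observed_at)."""
    late = sorted(
        name
        for name, (_, observed_at) in row["features"].items()
        if observed_at > row["prediction_time"]
    )
    if late:
        raise AvailabilityError(f"unavailable at prediction time: {late}")


ok_row = {
    "prediction_time": datetime(2024, 6, 1),
    "features": {"spend_last_7d": (42.0, datetime(2024, 5, 31))},
}
leaky_row = {
    "prediction_time": datetime(2024, 6, 1),
    "features": {
        "spend_last_7d": (42.0, datetime(2024, 5, 31)),
        "chargeback_flag": (1, datetime(2024, 7, 15)),  # verified weeks later
    },
}

check_availability(ok_row)  # passes silently

try:
    check_availability(leaky_row)
    leak_detected = False
except AvailabilityError:
    leak_detected = True
```

Running such a check over validation and test sets before scoring turns an availability violation into a hard failure instead of an inflated metric.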

Related Concepts