Evolutionary Prediction Games¶
Conference: NeurIPS 2025 arXiv: 2503.03401 Code: To be confirmed Area: Other Keywords: evolutionary game theory, prediction algorithms, user feedback loops, competitive exclusion, coexistence mechanisms, fairness
TL;DR¶
This paper proposes the "Evolutionary Prediction Games" framework, applying evolutionary game theory to analyze feedback loops between prediction algorithms and user populations. It shows that ideal learners lead to competitive exclusion (survival of the fittest), whereas practical learners (with finite data, surrogate losses, or overparameterization) can instead foster stable coexistence and mutualism among groups.
Background & Motivation¶
When prediction algorithms serve multiple user groups, prediction accuracy tends to differ across groups. If users increase participation when predictions are accurate (e.g., inviting friends or using the service more frequently) and reduce participation otherwise, a feedback loop emerges: model deployment → population composition shift → model retraining → further population shift. This phenomenon is pervasive in recommendation systems (accurate recommendations attract more similar users), credit scoring (favorable rates attract more customers), and medical diagnosis (higher accuracy attracts more patients).
Existing literature (e.g., performative prediction) studies the influence of models on data distributions but focuses primarily on convergence and global properties, lacking systematic analysis of long-term evolutionary outcomes at the group level. The paper's core insight is that prediction accuracy is essentially a scarce resource over which different groups compete, making evolutionary game theory a natural modeling framework.
Core Problem¶
- How does the long-term composition of user groups evolve under repeated training and deployment of prediction algorithms?
- Under what conditions does "survival of the fittest" (competitive exclusion) occur, and under what conditions can multi-group coexistence be achieved?
- What are the fundamental differences in group-level evolutionary outcomes between ideal and practical learners?
- How can dynamics-aware learning algorithms be designed to actively stabilize coexistence equilibria?
Method¶
Formal Definition of Evolutionary Prediction Games¶
Consider \(K\) user groups, where each group \(k\) has a fixed data distribution \(D_k\) and population proportion \(p_k\), and the overall data distribution is the mixture \(D_{\bm{p}} = \sum_k p_k D_k\). A learning algorithm \(\mathcal{A}\) trains a classifier \(h_{\bm{p}} = \mathcal{A}(D_{\bm{p}})\) on the mixture distribution, and the evolutionary fitness of each group is defined as its marginal prediction accuracy under the shared classifier:

\[ F_k(\bm{p}) = \text{acc}_k(h_{\bm{p}}). \]
The tuple \(F(\bm{p}) = (F_1(\bm{p}), \dots, F_K(\bm{p}))\) defines an Evolutionary Prediction Game. A key feature is the implicit interaction among groups: each group's fitness depends on its own members' accuracy, yet the classifier is trained on the combined data of all groups.
Nash Equilibria and Evolutionary Dynamics¶
Population proportions evolve according to the replicator equation:

\[ \dot{p}_k = p_k \left( F_k(\bm{p}) - \bar{F}(\bm{p}) \right), \]

where \(\bar{F}(\bm{p}) = \sum_{k'} p_{k'} F_{k'}(\bm{p})\) is the global average fitness. At equilibrium, all surviving groups achieve equal fitness, which is no lower than that of any extinct group.
Theorem 1: Competitive Exclusion under Oracle Classifiers¶
When an oracle classifier with unlimited data and computation is used:

- Monotone accuracy: Overall accuracy increases over time, \(\frac{d}{dt} \text{acc}_{\bm{p}}(h_{\bm{p}}) \geq 0\)
- Existence of stable equilibria: At least one stable equilibrium exists
- Competitive exclusion: All stable equilibria are single-group dominant (only one group survives)
- Instability of coexistence: Multi-group coexistence equilibria may exist but are all unstable
The key technical tool in the proof is potential games: the optimality of the oracle classifier implies that overall accuracy \(\text{acc}_{\bm{p}}(h_{\bm{p}})\), as a pointwise maximum of linear functions, is convex; convex functions on a simplex attain local maxima only at vertices, corresponding to single-group dominance.
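The convexity argument can be checked in a toy instance (all numbers hypothetical): model the oracle as choosing, for each mixture \(\bm{p}\), the best of a finite pool of candidate classifiers with known per-group accuracies. Overall oracle accuracy is then a pointwise maximum of linear functions of \(\bm{p}\), hence convex, and its maximum over the simplex sits at a vertex.

```python
import numpy as np

# Hypothetical candidate pool: rows are classifiers, columns are
# per-group accuracies (one specialist per group, one generalist).
A = np.array([[0.95, 0.60],
              [0.60, 0.92],
              [0.80, 0.80]])

def oracle_accuracy(p):
    # Oracle overall accuracy: max over classifiers of p-weighted accuracy,
    # i.e. a pointwise maximum of linear functions of p -> convex in p.
    return np.max(A @ p)

# Scan the 2-group simplex: the maximum is attained at a vertex,
# i.e. under single-group dominance.
grid = np.linspace(0, 1, 101)
values = [oracle_accuracy(np.array([q, 1 - q])) for q in grid]
best = grid[int(np.argmax(values))]
```

Any mixed population here is strictly worse for the oracle than letting the better-served group take over, which is the geometric content of the competitive-exclusion theorem.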
Three Mechanisms for Coexistence¶
The paper demonstrates how three "imperfections" of practical learning can instead promote stable coexistence:
1. Surrogate Loss: Replacing the 0-1 loss with hinge loss introduces bias toward minority classes. When the minority classes of two groups are complementary, the biases cancel at a mixed equilibrium, yielding an optimal outcome. Theorem 2 constructs a concrete instance in which SVMs (hinge loss + \(\ell_2\) regularization) induce a stable, fitness-maximizing mixed equilibrium.
2. Finite Data: Finite samples introduce estimation error that breaks the strict optimality of the oracle classifier, causing the fitness function to be no longer strictly convex and thus permitting stable mixed equilibria.
3. Interpolation / Overparameterization: The tendency of overparameterized neural networks to memorize training data introduces implicit bias. When label noise across groups is complementary, the label noise of a larger group is "diluted" by the clean samples of a smaller group, creating a natural balancing mechanism.
Stabilization Algorithm¶
For unstable coexistence equilibria \(\bm{p}^*\) under ideal classifiers, the paper proposes retraining at a "virtual state" to achieve stabilization:

\[ h = \mathcal{A}\big(D_{2\bm{p}^* - \bm{p}}\big). \]
That is, when the actual state is \(\bm{p}\), the classifier is trained with samples reweighted according to \(2\bm{p}^* - \bm{p}\). This reverses the direction of the dynamics, transforming an unstable equilibrium into a stable attractor.
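A minimal simulation of this reflection idea, under illustrative assumptions: the fitness function below is a toy "rich-get-richer" game whose interior equilibrium at \(\bm{p}^* = (0.5, 0.5)\) is unstable under ordinary dynamics, and the clipping of the virtual state to the simplex is an implementation detail not specified here.

```python
import numpy as np

def stabilized_replicator_step(p, F, p_star, dt=0.01):
    """Replicator step where the learner is trained at the virtual state
    2*p_star - p instead of p (a sketch of the paper's reweighting idea)."""
    virtual = np.clip(2 * p_star - p, 0.0, None)
    virtual = virtual / virtual.sum()   # keep the virtual state on the simplex
    fitness = F(virtual)                # classifier trained as if state were `virtual`
    f_bar = p @ fitness
    p_next = p + dt * p * (fitness - f_bar)
    return p_next / p_next.sum()

# Toy fitness with an UNSTABLE interior equilibrium at p = (0.5, 0.5):
# each group's fitness grows with its own share (rich-get-richer).
def F(p):
    return np.array([0.8 + 0.1 * p[0], 0.8 + 0.1 * p[1]])

p_star = np.array([0.5, 0.5])
p = np.array([0.6, 0.4])                # perturbed away from the equilibrium
for _ in range(20_000):
    p = stabilized_replicator_step(p, F, p_star)

# Reflecting through p_star reverses the drift, pulling p back to p_star.
```

Because the reflection swaps which side of \(\bm{p}^*\) the learner "sees", the fitness advantage flips sign and the formerly repelling equilibrium becomes an attractor.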
Connection to Fairness¶
The paper proves an elegant proposition: at any Nash equilibrium of an evolutionary prediction game, the classifier satisfies overall accuracy equality in expectation — all surviving groups receive equal prediction accuracy. However, this "fairness" may be illusory: some groups may have already been competitively excluded, and fairness holds only among the remaining groups.
Key Experimental Results¶
Experiment 1: CIFAR-10 Data Augmentation Coexistence¶
- Setup: Group A uses original CIFAR-10 images; Group B uses horizontally flipped images; classifier is ResNet-9
- Results: Two stable single-group equilibria achieve 92.6±0.1% accuracy; the unstable coexistence equilibrium achieves 93.5±0.1% (higher)
- The stabilization algorithm successfully converts the coexistence equilibrium to a stable one, realizing mutualism
Experiment 2: MNIST Label Noise Stable Coexistence¶
- Setup: Group A is biased toward even digits; Group B toward odd digits (4:1 imbalance); majority class has 20% label noise
- Results: An overparameterized CNN naturally produces a stable coexistence equilibrium with test accuracy 80.4±0.2% (theoretical upper bound: 84%)
- Training accuracy is 98.7±0.1%, confirming the presence of interpolation
- Key observation: the fitness function "flips" — the minority group achieves higher accuracy
Experiment 3: ACSIncome Three-Group Fairness¶
- Setup: \(K=3\), corresponding to income prediction data from California, New York, and Texas; linear SVM
- Results: Starting from a uniform distribution, the New York group is excluded at \(t \approx 316\); California and Texas then continue competing
- Inter-group accuracy gap narrows from ~2% initially to ~0.2%, but only because the "weakest" group has been eliminated
- Reveals the fundamental limitations of static fairness metrics
Highlights & Insights¶
- Interdisciplinary innovation: Importing the competitive exclusion principle from evolutionary game theory into the analysis of user feedback loops in machine learning provides a fundamentally new theoretical perspective.
- Theoretical depth: Rigorous competitive exclusion theorems are established via potential games and convex analysis, with an elegant proof (convex functions attain extrema at simplex vertices).
- Counter-intuitive insight: "Flaws" in learning algorithms (surrogate losses, finite data, overparameterization) are precisely the mechanisms that enable group coexistence, challenging the conventional pursuit of perfect learners.
- Stabilization algorithm: A concise and practical reweighting method (Proposition 2) is proposed to convert unstable coexistence into stable coexistence.
- Evolutionary perspective on fairness: Current fairness may be the result of historical competitive exclusion; the paper introduces a framework for counterfactual fairness reasoning.
Limitations & Future Work¶
- Strong assumptions on group structure: Groups are assumed to be non-overlapping with fixed within-group distributions; in practice, group boundaries are fuzzy and within-group distributions also drift.
- Accuracy-only fitness: Real user participation is influenced by multiple factors (price, habit, social pressure, etc.) beyond prediction accuracy.
- No analysis of convergence speed: Theorem 1 describes limiting behavior, but real-world deployment cycles may be far from equilibrium.
- No exogenous factors: External events (e.g., marketing campaigns, policy interventions) affecting group proportions are not considered.
- Stabilization algorithm requires knowledge of the target equilibrium \(\bm{p}^*\): This may be difficult to estimate precisely in practice.
- Limited experimental scale: All three experiments involve two to three groups in simplified settings; behavior for large \(K\) remains unexplored.
Related Work & Insights¶
| Aspect | Ours | Performative Prediction | Long-term Fairness Literature |
|---|---|---|---|
| Modeling framework | Evolutionary game theory | Distribution mapping | Markov decision processes |
| Object of analysis | Group proportion dynamics | Distributional convergence | Group qualification distributions |
| Core tools | Potential games, replicator equation | Lipschitz continuity | Causal inference |
| Equilibrium concept | Evolutionarily stable strategy (ESS) | Fixed-point distribution | Fair steady state |
| Unique contribution | Competitive exclusion theorem + coexistence mechanisms | Retraining convergence guarantees | Intervention strategy design |
Compared to Dean et al. (2024) on participation dynamics, this paper provides a more complete evolutionary game-theoretic foundation and systematically analyzes multiple mechanisms for coexistence. Compared to Hashimoto et al. (2018) on minimizing loss for minority groups, this paper not only addresses the protection of marginalized groups but also reveals the possibility of mutualism among groups.
Inspirations & Connections¶
- System design implications: Before deploying ML systems, feedback dynamics among user groups should be assessed to anticipate which groups may be competitively excluded.
- New understanding of regularization: Data from other groups in a mixture serves as a form of "natural regularization"; each group's objective can be rewritten as \(\arg\max_h \text{acc}_k(h) + \lambda R(h)\), where \(R\) is contributed by the other groups.
- Ecological analogy: Group dynamics in ML systems resemble species succession in ecosystems; "conservation" thinking from ecology can be transferred to ML system design.
- Extensible directions: Incorporating user churn/acquisition dynamics from recommendation systems to extend the framework to more complex fitness functions.
Rating¶
- Novelty: ⭐⭐⭐⭐⭐ (Evolutionary game-theoretic perspective is pioneering in ML feedback loop analysis)
- Experimental Thoroughness: ⭐⭐⭐⭐ (Three experiments each validate core theory, but scale and number of groups are limited)
- Writing Quality: ⭐⭐⭐⭐⭐ (Clear logic, tight integration of theory and experiments, excellent figures)
- Value: ⭐⭐⭐⭐⭐ (Provides a foundational theoretical framework for the long-term societal impact of ML systems)