Breaking the Gradient Barrier: Unveiling Large Language Models for Strategic Classification¶
Conference: NEURIPS2025 arXiv: 2511.06979 Code: To be confirmed Area: Robotics Keywords: Strategic Classification, in-context learning, Large Language Models, Bi-level Optimization, Gradient-free
TL;DR¶
This paper proposes GLIM (Gradient-free Learning In-context Method), which for the first time leverages the In-Context Learning (ICL) mechanism of LLMs to implicitly simulate the bi-level optimization in strategic classification (feature manipulation + decision rule optimization), enabling efficient strategic classification on large-scale data without any fine-tuning.
Background & Motivation¶
Strategic Classification (SC) studies how individuals modify their own features to obtain favorable classification outcomes, with typical scenarios including loan approval, college admissions, and phishing website detection. This problem is typically formulated as a bi-level optimization under a Stackelberg game:
- Inner level (Strategic Manipulation): Upon observing the decision rule \(f\), individuals modify features \(\mathbf{x} \to \mathbf{x}'\) to maximize their utility subject to manipulation cost constraints.
- Outer level (Decision Rule Optimization): The decision-maker designs a classification rule \(f^*\) that maintains high accuracy even after individuals' strategic manipulation.
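The bi-level structure above can be written schematically. The utility \(u\), cost function \(c\), and exact objective forms below are illustrative assumptions, not the paper's precise formulation:

```latex
% Outer level: the decision-maker anticipates individuals' best responses.
f^* \in \arg\max_{f} \;
  \mathbb{E}_{(\mathbf{x}, y)}\!\left[ \mathbb{1}\{ f(\mathbf{x}') = y \} \right],
  \qquad \mathbf{x}' = \Delta(\mathbf{x}; f)

% Inner level: each individual best-responds to the observed rule f,
% trading off the benefit of a favorable outcome against manipulation cost.
\Delta(\mathbf{x}; f) \in \arg\max_{\mathbf{x}''} \;
  u\!\big(f(\mathbf{x}'')\big) - c(\mathbf{x}, \mathbf{x}'')
```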
Existing SC methods are almost exclusively built on linear models or shallow MLPs and validated only on small-scale datasets (e.g., Adult and Spam, each under 50K samples). Real-world scenarios in financial services and internet security, however, involve millions of dynamically evolving samples or more, and traditional methods cannot scale because they rely on gradient computation and repeated retraining.
LLMs possess the capacity to model high-dimensional dynamic inputs, but directly fine-tuning LLMs for SC is prohibitively expensive, while using them without fine-tuning makes it difficult to model the bi-level optimization structure — this constitutes the core challenge addressed in this paper.
Core Problem¶
- How can ICL be used to simulate individual feature manipulation in strategic classification without fine-tuning the LLM?
- How can ICL guide LLMs to adjust decision rules to counter strategic manipulation?
- Can it be theoretically proven that the forward pass of ICL is equivalent to gradient descent optimization in traditional SC?
Method¶
Theoretical Foundation: ICL as Implicit Gradient Descent¶
Building on existing theory (Akyürek et al.; Ahn et al.), this paper establishes that the forward pass of a linear self-attention layer can be interpreted as one step of gradient descent on a regression loss \(R_{w_\star}\), with the implicit weights updated as \(w_{\ell+1}^{\text{gd}} = w_\ell^{\text{gd}} - A_\ell \nabla R_{w_\star}(w_\ell^{\text{gd}})\).
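This equivalence can be checked numerically for in-context linear regression. The construction below (residuals as attention values, key-query inner products as attention scores) follows the standard linear-attention argument and is a sketch, not the paper's exact parameterization:

```python
import numpy as np

rng = np.random.default_rng(0)

# In-context examples (x_i, y_i) and a query x_q for linear regression.
d, n, eta = 4, 16, 0.05
X = rng.normal(size=(n, d))
w_star = rng.normal(size=d)
y = X @ w_star
x_q = rng.normal(size=d)
w0 = np.zeros(d)                      # initial implicit weights

# (a) Explicit gradient descent: one step on R(w) = 1/2 * ||X w - y||^2.
grad = X.T @ (X @ w0 - y)
w1 = w0 - eta * grad
pred_gd = w1 @ x_q

# (b) Linear self-attention forward pass: keys = x_i, query = x_q,
# values = residuals (w0.x_i - y_i); the attention output shifts the
# query prediction by exactly the gradient-descent amount.
residuals = X @ w0 - y                # "value" entries
attn_scores = X @ x_q                 # unnormalized key-query products
pred_icl = w0 @ x_q - eta * attn_scores @ residuals

print(np.allclose(pred_gd, pred_icl))  # → True
```

The two predictions agree term-by-term: expanding \((w_0 - \eta X^\top(Xw_0 - y)) \cdot x_q\) gives exactly the attention form in (b).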
GLIM: Bi-level Implicit Gradient Optimization¶
Inner Level — Strategic Manipulation Simulation (Proposition 1):
In traditional SC, individuals solve for the optimal feature perturbation \(\Delta\mathbf{x}_i^{\text{GD}}\) via gradient descent on their utility. This paper proves that there exist pretrained self-attention weight matrices \(\mathbf{P}, \mathbf{V}, \mathbf{K}\) such that the ICL-induced feature update matches it exactly: \(\Delta\mathbf{x}_i^{\text{ICL}} = \Delta\mathbf{x}_i^{\text{GD}}\).
That is, the forward pass of the LLM can exactly reproduce the feature manipulation produced by traditional gradient descent. Positive-class individuals (\(y_i=1\)) have no manipulation incentive, while negative-class individuals implicitly complete feature modification through the attention mechanism.
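A minimal sketch of the inner-level best response, assuming a linear rule \(f(\mathbf{x}) = \mathbf{w} \cdot \mathbf{x}\) and a quadratic manipulation cost; both choices are illustrative, not the paper's exact setup:

```python
import numpy as np

# A negative-class individual maximizes utility w.x' - c * ||x' - x||^2
# by gradient ascent on x' (strategic transparency: w is known to them).
w = np.array([1.0, -0.5, 2.0])   # decision rule weights
x = np.array([0.2, 0.4, -0.1])   # original features
c, eta = 1.0, 0.1                # quadratic manipulation cost, step size

x_prime = x.copy()
for _ in range(200):
    grad_utility = w - 2 * c * (x_prime - x)   # d/dx' [w.x' - c||x'-x||^2]
    x_prime = x_prime + eta * grad_utility

# With quadratic cost the best response has a closed form: x + w / (2c).
print(np.allclose(x_prime, x + w / (2 * c)))   # → True
```

Positive-class individuals (\(y_i = 1\)) would simply keep \(\mathbf{x}' = \mathbf{x}\), matching the no-incentive case described above.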
Outer Level — Decision Rule Optimization (Proposition 2):
Traditional SC optimizes decision weights \(W\) via a cross-entropy loss, producing prediction updates \(\Delta \hat{y}_j^{\text{GD}} = \Delta W \cdot \mathbf{x}_j'\). This paper likewise proves that self-attention parameters can be constructed so that the ICL-induced prediction update reproduces it: \(\Delta \hat{y}_j^{\text{ICL}} = \Delta \hat{y}_j^{\text{GD}}\).
That is, ICL can simulate the outer-level decision rule optimization without updating any parameters.
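The outer level can be sketched as alternating inner-level best responses with cross-entropy gradient steps on the manipulated features. The quadratic-cost best response and all constants below are assumptions for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Illustrative outer-level loop: repeatedly retrain a linear rule w on
# best-response features x' (quadratic cost c, as in the inner-level sketch).
rng = np.random.default_rng(1)
n, d, c, eta = 64, 3, 5.0, 0.5
X = rng.normal(size=(n, d))
y = (X @ np.array([1.0, -1.0, 0.5]) > 0).astype(float)
w = np.zeros(d)

for _ in range(100):
    # Inner level: negative-class individuals apply their best response
    # x' = x + w/(2c); positive-class individuals have no incentive to move.
    X_prime = X + (y[:, None] == 0) * (w / (2 * c))
    # Outer level: one cross-entropy gradient step on manipulated features.
    p = sigmoid(X_prime @ w)
    w -= eta * X_prime.T @ (p - y) / n

print(np.allclose(X_prime[y == 1], X[y == 1]))  # → True (positives never move)
```

GLIM's claim is that this entire explicit loop is implicitly reproduced by the LLM's forward pass, with no parameter updates at all.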
Practical Pipeline: Labeled samples \(\{(\mathbf{x}_i', y_i)\}\) are provided as a prompt to the LLM, and new samples serve as query tokens. The LLM implicitly completes the bi-level optimization through the forward pass of self-attention and outputs the classification result. The entire process requires no fine-tuning and directly invokes pretrained LLM APIs (e.g., GPT-4o).
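The pipeline might be sketched as below. `build_glim_prompt` and the prompt format are hypothetical (the paper's exact template is not reproduced here), and the actual LLM API call is omitted:

```python
# Hypothetical prompt construction for the GLIM pipeline: labeled (possibly
# manipulated) samples become in-context examples; the new sample is the query.
def build_glim_prompt(examples, query_features):
    lines = ["Classify each sample as 1 (accept) or 0 (reject)."]
    for features, label in examples:
        feat = ", ".join(f"{v:.2f}" for v in features)
        lines.append(f"Features: [{feat}] -> Label: {label}")
    feat = ", ".join(f"{v:.2f}" for v in query_features)
    lines.append(f"Features: [{feat}] -> Label:")
    return "\n".join(lines)

examples = [([0.91, 0.10], 1), ([0.12, 0.85], 0)]
prompt = build_glim_prompt(examples, [0.88, 0.15])
print(prompt.splitlines()[-1])  # → Features: [0.88, 0.15] -> Label:
# The prompt would then be sent to a pretrained LLM API (e.g., GPT-4o),
# whose single forward pass yields the classification.
```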
Strategic Transparency¶
A classical assumption in SC is that the classification rule is transparent to individuals. Because the LLM conditions its self-attention computation on contextual information in the prompt (e.g., "which features are more sensitive," "how the decision boundary is defined"), the LLM-based SC method likewise maintains strategic transparency.
Key Experimental Results¶
Datasets¶
- Large-scale: CISFraud (financial fraud detection), PhiUSIIL (phishing URL detection), Synthetic (PaySim transaction simulation)
- Small-scale: Adult (income prediction), Spam (spam email), Credit (credit scoring)
Main Results (Accuracy under Strategic Setting)¶
| Method | PhiUSIIL | CISFraud | Adult | Spam |
|---|---|---|---|---|
| Linear Model | 63.20% | 63.61% | 77.10% | 89.67% |
| MLP | 65.65% | 65.04% | 78.74% | 91.05% |
| GLIM (DeepSeek-V3) | 85.10% | 84.62% | 86.22% | 94.85% |
| GLIM (GPT-4o) | 86.50% | 86.89% | 91.35% | 95.97% |
| GLIM (Claude-3.7) | 85.07% | 84.98% | 88.58% | 94.50% |
Validation Results¶
- Inner-level validation: The cosine similarity between ICL-induced feature updates and gradient-descent updates converges to a consistently high value, with the L2 distance between them approaching zero.
- Outer-level validation: Cosine similarity in decision rule optimization gradually rises to approximately 0.95, with L2 distance stabilizing around 0.1.
- Loss curves: ICL and gradient-based methods exhibit similar cross-entropy descent trends; on large-scale data, GLIM's loss reduction even surpasses that of traditional methods.
- Scalability: Lightweight models show unstable performance as data volume grows, while GLIM maintains consistent scalability.
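For reference, the two alignment metrics used in these validations, as commonly defined, computed on illustrative (made-up) update vectors:

```python
import numpy as np

def cosine_similarity(u, v):
    # Alignment of the two update directions: 1.0 means identical direction.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def l2_distance(u, v):
    # Magnitude of the difference between the two update vectors.
    return float(np.linalg.norm(u - v))

gd_update = np.array([0.30, -0.10, 0.20])    # illustrative GD update
icl_update = np.array([0.29, -0.11, 0.21])   # illustrative near-identical ICL update
print(round(cosine_similarity(gd_update, icl_update), 3))  # → 0.999
print(round(l2_distance(gd_update, icl_update), 3))        # → 0.017
```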
Highlights & Insights¶
- Pioneering contribution: The first work to introduce LLMs with ICL into the strategic classification domain, bridging SC and LLM research.
- Theoretical rigor: Constructively proves that the forward pass of ICL is equivalent to gradient descent in the bi-level optimization of SC, providing complete theoretical analysis for both inner and outer levels.
- No fine-tuning required: Directly utilizes pretrained LLM APIs, avoiding the high cost of large-model fine-tuning and naturally suited to rapid adaptation in dynamic environments.
- Substantial performance gains: Surpasses traditional methods by 20+ percentage points on large-scale datasets, demonstrating strong scalability from small to large scale.
- Multi-model validation: Effectiveness verified across GPT-4o, Claude-3.7, DeepSeek-V3, Mixtral, Gemini, Qwen3, LLaMA, and other LLMs.
Limitations & Future Work¶
- Theory limited to the linear regime: The proofs of Propositions 1 and 2 are both grounded in linear self-attention and linear classifier assumptions; although experiments suggest effectiveness in nonlinear settings, rigorous theoretical guarantees for the nonlinear case are absent.
- API inference costs: While fine-tuning costs are avoided, LLM API call expenses and latency at scale remain practical deployment bottlenecks.
- Prompt design sensitivity: ICL performance is highly dependent on the selection and format of in-context examples; the paper does not thoroughly analyze the impact of prompt engineering.
- Single-round game assumption: Only a single-round Stackelberg game is considered; long-term strategic dynamics between individuals and decision-makers across multiple rounds are not explored.
- Privacy risks: Sending individual feature data as prompts to LLM APIs raises data privacy concerns, which the paper does not address.
Related Work & Insights¶
| Dimension | Traditional SC (Linear/MLP) | GLIM (Ours) |
|---|---|---|
| Model form | Linear model / shallow neural network | Pretrained LLM |
| Optimization | Explicit gradient descent | ICL implicit gradient |
| Retraining required | Yes (required after distribution shift) | No (forward inference only) |
| Large-scale data support | Poor (computationally infeasible) | Good (consistent scalability) |
| OOD generalization | Not supported | Supported |
| Nonlinear modeling | Supported via MLP | Natively supported |
Relationship to Performative Prediction: SC is a special case of performative prediction; future work may extend GLIM to the broader performative framework.
Broader Insights:
- ICL as an optimizer: Interpreting ICL's forward pass as implicit gradient descent provides a theoretical foundation for LLM-based alternatives to a wider range of traditional optimization problems.
- Game theory × LLMs: This paper opens a new direction for LLM applications in game theory and mechanism design, with close connections to work on auction mechanism design.
- Real-world security applications: Scenarios such as phishing website detection and financial fraud countermeasures can directly benefit from this method.
Rating¶
- Novelty: 9/10 (First use of LLM+ICL for strategic classification; the theoretical bridge is both novel and substantive.)
- Experimental Thoroughness: 8/10 (Multi-model, multi-dataset validation is comprehensive, but ablation studies and prompt sensitivity analysis are lacking.)
- Writing Quality: 8/10 (Theoretical derivations are clear, structure is well-organized, and details are sufficient.)
- Value: 8/10 (Opens a new research direction, though practical deployment is constrained by API costs and privacy concerns.)