Escaping Mode Collapse in LLM Generation via Geometric Regulation
Conference: ICML 2026
arXiv: 2605.00435
Code: None
Area: LLM Generation / Dynamical Systems / Decoding Control
Keywords: Mode Collapse, Geometric Collapse, Correlation Dimension, KV Cache Intervention, Low-Rank Damping
TL;DR
This work reinterprets "mode collapse" (repetition, looping, monotony) in long-form LLM generation, from a dynamical systems perspective, as "geometric collapse" of hidden-state trajectories in representation space. It proposes RMR, a lightweight low-rank damping of the Transformer value cache that suppresses the most persistent self-reinforcing directions, keeping generation stable and high-quality even in extremely low-entropy decoding regimes (0.8 nats/step).
Background & Motivation
Background: Failure in long-form decoding (repetition, looping, monotony) is a longstanding challenge for LLM deployment. Mainstream mitigations are all token-level: top-k/top-p/temperature sampling, repetition penalties, and locally typical sampling each modify the probability distribution of the next token.
Limitations of Prior Work: These approaches are essentially local, symbol-level patches. Under low temperature or low entropy targets (e.g., temperature 0.5, entropy target 1.0), models still tend to fall into loops; token-level heuristics only suppress symptoms, do not explain why loops occur systematically, and offer no controllable knob for long-range dynamics.
Key Challenge: Mode collapse is not "the probability of a certain token is wrong," but "the entire generation process slides down a narrow path." An inherently trajectory-level, long-range problem is being attacked with per-token, local tools, which are structurally inadequate.
Goal: (1) Establish a geometric metric that directly characterizes long-range collapse; (2) Design a lightweight method that intervenes on internal states rather than rewriting the next-token probability distribution.
Key Insight: Treat autoregressive decoding as a random trajectory in high-dimensional state space (states are KV cache or next-token log-prob vectors). Mode collapse ↔ trajectory gets trapped in a low-dimensional "quasi-attractor," i.e., "state space reachability collapse."
Core Idea: Use correlation dimension to quantify "reachability"; when strong self-reinforcing low-rank directions are detected (analogous to order parameters in Ising model phase transitions), apply low-rank damping on the value cache to slightly attenuate these directions, thereby restoring the trajectory's ability to explore the full space.
Method
Overall Architecture
The method has two layers. The first is diagnosis. The authors use a two-dimensional state-dependent IFS (Iterated Function System) as a minimal dynamical model and prove that when the temperature (inverse temperature \(\beta\)) crosses a critical value \(\beta_0\), the system splits from a single ergodic invariant measure into two stable attractor basins; this is the geometric counterpart of mode collapse. They then measure the finite-time correlation dimension \(d_t\) (based on the scaling law \(C_t(\varepsilon)\propto\varepsilon^{d}\)) online during real LLM decoding, using the sequence of next-token log-prob vectors as the trajectory. Experiments show that \(d_t\) drops significantly around the onset of loops and is more robust than token-level entropy or Distinct-n.
The second layer is the intervention, RMR (Reinforced Mode Regulation). At intervals during decoding, a bounded-spectrum generalized eigenvalue problem is solved on the most recent segment of the value cache to identify low-rank subspaces with abnormally strong temporal persistence. Low-rank damping is then applied to the value cache; this is the high-dimensional generalization of the \((1-\eta)\) shrinkage applied to the historical mean \(m_t\) in the minimal model. The procedure never edits logits or softmax probabilities directly; it is a pure state-space intervention.
Key Designs
- Correlation Dimension as "Geometric Collapse" Probe:
    - Function: Real-time estimation of the effective dimension of internal trajectories during decoding, serving as an early warning and evaluation metric for mode collapse.
    - Mechanism: For a trajectory \(\{x_t\}\), compute the correlation sum \(C_t(\varepsilon)=\frac{2}{t(t-1)}\sum_{i<j\le t}\mathbf{1}(\|x_i-x_j\|<\varepsilon)\); the slope of \(\log C_t(\varepsilon)\) against \(\log\varepsilon\) gives \(d_t\). Recomputing every pair costs \(O(t^2)\) per step, whereas the online update needs only the \(t\) distances to the new point, i.e. \(O(t)\) per step: \(C_{t+1}(\varepsilon)=\frac{t-1}{t+1}C_t(\varepsilon)+\frac{2}{t(t+1)}\sum_{i=1}^{t}\mathbf{1}(\|x_i-x_{t+1}\|<\varepsilon)\).
    - Design Motivation: Traditional entropy/Distinct-n are "token-level" random variables with high variance per trajectory and hard-to-set thresholds; correlation dimension is a "trajectory-level" geometric invariant that directly captures "trajectory trapping" and aligns naturally with the subsequent intervention target.
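The online correlation-sum update can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the authors' implementation: the class name `OnlineCorrelationSum`, the fixed radius grid, and the plain least-squares slope are all assumptions made here. Keeping the raw count of close pairs and dividing by \(t(t-1)/2\) on demand is algebraically the same as the recursion for \(C_{t+1}(\varepsilon)\).

```python
import math
import random


class OnlineCorrelationSum:
    """Tracks C_t(eps) on a fixed grid of radii as trajectory points arrive.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """

    def __init__(self, radii):
        self.radii = list(radii)
        self.points = []                          # trajectory x_1 .. x_t
        self.pair_counts = [0] * len(self.radii)  # close pairs per radius

    def add(self, x):
        # O(t) work per step: only distances to the new point are computed.
        for p in self.points:
            d = math.dist(p, x)
            for k, eps in enumerate(self.radii):
                if d < eps:
                    self.pair_counts[k] += 1
        self.points.append(tuple(x))

    def correlation_sums(self):
        t = len(self.points)
        if t < 2:
            return [0.0] * len(self.radii)
        total_pairs = t * (t - 1) / 2
        return [c / total_pairs for c in self.pair_counts]

    def dimension(self):
        """Least-squares slope of log C_t(eps) versus log eps."""
        pts = [(math.log(e), math.log(c))
               for e, c in zip(self.radii, self.correlation_sums()) if c > 0]
        if len(pts) < 2:
            return float("nan")
        mx = sum(a for a, _ in pts) / len(pts)
        my = sum(b for _, b in pts) / len(pts)
        sxx = sum((a - mx) ** 2 for a, _ in pts)
        sxy = sum((a - mx) * (b - my) for a, b in pts)
        return sxy / sxx


# Toy check: points confined to a line versus points filling a cube.
random.seed(0)
line = OnlineCorrelationSum([0.05, 0.1, 0.2])
cube = OnlineCorrelationSum([0.05, 0.1, 0.2])
for _ in range(400):
    line.add((random.random(), 0.0, 0.0))
    cube.add((random.random(), random.random(), random.random()))
print("line:", line.dimension(), "cube:", cube.dimension())
```

On the toy data the trajectory confined to a line yields a clearly lower slope than the one filling the cube, which is the kind of gap the probe relies on. In real decoding the points would be next-token log-prob vectors, and the radius grid would need tuning to the scale of those vectors.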
- Persistent Direction Detection (Bounded-Spectrum Generalized Eigenvalue Problem):
    - Function: Locate the few low-rank directions in the high-dimensional value cache that are most self-reinforcing and slowest to dissipate. These are the high-dimensional analogs of \(m_t\) and are the ones to suppress.
    - Mechanism: On a sliding window of the value cache matrix, construct two covariance-like matrices (instantaneous vs. historical average) and solve for generalized eigenvectors. To avoid numerical blow-up, the problem is posed in a bounded-spectrum form whose eigenvalues \(\lambda\in[0,1]\) represent "persistence strength." Principled thresholding (selecting only the most significant directions, far from the background spectrum) avoids harming normal semantic directions.
    - Design Motivation: Damping all dimensions would harm language quality; suppressing only the few most persistent directions can break loop traps with minimal disruption, echoing the insight from the minimal model in Section 3.2 that even weak damping with \(\eta=10^{-4}\) suffices to restore reachability.
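One way to picture the bounded-spectrum detection step is the NumPy sketch below. The summary does not give the paper's exact matrices, so everything specific here is an assumption: the "historical average" is taken to be an exponential moving average of the value vectors, the "instantaneous" part is the residual around it, and the function name `persistent_directions` and its defaults (`alpha`, `threshold`, `max_rank`, `ridge`) are hypothetical. Solving \(A u = \lambda (A + B) u\) with both matrices positive semidefinite keeps \(\lambda\) in \([0,1]\), which matches the bounded-spectrum idea.

```python
import numpy as np


def persistent_directions(V, alpha=0.1, threshold=0.9, max_rank=8, ridge=1e-6):
    """Return (lams, U): persistence scores in [0, 1] and orthonormal directions.

    V is a (T, d) window of value vectors. Splitting it into a slow part
    (exponential moving average) and a fast residual is an assumption of
    this sketch, not the paper's stated construction.
    """
    V = np.asarray(V, dtype=float)
    T, d = V.shape
    slow = np.empty_like(V)
    m = V[0].copy()
    for t in range(T):
        m = (1.0 - alpha) * m + alpha * V[t]   # running historical average
        slow[t] = m
    fast = V - slow                            # instantaneous fluctuation

    A = slow.T @ slow / T                      # "persistent" second moment
    B = fast.T @ fast / T                      # "transient" second moment
    total = A + B + ridge * np.eye(d)

    # Whitening by total^(-1/2) turns A u = lam * total u into an ordinary
    # symmetric eigenproblem whose eigenvalues lie in [0, 1].
    w, Q = np.linalg.eigh(total)
    W = Q @ np.diag(w ** -0.5) @ Q.T
    lam, Z = np.linalg.eigh(W @ A @ W)
    order = np.argsort(lam)[::-1]
    lam, Z = np.clip(lam[order], 0.0, 1.0), Z[:, order]

    keep = int(min(max_rank, np.sum(lam > threshold)))
    U = W @ Z[:, :keep]
    if keep > 0:
        U, _ = np.linalg.qr(U)                 # orthonormal basis of the subspace
    return lam[:keep], U


# Toy window: one coordinate carries a constant offset, the rest is noise.
rng = np.random.default_rng(0)
V = rng.normal(size=(200, 6))
V[:, 0] = 3.0 + 0.1 * rng.normal(size=200)
lams, U = persistent_directions(V)
print(lams, U.shape)
```

The fixed `threshold` stands in for the "principled thresholding" mentioned above; a real implementation would presumably compare each eigenvalue against the bulk of the spectrum rather than a constant.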
- Value Cache Low-Rank Damping Update (RMR):
    - Function: Subtract a small portion of the selected directions from the value cache in low-rank form as an inference-time intervention.
    - Mechanism: Construct a low-rank projection \(P=\sum_i u_i u_i^\top\) and update the value cache as \(V \leftarrow V - \eta\, V P\), the high-dimensional analog of \(m_t\leftarrow(1-\eta)m_t\). For orthonormal \(u_i\), components of \(V\) along the selected directions are scaled by \((1-\eta)\) and everything orthogonal to them is left unchanged. This operation only adds a small matrix multiplication, with overhead comparable to or less than a single attention operation.
    - Design Motivation: As this is a state intervention on the value cache, it does not affect the analytic form of token probability distributions and can be orthogonally combined with any sampler (top-p, temperature, contrastive decoding, etc.), making it deployment-friendly.
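The damping update \(V \leftarrow V - \eta\, V P\) is small enough to show directly. The NumPy sketch below assumes a single-head cache of shape `(T, d)` and a matrix `U` of shape `(d, r)` with orthonormal columns; the function name `damp_value_cache` is hypothetical, and a real cache would typically carry extra batch and head axes.

```python
import numpy as np


def damp_value_cache(V, U, eta=1e-2):
    """Apply V <- V - eta * V P with P = U U^T (U has orthonormal columns).

    Two thin matmuls, (T, d) @ (d, r) then (T, r) @ (r, d), so the d x d
    projector P is never materialised.
    """
    return V - eta * (V @ U) @ U.T


# Toy example: damp the first coordinate direction only.
rng = np.random.default_rng(0)
V = rng.normal(size=(5, 4))
U = np.zeros((4, 1))
U[0, 0] = 1.0
V_damped = damp_value_cache(V, U, eta=0.01)
print(np.allclose(V_damped[:, 0], 0.99 * V[:, 0]))  # component along U shrinks
print(np.allclose(V_damped[:, 1:], V[:, 1:]))       # orthogonal part untouched
```

Because the update only shrinks components inside the span of `U`, applying it \(k\) times scales those components by \((1-\eta)^k\), which is why a small \(\eta\) applied repeatedly can still erode a persistent direction.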
Loss & Training
RMR is an inference-time method, requiring no training, no fine-tuning, and no reward model. \(\eta\) and target low rank \(r\) are the only two hyperparameters; the authors recommend \(\eta\in[10^{-3},10^{-2}]\), \(r\in\{2,4,8\}\), which work for most models.
Key Experimental Results
Main Results
The authors test on multiple open-source LLMs (including Qwen3-4B-Base) using both "temperature-locked" and "entropy-locked" decoding protocols. The core metric is "non-collapse rate" (the proportion of samples without explicit loops in long-form generation).
| Decoding Setting | Baseline non-collapse | RMR non-collapse | Note |
|---|---|---|---|
| Temperature = 0.7 | 8% | 56% | Substantial improvement |
| Entropy target = 1.0 nats/step | 5% | 33% | Baseline collapses almost always in the low-entropy region |
| Entropy target ≈ 2.0 nats/step | Near saturation | Near saturation | Gap narrows at high entropy |
| Entropy target = 0.8 nats/step | Almost 0 | Still usable | RMR opens a new usable low-entropy region |
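The summary does not spell out how the "entropy-locked" protocol is implemented. One plausible reading, sketched below as an assumption, is to choose a temperature at each step so that the next-token distribution hits the target entropy in nats. Entropy of `softmax(logits / T)` is non-decreasing in `T`, so bisection works; the function names and search bounds are hypothetical.

```python
import math


def softmax_entropy(logits, temperature):
    """Shannon entropy (nats) of softmax(logits / temperature)."""
    scaled = [z / temperature for z in logits]
    mx = max(scaled)
    exps = [math.exp(s - mx) for s in scaled]   # shift for numerical stability
    z_sum = sum(exps)
    return -sum((e / z_sum) * math.log(e / z_sum) for e in exps if e > 0.0)


def temperature_for_entropy(logits, target_nats, lo=1e-3, hi=1e3, iters=60):
    """Bisection on log-temperature until the entropy matches target_nats.

    The target must lie between the entropy at `lo` and the entropy at `hi`;
    it can never exceed log(len(logits)).
    """
    for _ in range(iters):
        mid = math.sqrt(lo * hi)                # geometric midpoint
        if softmax_entropy(logits, mid) < target_nats:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)


logits = [2.0, 1.0, 0.5, 0.0, -1.0, -3.0]
for target in (0.8, 1.0):
    t = temperature_for_entropy(logits, target)
    print(target, t, softmax_entropy(logits, t))
```

Under this reading, the rows of the table differ only in the target passed to the solver, while RMR acts on the value cache independently of the temperature chosen.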
Ablation Study
| Configuration | Non-collapse Performance | Description |
|---|---|---|
| RMR full | Significant recovery | Detection + low-rank damping |
| Detection only, no damping | Comparable to baseline | Confirms "intervention" is necessary; diagnosis alone is ineffective |
| Full-dimension damping (not low-rank) | Text quality degrades | Shows the value of "minimal necessary intervention" |
| Token-level repetition penalty only | Limited improvement | Confirms symbolic methods fail in low-temperature regimes |
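The non-collapse rate reported in both tables is the fraction of samples without explicit loops. The paper's loop criterion is not given in this summary, so the detector below is purely a stand-in: it flags a sample when its tail is some short block repeated several times back to back. The names `has_explicit_loop` and `non_collapse_rate` and the defaults `max_period` and `min_repeats` are assumptions.

```python
def has_explicit_loop(tokens, max_period=50, min_repeats=4):
    """True if the sequence ends in a block of length <= max_period that is
    repeated at least min_repeats times back to back.

    Hypothetical criterion for illustration; only loops at the tail count.
    """
    n = len(tokens)
    for period in range(1, max_period + 1):
        span = period * min_repeats
        if span > n:
            break
        tail = tokens[n - span:]
        if all(tail[i] == tail[i % period] for i in range(span)):
            return True
    return False


def non_collapse_rate(samples, **detector_kwargs):
    """Fraction of samples for which no explicit loop is detected."""
    if not samples:
        raise ValueError("need at least one sample")
    ok = sum(not has_explicit_loop(s, **detector_kwargs) for s in samples)
    return ok / len(samples)


samples = [
    [7, 8, 9] * 5,            # ends in a period-3 loop
    list(range(40)),          # no repetition
    [1, 2, 3, 4] + [5] * 6,   # ends in a period-1 loop
    list(range(100, 130)),    # no repetition
]
print(non_collapse_rate(samples))
```

A stricter or looser detector changes the absolute percentages, so comparisons between baseline and RMR are only meaningful when both are scored with the same criterion.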
Key Findings
- Correlation dimension \(d_t\) drops significantly before explicit loops appear, so it serves as an early warning, and it is much more sensitive than entropy or Distinct-n.
- "Persistent directions" are very sparse (usually < 8 dimensions), confirming the intuition from the minimal model that the order parameter is low-dimensional, and explaining why low-rank damping suffices.
- RMR extends the usable decoding regime from ~2.0 nats/step down to ~0.8 nats/step, unlocking a "high certainty + high diversity" operating region that was previously unusable because of looping.
Highlights & Insights
- Interdisciplinary Analogy: Connects LLM decoding with nonequilibrium statistical physics (Ising phase transitions, slow variables, self-organization). The correspondence between correlation dimension and order parameter is elegant, and this "trajectory geometry" perspective is closer to the essence of the problem than looking at token probabilities alone.
- Diagnosis–Intervention Loop: Correlation dimension quantifies "reachability collapse," then low-rank damping targets and resolves it; the entire chain is self-consistent, with the method derived from theory rather than ad hoc.
- Transferable Trick: "Low-rank/low-overhead intervention on the value cache" could apply to other long-range issues (hallucination drift, chain-of-thought collapse, agent repeatedly invoking the same tool)—all are "trajectory traps" in high-dimensional latent space.
Limitations & Future Work
- Experiments mainly focus on open-ended text generation and the Qwen3 series, without sufficient coverage of reasoning/agent/code or other highly structured tasks; the assumption that "most persistent direction = unwanted direction" may not hold in structured tasks.
- Correlation dimension estimation is sensitive to window length; the online algorithm still relies on empirical thresholds \(\varepsilon_0,\varepsilon_1\); automatic threshold selection is a potential improvement.
- RMR is currently a "post hoc intervention"; feeding persistent direction detection signals back into training objectives (e.g., adding geometric terms to RLHF rewards) is an obvious next step.
Related Work & Insights
- vs Locally Typical Sampling / top-p: They modify probabilities; this work modifies states. Orthogonal and can be combined.
- vs activation steering (Zou 2023 / Turner 2023): Also an inference-time intervention on internal states, but RMR's directions come from "temporal persistence" rather than task vectors; the goal is to stabilize dynamics, not control semantics.
- vs existing repetition penalty: Removes the need for engineering patches such as an N-gram history window; the mechanism is more general.
Rating
- Novelty: ⭐⭐⭐⭐⭐ Reframes mode collapse in the language of dynamical systems and phase transitions, and supplies a computable geometric quantity with a matching intervention; the framework is coherent
- Experimental Thoroughness: ⭐⭐⭐⭐ Comprehensive comparisons across multiple models and decoding protocols, but no coverage of long-range reasoning or agent tasks
- Writing Quality: ⭐⭐⭐⭐ The theoretical narrative is clear and moves smoothly from the minimal model to intervention on real LLMs
- Value: ⭐⭐⭐⭐ Opens a new low-entropy decoding regime at almost no cost, with minimal deployment friction and significant engineering value