
Generalization of Diffusion Models Arises with a Balanced Representation Space

Conference: ICLR 2026 | arXiv: 2512.20963 | Area: Image Generation / Diffusion Model Theory

TL;DR

This paper represents a significant advance in the theory of diffusion model generalization. By analyzing the optimal solutions of two-layer nonlinear ReLU denoising autoencoders (DAEs), it provides a unified characterization of both memorization and generalization, and introduces a novel representation-centric perspective on generalization. The theoretical findings are consistently validated on EDM, DiT, and Stable Diffusion v1.4, and give rise to two practical applications: memorization detection and controllable editing. The work achieves both theoretical depth and practical utility.

Rating

⭐⭐⭐⭐⭐



Background & Motivation

State of the Field

Diffusion models have become the dominant class of generative models, with representative systems such as Stable Diffusion, Flux, and Veo achieving unprecedented scalability, controllability, and fidelity through iterative denoising. Recent research has revealed that diffusion models not only learn distributions but also learn meaningful representations, suggesting a deep duality between distribution learning and representation learning.

Limitations of Prior Work

Theoretically, the closed-form solution of the standard training objective (denoising score matching) reduces to mere memorization of training samples; yet in practice, models consistently generate novel and diverse outputs. This substantial gap between theoretical expectation and observed behavior constitutes a central open problem in understanding diffusion models, with direct implications for privacy, interpretability, and trustworthy deployment.

Root Cause

Existing theoretical approaches each carry significant limitations: random feature models oversimplify the architecture; linear model analyses can characterize generalization but fail to capture memorization; manually constructed closed-form solutions simulate specific behaviors but remain phenomenological and fragmented. A unified mathematical framework capable of simultaneously explaining both memorization and generalization is lacking.

Paper Goals

By analyzing the optimal solutions of two-layer nonlinear ReLU denoising autoencoders (DAEs), this work establishes a unified mathematical framework: (i) when data is locally sparse, weights store individual samples, leading to memorization; (ii) when data is locally abundant, weights capture data statistics, enabling generalization. The key innovation lies in a representation-space perspective: memorized samples exhibit spiky representations, while generalized samples exhibit balanced representations.



Method

Overall Architecture

The paper considers a two-layer ReLU DAE \(\boldsymbol{f}_{\boldsymbol{W}_2, \boldsymbol{W}_1}(\boldsymbol{x}) = \boldsymbol{W}_2 [\boldsymbol{W}_1^\top \boldsymbol{x}]_+\) with training objective:

\[\min_{\boldsymbol{W}_2, \boldsymbol{W}_1} \frac{1}{n} \sum_{i=1}^{n} \mathbb{E}_{\boldsymbol{\epsilon}} \left[ \| \boldsymbol{f}(\boldsymbol{x}_i + \sigma \boldsymbol{\epsilon}) - \boldsymbol{x}_i \|_2^2 \right] + \lambda \sum_{l=1}^{2} \| \boldsymbol{W}_l \|_F^2\]
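To make the setup concrete, here is a minimal PyTorch sketch of this architecture and objective. The dimensions, noise level \(\sigma\), regularization \(\lambda\), and training loop are illustrative placeholders rather than the paper's settings, and the expectation over \(\boldsymbol{\epsilon}\) is approximated with a single noise draw per step.

```python
import torch
import torch.nn as nn

class TwoLayerReluDAE(nn.Module):
    """f(x) = W2 [W1^T x]_+, the two-layer ReLU DAE analyzed in the paper."""
    def __init__(self, d: int, p: int):
        super().__init__()
        self.W1 = nn.Parameter(0.01 * torch.randn(d, p))  # d x p
        self.W2 = nn.Parameter(0.01 * torch.randn(d, p))  # d x p

    def forward(self, x):                  # x: (batch, d)
        h = torch.relu(x @ self.W1)        # hidden representation [W1^T x]_+
        return h @ self.W2.T, h            # reconstruction and representation

def dae_loss(model, x, sigma=0.2, lam=1e-4):
    """Denoising objective with weight decay; the expectation over the noise
    is approximated by a single draw per call."""
    recon, _ = model(x + sigma * torch.randn_like(x))
    mse = ((recon - x) ** 2).sum(dim=1).mean()
    reg = lam * (model.W1.pow(2).sum() + model.W2.pow(2).sum())
    return mse + reg

if __name__ == "__main__":
    d, n, p = 64, 256, 32                  # illustrative sizes
    X = torch.randn(n, d)
    model = TwoLayerReluDAE(d, p)
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(2000):
        opt.zero_grad()
        loss = dae_loss(model, X)
        loss.backward()
        opt.step()
    print("final loss:", float(loss))
```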

The core theorem (Theorem 3.1) establishes that, under \((\alpha, \beta)\)-separability conditions, local minima of the DAE exhibit a block structure, where each block corresponds to a data cluster whose internal structure is determined by the eigendecomposition of the cluster's Gram matrix.

Key Design 1: Memorization — Sample Storage under Overparameterization

Corollary 3.2: When \(p \geq n\) (number of hidden units ≥ number of samples), each training sample is treated as an independent cluster, and the weights directly store the original data points:

\[\boldsymbol{W}_\text{mem} = (r_1 \boldsymbol{x}_1 \cdots r_n \boldsymbol{x}_n \boldsymbol{0} \cdots \boldsymbol{0}), \quad r_i = \sqrt{\frac{\| \boldsymbol{x}_i \|_2^2 - n\lambda}{\| \boldsymbol{x}_i \|_4^4 + \sigma^2 \| \boldsymbol{x}_i \|_2^2}}\]

Representation characteristics: The representation of a memorized sample \(\boldsymbol{x}_i\) approximates a one-hot vector:

\[\boldsymbol{h}_\text{mem}(\boldsymbol{x}_i + \sigma \boldsymbol{\epsilon}) \approx (0, \ldots, 0, r_i \boldsymbol{x}_i^\top(\boldsymbol{x}_i + \sigma \boldsymbol{\epsilon}), 0, \ldots, 0)\]

Since \(\boldsymbol{x}_i\) is negatively correlated with other stored samples, only a single neuron is strongly activated, producing a spiky representation.
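A toy NumPy sketch (not the paper's code) that instantiates Corollary 3.2: it builds \(\boldsymbol{W}_\text{mem}\) from a few random, roughly orthogonal samples using the stated scaling \(r_i\), then checks that the hidden representation of a noisy training point concentrates on the neuron storing that sample. All constants are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, p, sigma, lam = 512, 8, 16, 0.1, 1e-3   # illustrative constants

# A few roughly orthogonal, unit-scale samples (rows of X).
X = rng.standard_normal((n, d)) / np.sqrt(d)

# Scaling r_i from Corollary 3.2.
r = np.sqrt((np.sum(X**2, axis=1) - n * lam) /
            (np.sum(X**4, axis=1) + sigma**2 * np.sum(X**2, axis=1)))

# W_mem stores each sample in its own column, padded with zero columns.
W_mem = np.zeros((d, p))
W_mem[:, :n] = (r[:, None] * X).T

# Hidden representation of a noisy copy of x_0.
x_noisy = X[0] + sigma * rng.standard_normal(d)
h = np.maximum(W_mem.T @ x_noisy, 0.0)

print("dominant neuron:", int(np.argmax(h)))              # expected: 0, the column storing x_0
print("spikiness (max / sum):", round(float(h.max() / h.sum()), 3))
```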

Key Design 2: Generalization — Statistical Learning under Underparameterization

Corollary 3.3: When \(p \ll n\), each weight block learns the principal components of the corresponding Gaussian mode:

\[\boldsymbol{W}_{\boldsymbol{X}_k} \boldsymbol{W}_{\boldsymbol{X}_k}^\top \to \left[ (\boldsymbol{S}_k - \frac{\lambda}{\rho_k} \boldsymbol{I})(\boldsymbol{S}_k + \sigma^2 \boldsymbol{I})^{-1} \right]_{\text{rank-}p_k}\]

where \(\boldsymbol{S}_k = \boldsymbol{\mu}_k \boldsymbol{\mu}_k^\top + \boldsymbol{\Sigma}_k\) denotes the second-order statistics of the \(k\)-th mode.

Representation characteristics: The energy of a generalized sample's representation is distributed across \(p_k\) coordinates within the active block, forming a balanced representation — multiple neurons are activated, encoding the distributional statistics.
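A hedged numerical sketch of the prediction in Corollary 3.3 for a single Gaussian mode: form \(\boldsymbol{S}_k\) from an assumed mean and covariance, apply the stated shrinkage, and keep the top-\(p_k\) eigendirections. The mode weight \(\rho_k\) is not defined in this summary, so it is treated here as an assumed constant.

```python
import numpy as np

rng = np.random.default_rng(1)
d, p_k, sigma, lam = 32, 4, 0.2, 1e-3
rho_k = 0.25   # weight of mode k in the mixture (assumed; not defined above)

# Toy second-order statistics S_k = mu mu^T + Sigma for one Gaussian mode.
mu = rng.standard_normal(d) / np.sqrt(d)
A = rng.standard_normal((d, d)) / np.sqrt(d)
Sigma = A @ A.T                                            # PSD covariance
S = np.outer(mu, mu) + Sigma

# M = (S - (lam / rho_k) I)(S + sigma^2 I)^{-1}; both factors are functions of S,
# so they commute and M is symmetric up to floating-point error.
M = (S - (lam / rho_k) * np.eye(d)) @ np.linalg.inv(S + sigma**2 * np.eye(d))

# Rank-p_k truncation: keep the top-p_k eigendirections of M.
eigvals, eigvecs = np.linalg.eigh((M + M.T) / 2)
idx = np.argsort(eigvals)[::-1][:p_k]
WWT = eigvecs[:, idx] @ np.diag(eigvals[idx]) @ eigvecs[:, idx].T

print("rank of predicted W_k W_k^T:", np.linalg.matrix_rank(WWT, tol=1e-8))
print("kept eigenvalues:", np.round(eigvals[idx], 3))
```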

Key Design 3: Mixed Regime and Practical Applications

Corollary 3.4: When training data contains duplicate samples, the model simultaneously memorizes the repeated subset and generalizes over the non-degenerate subset, with weights exhibiting a mixed structure.

Based on these theoretical findings, the paper proposes two applications (a minimal sketch follows this list):

- Memorization detection: the standard deviation of a sample's representation serves as a proxy for spikiness; high variance indicates memorization, low variance indicates generalization.
- Representation-guided editing: adding the mean representation of a target style or concept in representation space yields smooth editing for generalized samples, whereas memorized samples exhibit brittle, threshold-like responses.
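The sketch below illustrates both applications under loose assumptions: `h` is whatever intermediate activation one extracts from the denoiser (the hook is not shown), the per-sample standard deviation serves as the spikiness score, and editing adds a precomputed concept-mean representation. The names and constants are hypothetical, not the paper's implementation.

```python
import torch

def memorization_score(h: torch.Tensor) -> torch.Tensor:
    """Spikiness proxy: per-sample standard deviation of the representation.

    h: (batch, ...) intermediate activations for a batch of noisy inputs.
    A spiky (near one-hot) representation has high variance (memorization);
    a balanced representation has low variance (generalization)."""
    return h.flatten(start_dim=1).std(dim=1)

def edit_representation(h: torch.Tensor, concept_mean: torch.Tensor,
                        strength: float = 1.0) -> torch.Tensor:
    """Representation-guided editing: shift h toward a target concept by adding
    that concept's (precomputed) mean representation, scaled by `strength`."""
    return h + strength * concept_mean

if __name__ == "__main__":
    # Random stand-ins for real activations.
    h_generalized = 0.1 * torch.randn(4, 1024)     # balanced: energy spread out
    h_memorized = torch.zeros(4, 1024)
    h_memorized[:, 0] = 5.0                        # spiky: one dominant coordinate
    print("scores (generalized):", memorization_score(h_generalized))
    print("scores (memorized):  ", memorization_score(h_memorized))

    concept_mean = 0.05 * torch.randn(1024)        # stand-in for a concept's mean representation
    h_edited = edit_representation(h_generalized, concept_mean, strength=0.5)
    print("edited norms:", h_edited.norm(dim=1))
```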


Key Experimental Results

Main Results: Memorization Detection

Memorization detection performance is evaluated on three dataset–model pairs:

| Method | Prompt-free | LAION AUC↑ | LAION TPR↑ | ImageNet AUC↑ | CIFAR10 AUC↑ | Avg. Time↓ |
|---|---|---|---|---|---|---|
| Carlini et al. | — | 0.498 | 0.020 | N/A | N/A | 3.724s |
| Wen et al. | — | 0.986 | 0.961 | N/A | N/A | 0.134s |
| Hintersdorf et al. | — | 0.957 | 0.500 | N/A | N/A | 0.009s |
| Ross et al. | — | 0.956 | 0.915 | 0.971 | 0.713 | 0.545s |
| Ours | ✓ | 0.987 | 0.961 | 0.995 | 0.998 | 0.067s |

The proposed method is the first to be simultaneously prompt-free and representation-based, achieving the highest AUC across all three datasets while being substantially more efficient than geometry-based approaches.

Ablation Study: Theoretical Validation

| Validation Dimension | Condition | Finding |
|---|---|---|
| Memorization weight structure | Trained on 5 CelebA images | Weight columns store scaled original images, consistent with Corollary 3.2 |
| Generalization weight structure | Trained on 10,000 CelebA images | Weights capture principal components of the data, consistent with Corollary 3.3 |
| Noise robustness | \(\sigma = 0.2, 1, 5\) | Block structure persists under large noise levels |
| Optimizer robustness | Adam, AdamW, RMSProp | Different optimizers converge to the same sparse structure |
| Jacobian of real models | EDM, SD1.4, DiT | Memorized samples exhibit extremely low-rank Jacobians; generalized samples reflect data statistics (see the sketch below) |
| Representation-guided editing | SD1.4 | Generalized samples support smooth, progressive editing; memorized samples show brittle threshold responses |
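The Jacobian check in the table can be reproduced in spirit with autograd, as sketched below; a random two-layer ReLU map stands in for the real denoisers (EDM, SD1.4, DiT), so the printed spectrum is purely illustrative.

```python
import torch

def jacobian_singular_values(denoiser, x: torch.Tensor) -> torch.Tensor:
    """Singular values of the denoiser's Jacobian at a flat input x.

    The paper reports extremely low-rank Jacobians at memorized samples and
    spectra reflecting local data statistics at generalized samples."""
    J = torch.autograd.functional.jacobian(denoiser, x)   # (d_out, d_in)
    return torch.linalg.svdvals(J)

if __name__ == "__main__":
    d, p = 64, 8
    W1 = torch.randn(d, p) / d**0.5
    W2 = torch.randn(d, p) / d**0.5
    denoiser = lambda x: torch.relu(x @ W1) @ W2.T        # toy stand-in denoiser
    x = torch.randn(d)
    s = jacobian_singular_values(denoiser, x)
    print("top singular values:", s[:8])
    print("effective rank:", int((s > 1e-6 * s[0]).sum()))
```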

Strengths & Weaknesses

Strengths:

- Provides a unified characterization of memorization and generalization under a nonlinear ReLU setting, surpassing prior linear and random-feature analyses.
- The representation-space perspective is a pioneering contribution, establishing a rigorous correspondence between representation structure and generative behavior.
- Theoretical predictions are consistently validated on real-world models including EDM, DiT, and SD1.4.
- Gives rise to a practical tool: efficient prompt-free memorization detection (AUROC > 0.98).
- Jacobian SVD analysis confirms that the two-layer ReLU DAE serves as an effective local approximation of real models.

Weaknesses:

- The theoretical analysis is restricted to two-layer ReLU networks, leaving a substantial gap with practical deep architectures (U-Net, DiT).
- The separability assumption (\(\beta < 0\)) may not hold for real high-dimensional data.
- The representation-guided editing method is relatively basic and lacks systematic comparison with existing editing approaches.
- The Gaussian mixture assumption is a coarse approximation of real data manifolds.

Highlights & Insights

  • The core mechanism is concise and effective: a single analysis of two-layer ReLU DAE optima yields both the memorization and the generalization regimes.
  • Experimental validation is comprehensive, with ablations over weight structure, noise level, optimizer, and Jacobian spectra on EDM, DiT, and SD1.4.
  • The representation-space view offers a new handle on the open question of why diffusion models generalize rather than memorize.

Limitations & Future Work

  • The theoretical guarantees cover only two-layer ReLU networks and Gaussian-mixture-style data; how far the conclusions transfer to deep architectures and real data manifolds warrants further investigation.
  • Computational efficiency and scalability of the representation-based detector to larger models and datasets can be studied further.
  • Integration with a broader range of editing and detection methods, including systematic comparisons, deserves exploration.
  • vs. representative methods in the field: the prompt-free, representation-based detector and the representation-centric theory are complementary to existing prompt-based and geometry-based approaches.
  • vs. prior theoretical analyses: unlike linear and random-feature models, the framework characterizes memorization and generalization within a single nonlinear model.
  • Insights: the spiky-versus-balanced representation perspective offers a concrete, testable lens for subsequent work on memorization, privacy, and controllable generation in diffusion models.