Weak Diffusion Priors Can Still Achieve Strong Inverse-Problem Performance¶

Conference: ICML 2026 Spotlight
arXiv: 2601.22443
Code: TBD
Area: Image Generation / Diffusion Models / Inverse Problem Solving
Keywords: Diffusion Priors, Inverse Problems, Weak Priors, Bayesian Inference, Latent Noise Optimization

TL;DR¶

The paper shows that diffusion priors with low sample fidelity or domain mismatch can still perform robustly on information-rich inverse problems. It explains this seemingly paradoxical result through Bayesian consistency theory and local correlation analysis, and gives explicit conditions under which weak priors remain effective.

Background & Motivation¶

Background: Diffusion models are widely used as priors for solving inverse problems due to their powerful generative capabilities. Standard practice involves using "full-strength" high-fidelity diffusion models (e.g., 1000-step DDPM) where the training data matches the target task.

Limitations of Prior Work: Ideally suited priors are often unavailable in practical applications. Memory constraints force researchers to use DDIM samplers with only 3-4 steps, and data-scarce fields like medical imaging cannot train domain-specific models. Theoretically, these "weak priors" should limit reconstruction quality.

Key Challenge: Experimentally, weak priors often perform comparably to or even better than full-strength priors. For instance, Wang et al. achieved a 22-66 dB PSNR gain using a 3-step DDIM for inverse problems, and Jalal et al. reconstructed knee MRIs using a single-mode brain MRI model. Currently, these successes are mostly anecdotal and lack systematic theoretical explanation.

Goal: To answer two questions: (1) Under what conditions are inverse problems robust to the choice of prior? (2) Are weak priors truly as "weak" as their sample quality suggests?

Key Insight: In high-dimensional measurement settings, data information can outweigh prior constraints. While weak priors exhibit poor sample quality, they retain local spatial structures similar to strong priors.

Core Idea: Use Bayesian consistency theory to characterize posterior concentration when measurement information is abundant, and use local correlation diagnostics to show that weak and strong priors share similar local statistical structures.

Method¶

Overall Architecture¶

The paper solves inverse problems under the measurement model \(y = \mathcal{A}(x) + \epsilon\) using a latent noise optimization approach. It treats the generative model \(G\) as a black box without modifying its internal sampling chain. Instead, it directly searches for a noise vector \(z\) in the latent space that minimizes \(\|\mathcal{A}(G(z)) - y\|_2^2\). This bypasses backpropagation through hundreds of denoising steps, allowing even extremely weak 3-step generators to serve as the experimental vehicle for verifying that "weak priors are sufficient."
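The objective above can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the "generator" is a hypothetical linear map `W`, the forward operator `A` is a random matrix, and plain gradient descent with an analytic gradient stands in for whatever optimizer and autodiff machinery the paper uses with a real diffusion sampler.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n, m = 16, 32, 48                       # latent, signal, and measurement dimensions

W = rng.normal(size=(n, d)) / np.sqrt(d)   # hypothetical linear generator G(z) = W z
A = rng.normal(size=(m, n)) / np.sqrt(n)   # hypothetical linear forward operator

def G(z):
    return W @ z

def loss(z, y):
    r = A @ G(z) - y
    return float(r @ r)

def grad(z, y):
    # Analytic gradient of ||A W z - y||^2; a real generator would need autodiff.
    return 2.0 * (A @ W).T @ (A @ G(z) - y)

def optimize_latent(y, z0, steps=500):
    # Step size 1/L, with L the largest eigenvalue of the Hessian 2 (A W)^T (A W).
    lr = 1.0 / (2.0 * np.linalg.norm(A @ W, 2) ** 2)
    z = z0.copy()
    for _ in range(steps):
        z = z - lr * grad(z, y)
    return z

z_true = rng.normal(size=d)
y = A @ G(z_true) + 0.01 * rng.normal(size=m)   # noisy measurements
z0 = rng.normal(size=d)
z_hat = optimize_latent(y, z0)
initial_loss, final_loss = loss(z0, y), loss(z_hat, y)
```

Only `G`, `A`, and `y` enter the objective, which is the sense in which the generator is treated as a black box: swapping in a different sampler changes `G` but not the optimization loop.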

Key Designs¶

1. AdamSphere Optimizer: Constraining Noise to the Gaussian Hypersphere

Standard Adam places no constraint on the norm of \(z\). However, the inputs a diffusion model sees during training, \(z \sim \mathcal{N}(0, I_d)\), have almost all of their probability mass on a thin spherical shell (the typical set) of radius \(\|z\| \approx \sqrt{d}\) when \(d\) is large. If optimization pushes \(z\) off this shell, the generator receives out-of-distribution inputs and sample quality collapses. AdamSphere projects \(z\) back onto the sphere \(\|z\| = \sqrt{d}\) after each update, restricting the search to inputs the generator was trained on. This stabilizes sample quality while leaving the direction of \(z\) free to optimize.
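A minimal sketch of the idea, assuming the simplest reading of the description: a standard Adam update followed by rescaling to norm \(\sqrt{d}\). The function names and the toy quadratic loss are hypothetical, and the paper's AdamSphere may differ in detail (for example, in how it handles the moment estimates).

```python
import numpy as np

def project_to_sphere(z):
    """Rescale z so that ||z|| = sqrt(d)."""
    return z * (np.sqrt(z.size) / np.linalg.norm(z))

def adam_sphere_step(z, g, state, lr=0.05, b1=0.9, b2=0.999, eps=1e-8):
    """One standard Adam update followed by projection onto the sqrt(d)-sphere."""
    state["t"] += 1
    state["m"] = b1 * state["m"] + (1 - b1) * g
    state["v"] = b2 * state["v"] + (1 - b2) * g * g
    m_hat = state["m"] / (1 - b1 ** state["t"])   # bias-corrected first moment
    v_hat = state["v"] / (1 - b2 ** state["t"])   # bias-corrected second moment
    z_new = z - lr * m_hat / (np.sqrt(v_hat) + eps)
    return project_to_sphere(z_new)

d = 4096
rng = np.random.default_rng(0)
raw_norm = np.linalg.norm(rng.normal(size=d))     # close to sqrt(d) = 64 in high dimension

z = project_to_sphere(rng.normal(size=d))
state = {"t": 0, "m": np.zeros(d), "v": np.zeros(d)}
target = rng.normal(size=d)
for _ in range(100):
    g = 2 * (z - target)                          # gradient of a toy loss ||z - target||^2
    z = adam_sphere_step(z, g, state)
final_norm = np.linalg.norm(z)
```

The `raw_norm` line illustrates the typical-set argument: an unconstrained Gaussian draw already lands very near radius \(\sqrt{d}\), so the projection changes little at initialization and only matters once optimization starts to drift.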

2. HoldoutTopK Early Stopping: Preventing Overfitting with Withheld Measurements

Over many iterations, latent noise optimization can overfit the noise in corrupted observations, carving artifacts into the reconstruction. HoldoutTopK sets aside a subset of the measurements as a validation set and tracks the loss on it. Unlike standard early stopping, which keeps the single iterate with the lowest validation loss, it keeps the most recent iterate among those with the Top-K lowest validation losses. When \(K>1\), this smooths out fluctuations in the validation loss and prevents the choice from being misled by an accidental local minimum. This strategy yields approximately 3–5% PSNR improvement.
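The selection rule can be sketched as follows. This is one reading of the description above, not the paper's code: `holdout_topk_index` and the loss history are hypothetical, and ties are broken toward the earlier iterate.

```python
def holdout_topk_index(val_losses, k=3):
    """Return the index of the most recent iterate among the k lowest validation losses."""
    if not val_losses:
        raise ValueError("val_losses must be non-empty")
    k = min(k, len(val_losses))
    # Indices of the k smallest losses (ties broken by earlier index).
    topk = sorted(range(len(val_losses)), key=lambda i: (val_losses[i], i))[:k]
    return max(topk)

# Hypothetical held-out loss per iteration: it falls, bottoms out, then rises (overfitting).
history = [0.90, 0.50, 0.31, 0.30, 0.32, 0.45, 0.70]
best_single = holdout_topk_index(history, k=1)   # plain "best checkpoint" rule
best_top3 = holdout_topk_index(history, k=3)     # latest of the three best iterates
```

With `k=1` the rule reduces to ordinary best-checkpoint selection. Larger `k` favors a later iterate whose validation loss is still near the minimum.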

3. Bayesian Posterior Consistency Theory: Explaining Why Weak Priors Aren't Weak

This theoretical core answers the question: under what conditions does the posterior concentrate on the true signal consistent with the measurements, rendering prior differences negligible? The paper models the generative prior as a Gaussian mixture \(\pi(x)=\sum_{j=1}^M w_j \varphi(x;\mu_j,\tau^2 I_n)\). Theorem 3.2 proves that when the measurement dimension \(m\) is sufficiently large and there is a positive score gap \(\delta_0>0\) between the best-matching component and all others, the posterior concentrates on the optimal component exponentially fast in \(m\), with a bound of the form \(CM\exp(-\delta_0 m)\). Intuitively, high-dimensional measurement information overwhelms the prior. In practice, the score gap in 70% of inpainting tasks was measured between 0.22 and 0.28 (well above zero), in line with the theoretical prediction.
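The concentration effect can be reproduced numerically in a toy setting. The sketch below assumes a linear forward operator \(A\) and Gaussian noise \(\epsilon \sim \mathcal{N}(0, \sigma^2 I_m)\), which the summary does not specify. Under those assumptions the posterior weight of component \(j\) is proportional to \(w_j\,\mathcal{N}(y; A\mu_j, \tau^2 AA^\top + \sigma^2 I_m)\). The function name, dimensions, and masking operator are hypothetical.

```python
import numpy as np

def posterior_component_weights(y, A, mus, weights, tau, sigma):
    """Posterior probability of each mixture component given y = A x + noise."""
    m = A.shape[0]
    # Marginal covariance of y under any component: tau^2 A A^T + sigma^2 I.
    cov = tau**2 * (A @ A.T) + sigma**2 * np.eye(m)
    log_post = np.empty(len(mus))
    for j, mu in enumerate(mus):
        r = y - A @ mu
        # log w_j + log N(y; A mu_j, cov), dropping terms shared by all j.
        log_post[j] = np.log(weights[j]) - 0.5 * r @ np.linalg.solve(cov, r)
    log_post -= log_post.max()   # stabilise before exponentiating
    p = np.exp(log_post)
    return p / p.sum()

rng = np.random.default_rng(0)
n, M, tau, sigma = 64, 5, 0.5, 0.1
mus = rng.normal(size=(M, n))                   # component means
weights = np.full(M, 1.0 / M)                   # uniform mixture weights
x_true = mus[0] + tau * rng.normal(size=n)      # signal drawn from component 0

weight_on_true = {}
for m in (1, 4, 16, 64):
    A = np.eye(n)[:m]                           # observe the first m coordinates
    y = A @ x_true + sigma * rng.normal(size=m)
    weight_on_true[m] = posterior_component_weights(y, A, mus, weights, tau, sigma)[0]
```

In this toy model, the difference in log-likelihood between the true component and any competitor grows roughly linearly in \(m\). That mirrors the role of \(\delta_0 m\) in the exponent of the bound, though it is only loosely analogous to the paper's score gap.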

Key Experimental Results¶

Main Results: Cross-Domain Inverse Problem Solving¶

Task	Method	Prior Domain	CelebA PSNR (dB)	Bedroom PSNR (dB)	Church PSNR (dB)
Inpainting	DPS	CelebA	31.98	27.97	24.15
Inpainting	Ours	CelebA (3 steps)	33.78	27.78	23.56
Inpainting	Ours	Bedroom (3 steps)	32.76	28.88	24.22
Inpainting	Ours	Church (3 steps)	32.62	28.66	24.93
Super-Res	DPS	CelebA	26.82	22.95	20.28
Super-Res	Ours	CelebA (3 steps)	31.27	25.88	22.68
Super-Res	Ours	Bedroom (3 steps)	30.34	26.59	22.86

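Even in extreme cross-domain scenarios, the 3-step priors remain competitive with DPS. For example, a bedroom prior reconstructing CelebA faces exceeds DPS by about 0.8 dB on inpainting (32.76 vs. 31.98) and about 3.5 dB on super-resolution (30.34 vs. 26.82). The exceptions in the table are small: the CelebA 3-step prior trails DPS on Bedroom and Church inpainting by 0.19 dB and 0.59 dB.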
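For reading the table, it may help to recall how PSNR relates to mean squared error. The sketch below uses the standard definition \(\mathrm{PSNR} = 10\log_{10}(\mathrm{MAX}^2/\mathrm{MSE})\). The peak value of 1.0 is an assumption, since the summary does not state the paper's image range, and the helper names are hypothetical.

```python
import numpy as np

def psnr(x, x_hat, peak=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, peak]."""
    mse = np.mean((np.asarray(x) - np.asarray(x_hat)) ** 2)
    return float(10.0 * np.log10(peak**2 / mse))

def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks for a PSNR gain of gain_db decibels."""
    return float(10.0 ** (-gain_db / 10.0))

# A constant error of 0.1 on a [0, 1] image gives MSE = 0.01.
example_psnr = psnr(np.zeros((8, 8)), np.full((8, 8), 0.1))

# Super-resolution on CelebA with a bedroom prior: 30.34 dB vs. 26.82 dB for DPS.
sr_gain_db = 30.34 - 26.82
sr_mse_ratio = mse_ratio_for_gain(sr_gain_db)
```

On this scale, the roughly 3.5 dB super-resolution gain corresponds to less than half the mean squared error of the DPS reconstruction.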

Local Correlation Analysis¶

Pixel Distance	CelebA 3-step	CelebA 20-step	Bedroom 3-step	Bedroom 20-step
0	1.0000	1.0000	1.0000	1.0000
1	0.9558	0.9814	0.9645	0.9615
4	0.8866	0.9100	0.8786	0.8573
8	0.7767	0.8108	0.7637	0.7437
16	0.5595	0.6261	0.5632	0.5618

Regardless of the number of generation steps or the training domain, the spatial autocorrelation decays at a similar rate (for example, between 0.56 and 0.63 at a distance of 16 pixels), supporting the hypothesis of shared local structure.
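A diagnostic of this kind can be sketched as the Pearson correlation between pixel pairs a fixed distance apart. The version below is an assumption about the general approach, not the paper's exact procedure: it uses grayscale images and horizontal offsets only, and the test data are synthetic moving sums of white noise rather than generated samples.

```python
import numpy as np

def spatial_autocorrelation(images, distance):
    """Pearson correlation between pixels `distance` apart along the width axis.

    images: array of shape (N, H, W), grayscale.
    """
    if distance == 0:
        return 1.0
    a = images[:, :, :-distance].ravel()
    b = images[:, :, distance:].ravel()
    return float(np.corrcoef(a, b)[0, 1])

rng = np.random.default_rng(0)
noise = rng.normal(size=(64, 32, 40))
c = np.cumsum(noise, axis=2)
smooth = c[:, :, 8:] - c[:, :, :-8]    # moving sum over 8 pixels, shape (64, 32, 32)

curve = {dist: spatial_autocorrelation(smooth, dist) for dist in (0, 1, 4, 8)}
noise_corr = spatial_autocorrelation(noise, 1)
```

For the synthetic moving-sum images the correlation at lag \(k\) is theoretically \((8-k)/8\), so the curve decays to roughly zero at distance 8, while white noise is uncorrelated at every nonzero distance. Comparing such curves across samplers is the kind of check the table above reports.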

Key Findings¶

Bayesian consistency combined with local correlation explains the efficacy of weak priors.
Failure Modes: Large box inpainting or 16× super-resolution, where missing regions are too large \(\rightarrow\) posterior fails to concentrate \(\rightarrow\) weak prior performance degrades.
The combination of AdamSphere and HoldoutTopK enhances optimization stability.

Highlights & Insights¶

Deep Integration of Theory and Empiricism: Evolves the question from "why weak priors sometimes work" to "under what quantitative conditions they work."
Clever Local Correlation Diagnostics: Spatial autocorrelation curves provide indirect evidence that weak priors are not as "weak" as their visual samples suggest.
Precise Characterization of Failure Modes: Theoretical predictions (small \(m\) \(\rightarrow\) no posterior concentration \(\rightarrow\) stronger dependence on the prior) match the failure cases observed in the experiments.

Limitations & Future Work¶

Weak priors degrade significantly under large-area missing data or extreme super-resolution factors.
How tightly the Gaussian mixture prior assumption holds for actual diffusion models requires deeper analysis.
Generalization on real-world medical data needs further verification.
Future Improvements: Hybrid methods (weak priors + parameter-efficient fine-tuning); studying the collapse of posterior concentration conditions; adaptive early stopping.

vs DPS: DPS requires traversing the diffusion chain to inject measurement information at each step; the latent optimization in this paper competes using only a 3-step generator.
vs General Theory of Generative Priors: First systematic characterization of posterior concentration phenomena within an inverse problem framework.
vs Medical Imaging Applications: Provides a scientific basis for the practice of "no data = use general priors."

Rating¶

Novelty: ⭐⭐⭐⭐⭐ First to use Bayesian posterior consistency theory to explain weak prior efficacy.
Experimental Thoroughness: ⭐⭐⭐⭐⭐ 4 types of inverse problems + 3 datasets + multiple prior strengths + failure mode analysis.
Writing Quality: ⭐⭐⭐⭐⭐ Clear logical hierarchy (phenomenon \(\rightarrow\) theory \(\rightarrow\) diagnostics \(\rightarrow\) application).
Value: ⭐⭐⭐⭐⭐ Offers both deep theoretical contributions and practical guidance.