
Spectral Perturbation Bounds for Low-Rank Approximation with Applications to Privacy

Conference: NeurIPS 2025 arXiv: 2510.25670 Code: None Area: AI Safety / Differential Privacy Keywords: low-rank approximation, spectral norm perturbation bounds, differentially private PCA, contour bootstrapping, matrix perturbation theory

TL;DR

This paper establishes novel high-probability perturbation bounds for low-rank approximation of symmetric matrices under the spectral norm, improving upon the classical Eckart–Young–Mirsky theorem, and resolves an open problem in differentially private PCA.

Background & Motivation

Low-rank approximation is a foundational technique in machine learning and data science. When a matrix \(A\) is perturbed by noise \(E\), the central question is how the perturbation affects the top-\(p\) approximation \(A_p\): how large is \(\|\tilde{A}_p - A_p\|\), where \(\tilde{A}_p\) denotes the top-\(p\) approximation of the perturbed matrix \(\tilde{A} = A + E\)?
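The quantities above can be made concrete with a small numpy sketch (illustrative only, not from the paper): build the top-\(p\) approximation from the leading eigenpairs of a symmetric matrix, perturb with symmetric noise, and measure the spectral norm of the difference.

```python
import numpy as np

def top_p_approx(A, p):
    """Rank-p approximation A_p built from the top-p eigenpairs of symmetric A."""
    vals, vecs = np.linalg.eigh(A)           # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:p]         # indices of the p largest eigenvalues
    return (vecs[:, idx] * vals[idx]) @ vecs[:, idx].T

rng = np.random.default_rng(0)
n, p = 200, 5
A = np.diag(np.arange(n, 0, -1.0))           # symmetric test matrix, eigengaps of 1

G = rng.standard_normal((n, n))
E = 0.05 * (G + G.T) / np.sqrt(n)            # symmetric noise, scaled so ||E|| << eigengap

A_p = top_p_approx(A, p)
A_p_tilde = top_p_approx(A + E, p)
spec_err = np.linalg.norm(A_p_tilde - A_p, 2)   # spectral norm of the perturbation effect
```

Here the noise scale is chosen (an illustrative choice) so that the eigengap condition discussed later, \(\delta_p \geq 4\|E\|\), holds.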

Limitations of existing metrics:

  • Frobenius norm: may overestimate the noise effect by a factor of \(\sqrt{p}\)
  • Reconstruction error: may severely underestimate subspace deviation — it can remain small even when the top-\(p\) eigenspace undergoes a large rotation
  • Classical EYM bound: \(\|\tilde{A}_p - A_p\| \leq 2(\lambda_{p+1} + \|E\|)\), which is overly pessimistic
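The reconstruction-error pitfall is easy to reproduce numerically. In this illustrative 2×2 sketch (my construction, not from the paper), two nearly degenerate leading eigenvalues let a tiny perturbation rotate the top eigenspace by almost 45°: the rank-1 approximation changes drastically in spectral norm, while the reconstruction error barely moves.

```python
import numpy as np

def top1(M):
    """Rank-1 approximation from the leading eigenpair of symmetric M."""
    vals, vecs = np.linalg.eigh(M)
    return vals[-1] * np.outer(vecs[:, -1], vecs[:, -1])

eps = 1e-3
A = np.diag([1.0, 1.0 - eps])                # nearly degenerate leading eigenvalues
E = np.array([[0.0, 0.01], [0.01, 0.0]])     # tiny symmetric perturbation, ||E|| = 0.01

A1, A1_tilde = top1(A), top1(A + E)
spec_err   = np.linalg.norm(A1_tilde - A1, 2)   # large: the eigenspace rotated
recon_base = np.linalg.norm(A - A1, 2)          # reconstruction error before noise
recon_pert = np.linalg.norm(A - A1_tilde, 2)    # nearly unchanged after noise
```

Note that here the eigengap \(\delta_1 = 10^{-3}\) is far below \(4\|E\|\), which is exactly the regime where reconstruction error hides the rotation.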

The spectral norm captures worst-case directional error and provides the strongest practical guarantee for downstream tasks such as PCA and clustering. Mangoubi and Vishnoi posed an open problem in Remark 5.3 of their NeurIPS paper: can high-probability spectral norm bounds on \(\|\tilde{A}_p - A_p\|\) be obtained under natural conditions?

Method

Overall Architecture

The paper represents the top-\(p\) eigenspace projector via contour integrals from complex analysis, and derives tighter perturbation bounds through a technique termed contour bootstrapping.

Key Designs

  1. Main spectral norm bound (Theorem 2.1): Under the eigengap condition \(\delta_p := \lambda_p - \lambda_{p+1} \geq 4\|E\|\), the paper proves \(\|\tilde{A}_p - A_p\| \leq O(\|E\| \cdot \lambda_p / \delta_p)\). This improves over the EYM bound by a factor of \(\min\{\lambda_{p+1}/\|E\|, \delta_p/\|E\|\}\). For example, when the spectrum is \(\{10n, 9n, \ldots, n, n/2, 1, \ldots, 1\}\), EYM yields \(O(n)\) whereas this paper yields \(O(\sqrt{n})\).

  2. Interaction-aware refined bound (Theorem 2.2): By introducing a half-decay distance \(r\) and a noise–eigenspace interaction term \(x = \max_{i,j \leq r} |u_i^\top E u_j|\), the paper obtains \(\|\tilde{A}_p - A_p\| \leq \tilde{O}(\|E\| + r^2 x \cdot \lambda_p / \delta_p)\). When the matrix has low stable rank or the noise is approximately orthogonal to the principal eigenspace, this yields a further improvement of up to \(\sqrt{n}\).

  3. Generalization to spectral functions (Theorem 2.3): The results are extended to general spectral functions \(f(A)\), including polynomials, exponentials, and trigonometric functions, via a natural generalization of contour integration. This covers \(f(z) = z^k, \exp(z), \cos(z)\), and related cases.
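The gap between the classical and new bounds in the example spectrum of Theorem 2.1's discussion can be checked numerically. The following sketch (an illustrative experiment, not the paper's code) builds the spectrum \(\{10n, 9n, \ldots, n, n/2, 1, \ldots, 1\}\) with \(p = 10\), adds Wigner-type noise with \(\|E\| = \Theta(\sqrt{n})\), and compares the observed spectral error with the EYM-style quantity \(2(\lambda_{p+1} + \|E\|) = O(n)\) and the new scale \(\|E\| \cdot \lambda_p/\delta_p = O(\sqrt{n})\).

```python
import numpy as np

def top_p(M, p):
    """Rank-p approximation from the top-p eigenpairs of symmetric M."""
    vals, vecs = np.linalg.eigh(M)
    idx = np.argsort(vals)[::-1][:p]
    return (vecs[:, idx] * vals[idx]) @ vecs[:, idx].T

rng = np.random.default_rng(1)
n, p = 400, 10
# Example spectrum from the paper's discussion: {10n, 9n, ..., n, n/2, 1, ..., 1}
spectrum = np.concatenate([n * np.arange(10, 0, -1.0), [n / 2], np.ones(n - 11)])
A = np.diag(spectrum)

G = rng.standard_normal((n, n))
E = (G + G.T) / 2.0                          # Wigner-type noise, ||E|| ~ sqrt(2n)
norm_E = np.linalg.norm(E, 2)
# Eigengap condition: delta_p = n/2 = 200 >= 4*||E|| ~ 113 holds here.

actual = np.linalg.norm(top_p(A + E, p) - top_p(A, p), 2)
eym    = 2 * (spectrum[p] + norm_E)                                 # O(n) scale
ours   = norm_E * spectrum[p - 1] / (spectrum[p - 1] - spectrum[p]) # O(sqrt(n)) scale
```

At \(n = 400\) the EYM quantity is in the hundreds while the new scale is a few dozen, matching the \(O(n)\) vs. \(O(\sqrt{n})\) comparison in the text.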

Loss & Training

This is a theoretical paper. The core technical contribution is contour bootstrapping (Lemma 3.1):

  • The rank-\(p\) projector is represented via contour integrals in the complex plane
  • Iterative bootstrapping steps progressively tighten the bound
  • This avoids classical power-series expansions or Davis–Kahan-style arguments
  • The technique generalizes the approach of [tran2025davis] for handling eigenspace perturbations
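The starting point of such contour-integral arguments is the standard Riesz-projector identity (a classical fact of spectral theory; the paper's specific bootstrapping iteration is in Lemma 3.1):

\[
P_p \;=\; \frac{1}{2\pi i} \oint_{\Gamma} (zI - A)^{-1}\, dz,
\]

where \(\Gamma\) is a contour in the complex plane enclosing \(\lambda_1, \ldots, \lambda_p\) and no other eigenvalues of \(A\). Perturbation bounds then follow by comparing this with the analogous integral for \(A + E\), using the resolvent identity \((zI - A - E)^{-1} = (zI - A)^{-1} + (zI - A)^{-1} E (zI - A - E)^{-1}\) along \(\Gamma\).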

Key Experimental Results

Main Results

| Method | Norm Type | Noise Model | Probabilistic Guarantee | Extra Factor vs. \(\|E\|\) |
| --- | --- | --- | --- | --- |
| Classical EYM | Spectral | Arbitrary | High-probability | \(\lambda_{p+1}/\|E\|\) |
| Mangoubi–Vishnoi | Frobenius | Complex Gaussian | In expectation | \(\sqrt{p} \cdot \lambda_p/\delta_p\) |
| Ours, Thm 2.1 | Spectral | Arbitrary symmetric | High-probability | \(\lambda_p/\delta_p\) |
| Ours, Thm 2.2 | Spectral | Arbitrary symmetric | High-probability | \(r^2 x \cdot \lambda_p/\delta_p\) |

Ablation Study

| Configuration | Metric | Notes |
| --- | --- | --- |
| Gaussian noise | Spectral error | Predicted values closely track actual error |
| Rademacher noise | Spectral error | Equally accurate, validating non-Gaussian applicability |
| Varying dimension \(n\) | Relative error | Improvement becomes more pronounced as \(n\) grows |

Key Findings

  • The proposed bounds closely match actual spectral errors on real covariance matrices
  • Improvements over the classical EYM baseline are consistent across datasets and noise regimes
  • Corollary 2.4 provides the first high-probability spectral norm guarantee for differentially private PCA

Highlights & Insights

  • Resolves an open problem in the literature (Mangoubi–Vishnoi, Remark 5.3)
  • Contour bootstrapping is a novel analytical technique that brings complex analysis into matrix perturbation theory
  • Results apply to both real- and complex-valued Wigner noise, generalizing beyond the complex Gaussian setting of prior work
  • Elimination of the \(\sqrt{p}\) factor has practical significance for high-dimensional differentially private applications

Limitations & Future Work

  • The eigengap condition \(4\|E\| \leq \delta_p\), while mild, remains a necessary assumption
  • Theorem 2.3 does not apply to non-entire functions such as \(f(z) = z^c\) for non-integer \(c\), due to singularity issues
  • Specific privacy mechanisms such as quantized Gaussian or Laplace noise are not analyzed
  • Constants are not optimized (the leading constant is below \(7\)) and could be tightened in future work
  • Relation to Davis–Kahan: This paper offers a complementary perspective — Davis–Kahan controls subspace angles, whereas this work controls matrix approximation error
  • Implications for differential privacy: The spectral norm is a more appropriate utility metric than the Frobenius norm or reconstruction error
  • Potential applications of contour bootstrapping: The technique may extend to perturbation analysis of a broader class of matrix functions

Rating

  • Novelty: ⭐⭐⭐⭐⭐ Resolves an open problem and introduces a fundamentally new analytical technique
  • Experimental Thoroughness: ⭐⭐⭐ Empirical validation is present but limited in scale (primarily a theoretical contribution)
  • Writing Quality: ⭐⭐⭐⭐ Clear structure, precise theorem statements, and thorough comparisons
  • Value: ⭐⭐⭐⭐ Significant contributions to both matrix perturbation theory and differential privacy

Supplementary Details

  • The eigengap condition \(4\|E\| \leq \delta_p\) holds in several classical models: spiked covariance, stochastic block model, and kernel matrices
  • Empirical analysis (citing DBM NeurIPS Section B) shows that the 1990 Census and UCI Adult datasets satisfy the eigengap condition
  • Corollary 2.4 further derives high-probability Frobenius norm and reconstruction error bounds
  • Compared to Mangoubi–Vishnoi, the paper also eliminates the \(\log^{\log\log n} n\) factor and the restrictive assumption \(\lambda_1 \leq n^{50}\)