On the Interaction of Compressibility and Adversarial Robustness¶

Conference: ICLR 2026
OpenReview: https://openreview.net/forum?id=zSsNzxjWP4
Code: https://github.com/mbarsbey/advcomp
Area: AI Safety / Adversarial Robustness / Model Compression
Keywords: Structured Compression, Adversarial Robustness, Lipschitz Constant, Operator Norm, Low-rank

TL;DR¶

This paper provides a unified theoretical framework proving that "structured compression" (at both neuron and spectral levels) concentrates parameter energy into a few dominant directions. This concentration raises the operator norm and Lipschitz constant of the network, creating "high-sensitivity directions" in the representation space that adversarial attacks can exploit, ultimately leading to a systematic degradation of adversarial robustness. These predictions are validated across various architectures, datasets, and training paradigms.

Background & Motivation¶

Background: Modern neural networks are required to be both "compressible" (via pruning, low-rank decomposition, or quantization to reduce memory and latency) and "robust to adversarial perturbations" in safety-critical scenarios like healthcare or autonomous driving. While both compressibility and adversarial robustness have been studied extensively, a mature and unified understanding of their interaction is lacking.

Limitations of Prior Work: Previous conclusions regarding whether compression helps or hurts robustness are contradictory, especially concerning structured compression (pruning entire rows/filters or overall low-rank parameterization), which holds the highest practical value. Existing mechanistic explanations are fragmented: some find low-rank parameterization inadvertently amplifies local Lipschitz constants, others link adversarial transferability to layer-wise operator norms, and some point out that moderate sparsity helps while excessive sparsity causes brittleness due to ill-conditioning. These clues suggest a subtle relationship depending on the degree of compression, but a universal principled framework is missing.

Key Challenge: Compression is efficient precisely because it concentrates the total parameter energy into a few dominant sub-structures (dominant rows or large singular values). This "energy concentration" creates several extremely strong directions in the representation space. As long as an adversarial attack aligns its perturbation with these directions, the perturbation is significantly amplified, easily flipping the prediction. In other words, making a model more compressible indirectly makes it easier to attack—there is a structural tension between the two.

Goal: (1) Provide an analytical robustness upper bound that chains "compressibility $\to$ operator norm $\to$ Lipschitz constant $\to$ adversarial robustness gap"; (2) Prove that this vulnerability emerges regardless of whether compressibility stems from regularization, architectural bias, or learning dynamics; (3) Verify its persistence under adversarial training and transfer learning, and its role in inducing Universal Adversarial Examples (UAEs).

Key Insight: Instead of studying specific compression "algorithms," the authors study the inherent compressibility of the parameters as a scale-invariant property. By using a "structure $\times$ scale" decomposition, they link compressibility to the operator norm, making the conclusions universal and independent of the specific compression method.

Core Idea: Characterize the causal chain of "structured compression $\to$ dominant direction concentration $\to$ larger operator norm $\to$ high-sensitivity directions $\to$ decreased robustness" using an analytically decomposable Lipschitz/robustness upper bound.

Method¶

Overall Architecture¶

This work is an analytical study combining theory and large-scale empirical evidence. It does not propose a new model but rather an analytical upper bound from "compressibility" to the "adversarial robustness gap." The logical chain is as follows: first, define $(q,k,\epsilon)$-compressibility so that "neuron compression" (row sparsity) and "spectral compression" (approximate low-rank) are described by a single definition; second, prove that higher compressibility leads to a larger layer-wise operator norm ($\ell_\infty$ / $\ell_2$) relative to the Frobenius norm (Theorem 3.1, "Structure vs. Scale" decomposition); third, introduce an "inter-layer alignment" term to chain layer-wise norms into a Lipschitz upper bound for the entire encoder (Theorem 3.2); finally, apply existing robustness gap results to derive an upper bound: "Adversarial Risk $\le$ Clean Risk $+ \delta \cdot \hat L_\Phi \cdot \|C\|$" (Corollary 3.3).

The fundamental quantities include: an encoder $\Phi$ mapping input to representation $z=\Phi(x)$, and an attack perturbing a clean sample to $x_{adv}=x+a^*$, where $a^*=\arg\max_{\|a\|_p\le\delta} f(x+a,\theta)$ is the worst-case perturbation within budget $\delta$. The objective is the adversarial robustness gap $\Delta^{adv}_p := F^{adv}_p(\theta;\delta)-F(\theta)$.
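The gap defined above can be computed exactly in one simple case. The following is a minimal sketch, assuming a linear model with binary logistic loss and an $\ell_\infty$ budget (a simplification chosen here for illustration, not the paper's experimental setup). In that case the worst-case perturbation has the closed form $a^* = -y\,\delta\,\mathrm{sign}(w)$, and because the logistic loss is 1-Lipschitz in the margin, the gap is at most $\delta\|w\|_1$. The names `logistic_loss`, `w`, `x`, and `y` are illustrative.

```python
import numpy as np

def logistic_loss(w, x, y):
    """Binary logistic loss log(1 + exp(-y * <w, x>)) for a label y in {-1, +1}."""
    return np.logaddexp(0.0, -y * (w @ x))

rng = np.random.default_rng(0)
d = 20
w = rng.normal(size=d)   # hypothetical linear classifier
x = rng.normal(size=d)   # hypothetical clean input
y = 1.0
delta = 0.05             # l_inf perturbation budget

# For a linear model the loss-maximizing l_inf perturbation is known in closed form:
# it lowers the margin y * <w, x> by exactly delta * ||w||_1.
a_star = -y * delta * np.sign(w)

clean = logistic_loss(w, x, y)
adv = logistic_loss(w, x + a_star, y)
gap = adv - clean
bound = delta * np.linalg.norm(w, ord=1)

# Sanity check: random perturbations inside the budget never beat a_star.
worst_random = max(
    logistic_loss(w, x + rng.uniform(-delta, delta, size=d), y) for _ in range(200)
)

print(f"clean={clean:.4f} adv={adv:.4f} gap={gap:.4f} bound={bound:.4f}")
```

The bound grows with $\|w\|_1$, which is the single-layer analogue of the "Lipschitz constant times classifier norm" term in the robustness gap.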

Key Designs¶

1. $(q,k,\epsilon)$-Compressibility: A Scale-Invariant Definition for Row Sparsity and Low-Rankness

To discuss how compression affects robustness, a metric is needed that is independent of parameter scaling and applicable to both pruning and low-rankness. The authors define: for a vector $\theta\in\mathbb{R}^d$ and integer $k\le d$, let $\theta_k$ be the compressed vector retaining only the $k$ elements with the largest magnitudes. $\theta$ is $(q,k,\epsilon)$-compressible if: $$\|\theta-\theta_k\|_q / \|\theta\|_q \le \epsilon,$$ representing the relative residual after truncation. A dispersion variable $\beta\in[0,1]$ characterizes the internal variance of the top-$k$ terms. This definition is flexible: if $\theta$ represents the vector of $\ell_1$ norms of rows in $W$, it measures neuron/row compression; if $\theta$ represents singular values $\sigma$, it measures spectral compression.
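As a concrete reading of the definition, the sketch below computes the relative residual $\|\theta-\theta_k\|_q/\|\theta\|_q$ and applies it to both views of a weight matrix: the vector of row $\ell_1$ norms and the vector of singular values. The helper names `relative_residual` and `is_compressible` and the toy matrix are assumptions made for illustration; the dispersion variable $\beta$ is not modeled here.

```python
import numpy as np

def relative_residual(theta, k, q=2):
    """||theta - theta_k||_q / ||theta||_q, where theta_k keeps the k largest-magnitude entries."""
    theta = np.asarray(theta, dtype=float)
    order = np.argsort(np.abs(theta))[::-1]   # indices sorted by decreasing magnitude
    tail = theta[order[k:]]                   # entries dropped by the truncation
    return np.linalg.norm(tail, ord=q) / np.linalg.norm(theta, ord=q)

def is_compressible(theta, k, eps, q=2):
    """True if theta is (q, k, eps)-compressible in the sense of the definition above."""
    return relative_residual(theta, k, q) <= eps

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 32))
W[:4] *= 20.0                                 # toy matrix: four rows carry most of the energy

row_norms = np.abs(W).sum(axis=1)             # neuron/row view: l1 norm of each row
singular_values = np.linalg.svd(W, compute_uv=False)  # spectral view

res_rows = relative_residual(row_norms, k=4)
res_rows_scaled = relative_residual(10.0 * row_norms, k=4)  # rescaling leaves it unchanged
res_spec = relative_residual(singular_values, k=4)

print(f"row residual={res_rows:.3f}, spectral residual={res_spec:.3f}")
```

Because the residual is a ratio of norms of the same vector, multiplying the parameters by any nonzero constant leaves it unchanged, which is the invariance the definition relies on.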

2. Structure vs. Scale Decomposition: Translating Compressibility to Operator Norms (Theorem 3.1)

Since $(q,k,\epsilon)$-compressibility is scale-invariant, any direct link to the Lipschitz constant would be confounded by arbitrary rescaling of the parameters. The authors therefore decompose the $\ell_\infty$ and $\ell_2$ operator norm bounds into "(compressibility term) $\times$ Frobenius norm," separating "structure" from "scale." Specifically, neuron compression yields: $$\|W\|_\infty \le \frac{(1-\epsilon_\nu)}{(1-\beta_\nu)}\Big(\sqrt{hk_r}+\tfrac{h\epsilon_r}{k_\nu}\Big)\|W\|_F,$$ and spectral compression yields: $$\|W\|_2 \le \frac{(1-\epsilon_\sigma)}{(1-\beta_\sigma)}\Big(\tfrac{\sqrt{h}}{k_\sigma}\Big)\|W\|_F.$$ Intuitively, matrices dominated by a few rows (neuron compression, small $\epsilon_\nu, k_\nu$) increase the $\ell_\infty$ norm, and higher dispersion within those rows (large $\beta_\nu$) exacerbates this. This makes "energy concentration $\to$ norm expansion" analytically explicit.
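The separation of structure from scale can be illustrated numerically without using the theorem's constants. The sketch below, which assumes toy matrices built here for illustration, compares the ratio of operator norm to Frobenius norm for a flat spectrum versus a peaked one, and for evenly spread rows versus one dominant row. The function name `structure_ratio` is hypothetical.

```python
import numpy as np

def structure_ratio(W, p):
    """Operator norm divided by Frobenius norm; unchanged when W is rescaled."""
    return np.linalg.norm(W, ord=p) / np.linalg.norm(W, ord="fro")

rng = np.random.default_rng(0)
h = 64
U, _ = np.linalg.qr(rng.normal(size=(h, h)))   # random orthogonal factors
V, _ = np.linalg.qr(rng.normal(size=(h, h)))

def from_spectrum(s):
    """Matrix with singular values proportional to s, normalized to unit Frobenius norm."""
    s = np.asarray(s, dtype=float)
    return U @ np.diag(s / np.linalg.norm(s)) @ V.T

W_flat = from_spectrum(np.ones(h))                       # energy spread over all directions
W_peaked = from_spectrum([1.0] + [0.01] * (h - 1))       # energy in one singular direction

r_flat = structure_ratio(W_flat, 2)        # equals 1 / sqrt(h)
r_peaked = structure_ratio(W_peaked, 2)    # close to 1

# Row view: same idea with the l_inf operator norm (maximum row l1 norm).
W_even_rows = np.ones((h, h))
W_one_row = np.zeros((h, h))
W_one_row[0] = 1.0
r_even_rows = structure_ratio(W_even_rows, np.inf)   # equals 1
r_one_row = structure_ratio(W_one_row, np.inf)       # equals sqrt(h)

print(r_flat, r_peaked, r_even_rows, r_one_row)
```

At a fixed Frobenius norm, the concentrated matrices have the larger operator norm in both views, which is the qualitative effect the decomposition isolates.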

3. Inter-layer Alignment + Encoder Lipschitz Bound $\to$ Robustness Gap (Theorem 3.2 + Corollary 3.3)

Chaining layer-wise norms usually relies on "product of norms" $\|W^{l+1}\|\|W^l\|$, which is overly pessimistic as it assumes perfect alignment of the strongest directions across layers. The authors introduce an inter-layer alignment term $A_p(l)$ ($p\in\{2,\infty\}$) to correct this by measuring the actual alignment of top-$k$ substructures through ReLU nonlinearities (represented by binary diagonal matrices $D$): $$A_\infty(l)\triangleq \max_{D\in\mathcal D}\frac{\|W^{l+1}_k D W^l_k\|_\infty}{\|W^{l+1}\|_\infty\|W^l\|_\infty}+R_\infty.$$ This leads to the Lipschitz upper bounds $\hat L^\infty_\Phi$ and $\hat L^2_\Phi$ for encoder $\Phi$ (Theorem 3.2). Applying this to a binary logistic loss yields the robustness gap upper bound (Corollary 3.3): $$F^{adv}_\infty(\theta;\delta)\le F(\theta)+\delta\,\hat L^\infty_\Phi\,\|C\|_1,\qquad F^{adv}_2(\theta;\delta)\le F(\theta)+\delta\,\hat L^2_\Phi\,\|C\|_2.$$ This bound is highly correlated with empirical robustness gaps (e.g., $\rho=0.947$ for spectral rank variations) and identifies compressibility, scale, and alignment as the core components of vulnerability.
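To see why the plain product of norms can be pessimistic, the sketch below computes a simplified alignment ratio $\max_D \|W^{l+1} D W^l\| / (\|W^{l+1}\|\,\|W^l\|)$ by brute-force enumeration of all binary diagonal masks $D$. This is an assumption-laden simplification: it uses the full matrices rather than the top-$k$ truncations $W_k$, omits the remainder term $R_\infty$, and is only feasible for tiny layer widths. The function name `alignment_ratio` and the toy matrices are illustrative.

```python
import itertools
import numpy as np

def alignment_ratio(W_next, W_prev, p=2):
    """max over binary diagonal D of ||W_next D W_prev||_p / (||W_next||_p * ||W_prev||_p)."""
    h = W_prev.shape[0]
    denom = np.linalg.norm(W_next, ord=p) * np.linalg.norm(W_prev, ord=p)
    best = 0.0
    for mask in itertools.product([0.0, 1.0], repeat=h):   # 2**h ReLU activation patterns
        D = np.diag(mask)
        best = max(best, np.linalg.norm(W_next @ D @ W_prev, ord=p))
    return best / denom

e = np.eye(4)
# First layer: its dominant direction writes into output coordinate 0.
W1 = 3.0 * np.outer(e[0], e[0]) + 0.1 * np.eye(4)
# Second layer that reads its dominant input from coordinate 0 (aligned with W1).
W2_aligned = 3.0 * np.outer(e[0], e[0]) + 0.1 * np.eye(4)
# Second layer that reads its dominant input from coordinate 1 (misaligned with W1).
W2_misaligned = 3.0 * np.outer(e[0], e[1]) + 0.1 * np.eye(4)

ratio_aligned = alignment_ratio(W2_aligned, W1)
ratio_misaligned = alignment_ratio(W2_misaligned, W1)

print(f"aligned={ratio_aligned:.3f}, misaligned={ratio_misaligned:.3f}")
```

Both pairs have nearly the same product of layer norms, yet the composed map is far weaker when the dominant directions do not line up. A ratio of this kind is always at most 1, so multiplying it into the chain can only tighten the product-of-norms bound.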

4. Theory-Inspired "Robust Compression" Interventions: $\epsilon$-Pruning, $\beta$-Control, and Alignment Regularization

The theory suggests actionable mitigations. First, instead of fixed pruning ratios (which cause layer collapse), layers are pruned by setting a target $\epsilon$ and solving for $k$. Second, controlling the dispersion $\beta$ of dominant terms (regularizing the variance of the top 5% filter norms) reduces vulnerability without decreasing compressibility. Third, inter-layer alignment is used as a regularization target. These methods improve the retention of standard and robust accuracy during pruning.
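The first intervention, choosing $k$ per layer from a target $\epsilon$, can be sketched as a small search over the sorted scores. This is a minimal illustration, assuming each layer is summarized by a vector of per-row (or per-filter) scores and that the smallest $k$ meeting the target is wanted; the function name `smallest_k_for_eps` and the two toy layers are hypothetical and not taken from the authors' code.

```python
import numpy as np

def smallest_k_for_eps(scores, eps, q=2):
    """Smallest k such that keeping the k largest scores leaves a relative q-norm residual <= eps."""
    s = np.sort(np.abs(np.asarray(scores, dtype=float)))[::-1]   # decreasing magnitude
    powers = s ** q
    total = powers.sum()
    # tail[k] = sum of q-th powers of the entries ranked after the top k, for k = 0..d
    tail = total - np.concatenate(([0.0], np.cumsum(powers)))
    tail = np.clip(tail, 0.0, None)                               # guard against rounding
    rel = (tail / total) ** (1.0 / q)
    return int(np.argmax(rel <= eps))                             # first k meeting the target

# Toy layers: one with two dominant rows, one with perfectly even rows.
layer_concentrated = [10.0, 9.0, 0.1, 0.1, 0.1, 0.1]
layer_even = [1.0] * 6

k_concentrated = smallest_k_for_eps(layer_concentrated, eps=0.1)
k_even = smallest_k_for_eps(layer_even, eps=0.1)

print(k_concentrated, k_even)   # the even layer keeps every row at this target
```

A fixed pruning ratio would remove the same fraction from both layers, whereas the $\epsilon$ target prunes the concentrated layer heavily and leaves the even layer intact, which is the behavior that avoids collapsing a layer.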

Mechanism Example¶

Consider spectral compression in a single-layer network $g(x)=C\phi(Wx)$. Under high compression, the first singular value is much larger than the others ($\sigma_1\gg\sigma_{j\ne1}$). If an adversarial perturbation $a$ aligns with the corresponding right singular vector $v_1$, its representation after the layer is amplified by approximately $\sigma_1$. This allows the perturbation to overwhelm the clean representation in the representation space and flip the prediction. PCA visualizations confirm that in compressible models, perturbations with the same budget are significantly amplified in the representation space compared to baseline models.
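The amplification argument can be checked on the linear part of such a layer. The sketch below, which assumes a toy matrix with one dominant singular value and ignores the nonlinearity $\phi$ and the classifier $C$, compares the gain $\|Wa\|_2/\|a\|_2$ for a perturbation along $v_1$ with the gain for a random direction of the same budget. All names and sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
h = 32
U, _ = np.linalg.qr(rng.normal(size=(h, h)))   # left singular vectors
V, _ = np.linalg.qr(rng.normal(size=(h, h)))   # right singular vectors
sigma = np.array([10.0] + [0.5] * (h - 1))     # sigma_1 much larger than the rest
W = U @ np.diag(sigma) @ V.T

delta = 0.1                                    # l2 perturbation budget
a_top = delta * V[:, 0]                        # perturbation aligned with v_1
r = rng.normal(size=h)
a_rand = delta * r / np.linalg.norm(r)         # random direction, same budget

gain_top = np.linalg.norm(W @ a_top) / delta   # equals sigma_1
gain_rand = np.linalg.norm(W @ a_rand) / delta # a mix of sigma_1 and the small values

print(f"gain along v_1: {gain_top:.3f}, gain along random direction: {gain_rand:.3f}")
```

No perturbation of the same $\ell_2$ budget can be amplified more than the one along $v_1$, so the more the spectrum is concentrated, the more a well-aimed attack gains over an undirected one.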

Key Experimental Results¶

Main Results¶

Experiments cover the MNIST, CIFAR-10, CIFAR-100, SVHN, Flickr30k, and ImageNet-1k datasets, with architectures including FCN, ResNet18, VGG16, WideResNet-101-2, ViT, CLIP, and Swin. Robustness is evaluated with AutoPGD attacks, and models are also trained with PGD adversarial training. The conclusion is consistent: as compressibility increases, robust accuracy (RA) decreases. SA in the table below denotes standard accuracy.

Experimental Setting	Compression Source	Observed Phenomenon
FCN @ MNIST	Nuclear Norm Regularization (NNR)	Compression↑ $\to$ Prunability↑, but RA drops sharply; perturbations align more with principal singular directions
FCN @ CIFAR-10	Group Lasso / Low-rank Decomposition	Both RA and the RA/SA ratio decrease as compression level rises
ResNet18 @ CIFAR-10	Scale-invariant row sparsity regularizer	Same trend observed, extending the finding to CNNs
ViT @ CIFAR-10; CLIP ImageNet Zeroshot	Sparsity-regularized fine-tuning	Fine-tuning common backbones towards sparsity alone creates vulnerability
Bound vs. Empirical Gap	2-layer FCN spectral rank	Theoretical upper bound correlates with empirical robustness gap at $\rho=0.947$

Ablation Study¶

Configuration	Key Finding	Description
Compression under Adv. Training	Trends match standard training	Adversarial training raises overall RA, but the relative negative effect of compression persists
Compression vs. Frobenius Norm	Only compression induces UAEs	Compression creates global vulnerability directions; simply scaling the Frobenius norm does not produce Universal Adversarial Examples
Transfer Learning (CIFAR-100 $\to$ CIFAR-10)	Pre-training compressibility hurts downstream	Vulnerability is embedded in the representation structure and transfers with the encoder
$\epsilon$-Pruning + $\beta$-Control	Improved accuracy retention	Theory-driven interventions successfully mitigate some robustness loss

Key Findings¶

Mechanism Confirmed: When compression increases, adversarial perturbations align more closely with dominant singular directions ($v_i^\top a^*\gg v_j^\top a^*$), and the ratio of adversarial to clean representation magnitude ($\|z_{adv}-z\|_2/\|z\|_2$) increases.
Source-Invariant: Vulnerability emerges whether compression comes from regularization, low-rank parameterization, or learning dynamics, and it persists across adversarial training and transfer learning.
Compression $\neq$ Norm: UAEs are induced by compression, not by mere scale amplification, indicating that UAEs stem from the global vulnerability directions created by compression.

Highlights & Insights¶

The "Structure vs. Scale" decomposition is critical: since compressibility is scale-invariant, it must be separated from the Frobenius norm to analyze its effect on the Lipschitz constant.
The inter-layer alignment term brings the pessimistic "product of norms" bound closer to reality by considering the actual alignment of top-$k$ structures through ReLU.
The same mechanism that enables generalization and compressibility (energy concentration) is also the structural weakness exploited by adversarial attacks—efficiency and safety are in natural conflict at a structural level.
Practical Advice: Avoid extreme structured compression. Instead, use $\epsilon$-based pruning, constrain dispersion $\beta$, and combine moderate structured compression with other methods like quantization or distillation.

Limitations & Future Work¶

The core theorems are derived for idealized settings (FCN, binary logistic loss, uniform layer-wise compression). The experiments extend to CNNs and ViTs, but the theorems themselves do not yet cover these more general cases.
The analysis uses the global Lipschitz bound. Local Lipschitz constants are tighter, but the global bound is better suited for analyzing global vulnerabilities like UAEs.
Interventions only improve accuracy retention rather than curing the vulnerability, as energy concentration cannot be fully neutralized. Future work should investigate "compression-quantization-distillation" Pareto fronts.

vs. Savostianova et al. (2023) / Feng et al. (2025): While they focus on how specific methods (low-rank, sparsity) affect local Lipschitz properties, this paper uses the $(q,k,\epsilon)$ framework to characterize both neuron and spectral compression in a single formulation, regardless of their source.
vs. Arora et al. (2018): Classical work views compressibility as a friend of generalization. This paper reveals the dark side: the same energy concentration that promotes generalization also introduces structural vulnerabilities.

Rating¶

Novelty: ⭐⭐⭐⭐⭐
Experimental Thoroughness: ⭐⭐⭐⭐⭐
Writing Quality: ⭐⭐⭐⭐
Value: ⭐⭐⭐⭐⭐
