Data-free Universal Adversarial Perturbation with Pseudo-Semantic Prior

Conference: CVPR 2025
arXiv: 2502.21048
Code: https://github.com/ChnanChan/PSP-UAP
Area: AI Safety
Keywords: UAP, adversarial perturbation, data-free, pseudo-semantic, transferability

TL;DR

This paper proposes PSP-UAP, a data-free method for generating universal adversarial perturbations (UAPs). It extracts pseudo-semantic priors from the UAP itself, augments them with input transformations, and reweights the resulting samples by difficulty. Without any training data, it achieves an average white-box fooling rate of 89.95% and significantly outperforms existing methods in black-box scenarios.

Background & Motivation

Background: Universal Adversarial Perturbation (UAP) is an image-agnostic perturbation that can attack any input image. Traditional UAP methods require large amounts of training data to optimize perturbations, but this data dependency limits their applicability in real-world attack scenarios.

Limitations of Prior Work: (1) Data-driven UAP methods (e.g., UAP, SGA-UAP) require complete training datasets, which may be unavailable due to privacy and copyright restrictions. (2) Data-free UAP methods (e.g., AT-UAP-U, TRM-UAP) typically optimize from random noise, and the lack of semantic information limits their attack efficacy. (3) Existing methods transfer poorly to black-box models.

Key Challenge: The data-free setting lacks semantic information of real images to guide UAP optimization, yet the UAP itself gradually accumulates semantic information from the target model during the training process.

Goal: How to generate highly effective universal adversarial perturbations under completely data-free conditions?

Key Insight: During optimization, the UAP itself accumulates rich semantic information, with different regions corresponding to different semantic patterns. The UAP can therefore be cropped and scaled into "pseudo-semantic samples" that guide its own further optimization.

Core Idea: Extract pseudo-semantic priors from the UAP itself to serve as training samples, and combine this with input transformations (rotation/scaling/shuffling) and Kullback-Leibler (KL) divergence-guided sample reweighting to generate highly effective UAPs without any training data.

Method

Overall Architecture

Starting from a randomly initialized UAP \(\delta\), optimization is performed iteratively. At each step, \(N\) pseudo-semantic samples are generated by randomly cropping and scaling from the current UAP. Input transformation enhancement is applied to each sample to improve robustness. The method focuses on hard samples through sample reweighting and updates the UAP by maximizing the feature activations of the target model.
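
The crop-and-scale step described above can be sketched as follows. This is a minimal illustration in dependency-free Python, not the authors' implementation: the perturbation is assumed to be a square, single-channel 2-D list, resizing uses nearest-neighbour sampling, and the names `random_crop_resize`, `pseudo_semantic_samples`, `min_frac`, and `noise_scale` are hypothetical.

```python
import random


def random_crop_resize(image, out_size, min_frac=0.5, rng=random):
    """Crop a random square window and resize it to out_size x out_size.

    Nearest-neighbour sampling is used; min_frac (an assumed parameter)
    bounds the smallest crop side as a fraction of the shorter image side.
    """
    h, w = len(image), len(image[0])
    shorter = min(h, w)
    side = rng.randint(max(1, int(min_frac * shorter)), shorter)
    top = rng.randint(0, h - side)
    left = rng.randint(0, w - side)
    return [
        [image[top + (i * side) // out_size][left + (j * side) // out_size]
         for j in range(out_size)]
        for i in range(out_size)
    ]


def pseudo_semantic_samples(delta, n=10, noise_scale=0.0, rng=random):
    """Build n pseudo-semantic samples from the current perturbation delta."""
    # x = z + delta, with z drawn as uniform noise (the noise range is assumed).
    x = [[v + noise_scale * rng.uniform(-1.0, 1.0) for v in row] for row in delta]
    size = len(delta)
    return [random_crop_resize(x, size, rng=rng) for _ in range(n)]
```

Calling `pseudo_semantic_samples(delta, n=10)` on an 8x8 perturbation returns ten 8x8 crops, each a rescaled view of a different region of the same UAP. A real implementation would operate on multi-channel tensors and would likely use bilinear interpolation.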

Key Designs

  1. Pseudo-Semantic Prior (PSP)

    • Function: Extract pseudo-semantic samples from the UAP itself to serve as training data.
    • Mechanism: Draw \(x\) from \(p_x = \{z + \delta_t \mid z \in p_z\}\), where \(z\) is random noise and \(\delta_t\) is the current UAP. Then randomly crop and rescale \(x\) to obtain \(N = 10\) samples \(x_n = C(x; N)\).
    • Design Motivation: The UAP contains the semantic structure of the target model—different regions trigger activations of different feature maps. The cropped local regions each carry different semantic fragments.
  2. Input Transformations

    • Function: Enhance the robustness and transferability of the UAP.
    • Mechanism: A combination of three transformations—rotation \(\alpha \in [-6°, 6°]\), scaling \(\beta \in [0.8, 4.0]\), and \(2\times2\) block random shuffling.
    • Design Motivation: Transformations enhance the input diversity during optimization, making the UAP independent of specific spatial configurations.
  3. Sample Reweighting

    • Function: Assign greater weights to hard samples (samples where the model response is less pronounced).
    • Mechanism: \(w_n = D_{\mathrm{KL}}(P(x_n) \,\|\, Q(x_n))^{-1}\), where \(P\) is the clean prediction and \(Q\) is the prediction after perturbation. A small KL divergence means the perturbation barely changed the prediction, so such samples are harder to attack and are assigned larger weights.
    • Design Motivation: Different pseudo-semantic samples have different effects on the model. Prioritizing the optimization of samples with poorer attack outcomes can improve overall performance.
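
Two of the designs above, block shuffling and inverse-KL reweighting, can be sketched in plain Python. This is an illustrative sketch under stated assumptions rather than the paper's code: rotation and scaling are omitted, images are square 2-D lists, and normalizing the weights to sum to 1 is an added assumption for numerical convenience. The function names and the `eps` smoothing constants are hypothetical.

```python
import math
import random


def block_shuffle(img, blocks=2, rng=random):
    """Split a square 2-D image into blocks x blocks tiles and permute them."""
    n = len(img)
    if n % blocks:
        raise ValueError("image side must be divisible by the block count")
    t = n // blocks
    # Cut the tiles in row-major order; each tile is a list of row slices.
    tiles = [
        [row[bc * t:(bc + 1) * t] for row in img[br * t:(br + 1) * t]]
        for br in range(blocks) for bc in range(blocks)
    ]
    rng.shuffle(tiles)
    # Reassemble: tile k lands at block row k // blocks, block column k % blocks.
    out = []
    for br in range(blocks):
        for r in range(t):
            line = []
            for bc in range(blocks):
                line.extend(tiles[br * blocks + bc][r])
            out.append(line)
    return out


def kl_divergence(p, q, eps=1e-12):
    """D_KL(p || q) for two discrete distributions given as lists."""
    kl = sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))
    return max(kl, 0.0)  # clamp tiny negative values caused by smoothing


def inverse_kl_weights(clean_probs, adv_probs, eps=1e-8):
    """w_n proportional to 1 / D_KL(P(x_n) || Q(x_n)), normalized (assumed)."""
    raw = [1.0 / (kl_divergence(p, q) + eps)
           for p, q in zip(clean_probs, adv_probs)]
    total = sum(raw)
    return [r / total for r in raw]
```

A sample whose prediction barely moves under the perturbation has a small divergence and therefore dominates the weights, which matches the stated goal of focusing optimization on hard samples.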

Loss & Training

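  • Total loss: \(L = -\mathbb{E}\left[\sum_{n=1}^{N} \sum_{i=1}^{l} \log\left(w_n \left\|\mathcal{A}_i^f\left(T(x_n + \delta_t)\right)\right\|_2\right)\right]\), where \(\mathcal{A}_i^f\) is the activation of the \(i\)-th of \(l\) layers of the target model \(f\) and \(T(\cdot)\) is the input transformation.
  • Perturbation constraint: \(\epsilon = 10/255\) (\(\ell_\infty\) norm)
  • Max iterations: 10,000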
  • Saturation threshold: \(0.001\%\)
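
The loss and the \(\ell_\infty\) constraint can be evaluated as in the sketch below. It assumes the layer activations have already been computed and flattened into lists, treats the expectation as a single draw, and adds a small constant inside the logarithm for numerical safety. The names `psp_loss`, `project_linf`, and `tiny` are hypothetical, and gradient computation (which would need an autodiff framework) is left out.

```python
import math

EPSILON = 10 / 255  # l_inf budget from the settings above


def psp_loss(weights, activations, tiny=1e-12):
    """Single-draw estimate of L = -sum_n sum_i log(w_n * ||A_i(x_n)||_2).

    activations[n][i] is the flattened activation (list of floats) of
    layer i for transformed sample n; weights[n] is w_n.
    """
    total = 0.0
    for w_n, layers in zip(weights, activations):
        for act in layers:
            l2 = math.sqrt(sum(a * a for a in act))
            total += math.log(w_n * l2 + tiny)  # tiny guards against log(0)
    return -total


def project_linf(delta, eps=EPSILON):
    """Clip every entry of a 2-D perturbation into [-eps, eps]."""
    return [[max(-eps, min(eps, v)) for v in row] for row in delta]
```

Minimizing this loss pushes the activation norms up, and applying `project_linf` after each update keeps the perturbation within the \(\epsilon = 10/255\) budget.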

Key Experimental Results

Main Results

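White-box attack fooling rate (%). The Average row does not equal the mean of the four rows shown (85.78, 84.61, and 89.50, respectively), so it presumably includes models omitted from this table: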

| Model | AT-UAP-U | TRM-UAP | PSP-UAP |
| --- | --- | --- | --- |
| VGG16 | 94.50 | 94.30 | 96.26 |
| VGG19 | 92.85 | 91.35 | 94.65 |
| ResNet152 | 73.15 | 67.46 | 85.65 |
| GoogleNet | 82.60 | 85.32 | 81.43 |
| Average | 87.95 | 86.39 | 89.95 |

Average black-box attack rate on extended models (%):

| Model | TRM-UAP | PSP-UAP |
| --- | --- | --- |
| ResNet50 | 56.57 | 64.13 |
| DenseNet121 | 42.91 | 59.95 |
| MobileNet-v3 | 45.76 | 61.42 |
| Inception-v3 | 59.96 | 62.67 |

Ablation Study

| Component | Effect |
| --- | --- |
| PSP alone | Outperforms random priors |
| + Sample Reweighting | Significant improvement |
| + Input Transformation | Further improvement |
| Full method | Best performance |

Comparison with a data-driven method (VGG16, black-box average): PSP-UAP 75.65% vs. SGA-UAP 69.27%.

Key Findings

  • Data-free method outperforms data-driven method: PSP-UAP (data-free) outperforms SGA-UAP (data-driven) under the black-box setting, indicating that pseudo-semantic priors can be more effective than real data in certain scenarios.
  • The largest improvement over TRM-UAP is observed on ResNet152 (67.46% \(\rightarrow\) 85.65%), which might be due to the residual structure of ResNet being more sensitive to semantic perturbations.
  • The optimal value for the temperature parameter \(\tau\) varies across different models (1.0-10.0), requiring targeted adjustments.
  • The combination of three input transformations performs better than using them individually, but with diminishing marginal returns.

Highlights & Insights

  • UAP itself is a data source: The bootstrapping approach of extracting training signals from the perturbation itself is highly ingenious, fundamentally avoiding data dependency.
  • Data-free method outperforming data-driven methods: This breaks the intuition that "data-driven is always better."
  • KL divergence-guided reweighting: The strategy of focusing attention on hard samples is also transferable to other optimization problems.
  • The concept of pseudo-semantic priors can be transferred to other scenarios that require data-free optimization (e.g., data-free distillation).

Limitations & Future Work

  • Performance on GoogleNet is inferior to TRM-UAP, likely because the multi-scale structure of the Inception module is less sensitive to pseudo-semantic samples.
  • Not validated on modern architectures such as Vision Transformers.
  • The perturbation constraint \(\epsilon=10/255\) is relatively large, and performance under stricter constraints remains unknown.
  • The computational overhead of 10,000 iterations needs to be evaluated.

Related Work & Insights

  • vs AT-UAP-U: Both are data-free methods, but AT-UAP-U starts with uniform noise and lacks semantic guidance.
  • vs SGA-UAP: The data-driven method is slightly stronger in the white-box setting but has inferior black-box transferability compared to PSP-UAP.
  • vs TRM-UAP: Also a data-free method; like AT-UAP-U, it optimizes from random noise and therefore lacks the semantic guidance that PSP-UAP draws from the UAP itself.

Rating

  • Novelty: ⭐⭐⭐⭐ The bootstrapping idea of extracting semantic priors from the UAP itself is highly novel.
  • Experimental Thoroughness: ⭐⭐⭐⭐ Comprehensive evaluation covering white-box, black-box, extended models, and comparison with data-driven methods.
  • Writing Quality: ⭐⭐⭐⭐ The motivation is clearly presented.
  • Value: ⭐⭐⭐⭐ Data-free UAP is highly meaningful for practical attack scenarios.