
Dual-Flow: Transferable Multi-Target, Instance-Agnostic Attacks via In-the-wild Cascading Flow Optimization

Conference: NeurIPS 2025
arXiv: 2502.02096
Code: github.com/Chyxx/Dual-Flow
Area: AI Security / Adversarial Attacks
Keywords: adversarial attacks, black-box transfer attacks, diffusion models, flow matching, multi-target attacks

TL;DR

This paper proposes the Dual-Flow framework, which leverages the forward ODE flow of a pretrained diffusion model and the reverse flow of a fine-tuned LoRA velocity function to perform multi-target, instance-agnostic adversarial attacks. Through a cascading distribution shift training strategy, the method substantially improves transfer attack success rates (e.g., +34.58 percentage points for the Inc-v3 → Res-152 transfer) and demonstrates strong robustness against defended models.

Background & Motivation

Background: Adversarial attacks are broadly categorized as instance-specific or instance-agnostic. Instance-agnostic methods learn perturbations at the data distribution level, yielding better black-box transferability. Generative model-based methods are further divided into single-target (requiring one model per target class) and multi-target (a single conditional model attacks all classes).

Limitations of Prior Work: Multi-target generative attacks suffer from low transfer success rates due to limited model capacity; existing diffusion model-based attacks are instance-specific (requiring target model gradients at inference time); the theoretical justification for choosing ODE vs. SDE sampling is lacking.

Key Challenge: During reverse flow training, the true distribution at intermediate timesteps is inaccessible (the forward ODE trajectory is in-the-wild), making standard diffusion training algorithms inapplicable.

Goal: (a) How can diffusion models be leveraged for instance-agnostic multi-target attacks? (b) How can the reverse flow be trained without access to intermediate distributions?

Key Insight: The attack is decomposed into two flows — a forward flow (a pretrained diffusion model generates a perturbed distribution) and a reverse flow (fine-tuned LoRA maps it back to the constrained space).

Core Idea: The forward ODE of a pretrained diffusion model produces intermediate representations, which are then mapped back into \(\ell_\infty\)-constrained adversarial examples via a LoRA-fine-tuned velocity function. Cascading optimization progressively improves attack effectiveness.

Method

Overall Architecture

An input image \(x\) is mapped to a perturbed distribution \(X_\tau\) via the forward flow, then mapped to the \(\ell_\infty\)-constrained space via the reverse flow. No target model gradients are required at inference time.
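Below is a minimal sketch of this inference pipeline, assuming a plain Euler discretization and a rectified-flow-style sign convention; `v_phi`, `v_theta`, the step count, and the value of \(\tau\) are illustrative placeholders rather than the paper's exact implementation:

```python
import torch

@torch.no_grad()
def dual_flow_attack(x, v_phi, v_theta, tau=0.5, n_steps=50, eps=16 / 255):
    """Map a clean image x to an l_inf-constrained adversarial example.

    v_phi:   pretrained velocity function   (t, x) -> dx/dt  (forward flow)
    v_theta: LoRA fine-tuned velocity       (t, x) -> dx/dt  (reverse flow)
    """
    dt = tau / n_steps
    x_t = x.clone()
    # Forward flow: integrate the pretrained ODE from t=0 to t=tau,
    # producing a sample from the intermediate perturbed distribution X_tau.
    for i in range(n_steps):
        x_t = x_t + v_phi(i * dt, x_t) * dt
    # Reverse flow: integrate the fine-tuned ODE from t=tau back to t=0.
    for i in range(n_steps):
        x_t = x_t - v_theta(tau - i * dt, x_t) * dt
    # Safety projection onto the l_inf ball and valid pixel range; the trained
    # reverse flow should already land (close to) inside the constraint set.
    return torch.clamp(x_t, x - eps, x + eps).clamp(0.0, 1.0)
```

Note that neither flow queries the target (or surrogate) model, so inference is gradient-free, matching the instance-agnostic setting.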

Key Designs

  1. Forward Flow:

    • Function: Maps clean images to an intermediate perturbed distribution.
    • Mechanism: Uses the pretrained diffusion model's velocity function \(v_\phi\), integrating via ODE from \(t=0\) to \(t=\tau\).
    • Design Motivation: The pretrained diffusion model inherently generates structured perturbed distributions without additional training.
  2. Reverse Flow:

    • Function: Maps the perturbed distribution to valid adversarial examples.
    • Mechanism: Fine-tunes LoRA to obtain a new velocity function \(v_\theta\), integrating via ODE from \(t=\tau\) to \(t=0\).
    • Optimization Objective: Maximize the attack objective \(j(x) = -\mathrm{CE}(f(x), c)\), i.e., minimize the cross-entropy between the surrogate model \(f\)'s prediction and the target class \(c\).
  3. Cascading Distribution Shift Training:

    • Function: Resolves the inaccessibility of intermediate timestep distributions during training.
    • Mechanism (Algorithm 1): Backtracking from \(t=N\) to \(t=1\), each step first estimates \(\hat{x}_0\), clips it to the constraint range, and updates \(\theta\) via cross-entropy (see the training sketch after this list).
    • Theoretical Guarantee (Theorem 2): Cascading improvement property — updating \(\theta\) at timestep \(t\) does not worsen the cross-entropy at timestep \(t - \delta\) (for sufficiently small \(\delta\)).
    • Design Motivation: Ensures consistency between training and sampling procedures.
  4. Morse Flow Construction (Proposition 1):

    • Core Theory: Under mild assumptions on \(X_\epsilon\) and \(j\), there exists a unique smooth flow \(\Phi\) whose velocity function \(v\) equals \(\alpha(x) \cdot \nabla_x j(x)\) almost everywhere.
    • Significance: Guarantees that the gradient-directed flow improves the attack objective, and that the flow map is a diffeomorphism (restated symbolically after this list).
  5. Dynamic Gradient Clipping and ODE vs. SDE Selection:

    • Estimated \(\hat{x}_0\) is clipped with stop-gradient during training.
    • Cascading ODE outperforms cascading SDE (stochastic terms disrupt the cascade) and stochastic SDE (distribution mismatch), validating the necessity of deterministic trajectories.
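As referenced in item 3, here is a minimal training sketch in the spirit of Algorithm 1 as summarized above, with the stop-gradient clipping from item 5 folded in; the one-step \(\hat{x}_0\) estimate, the Euler discretization, and all names are assumptions, not the paper's exact procedure:

```python
import torch
import torch.nn.functional as F

def clip_with_stop_gradient(x_hat, x_clean, eps=16 / 255):
    # Straight-through l_inf projection: the forward pass is clipped, while
    # the backward pass treats the clip as the identity (stop-gradient).
    x_proj = torch.clamp(x_hat, x_clean - eps, x_clean + eps)
    return x_hat + (x_proj - x_hat).detach()

def cascading_training_step(x_clean, c, v_phi, v_theta, f, optimizer, N=10, tau=0.5):
    """One cascading pass over a batch, backtracking from t=N down to t=1.

    f: white-box surrogate classifier; c: LongTensor of target class indices.
    """
    dt = tau / N
    # Forward flow (frozen, no gradients): push x_clean to X_tau.
    with torch.no_grad():
        x_t = x_clean.clone()
        for i in range(N):
            x_t = x_t + v_phi(i * dt, x_t) * dt
    # Backtrack the reverse flow, updating theta at every timestep.
    for i in range(N, 0, -1):
        t = i * dt
        # One-step Euler estimate of x_0 from the current state (assumed form).
        x0_hat = x_t - v_theta(t, x_t) * t
        x0_hat = clip_with_stop_gradient(x0_hat, x_clean)
        # Targeted objective: minimize CE toward the target class c.
        loss = F.cross_entropy(f(x0_hat), c)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # Advance one reverse step with the freshly updated velocity.
        with torch.no_grad():
            x_t = x_t - v_theta(t, x_t) * dt
    return x_t
```

Per Theorem 2, each update at timestep \(t\) should not degrade the objective at the next (smaller) timestep, which is what makes this step-by-step backtracking sound.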
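Restating item 4 symbolically (a paraphrase of the prose above, not the paper's exact statement, and assuming a nonnegative scaling \(\alpha\)): the flow \(\Phi_t\) solves

\[
\frac{\mathrm{d}}{\mathrm{d}t}\,\Phi_t(x) = v\big(\Phi_t(x)\big), \qquad v(x) = \alpha(x)\,\nabla_x j(x) \ \text{a.e.},
\]

so along trajectories \(\frac{\mathrm{d}}{\mathrm{d}t}\, j(\Phi_t(x)) = \alpha(\Phi_t(x))\, \lVert \nabla_x j(\Phi_t(x)) \rVert^2 \ge 0\); following the flow never decreases the attack objective.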

Loss & Training

  • Cross-entropy loss \(\mathrm{CE}(f(\hat{x}_0), c)\)
  • \(\ell_\infty \leq 16/255\) perturbation constraint
  • LoRA fine-tuning to minimize additional parameter count
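For the last bullet, attaching LoRA adapters with the `peft` library freezes the pretrained weights and trains only low-rank updates. A toy sketch (the real target modules depend on the diffusion backbone's layer names, which are not specified here):

```python
import torch.nn as nn
from peft import LoraConfig, get_peft_model

# Stand-in for the diffusion velocity network; the real backbone is a UNet
# whose attention projections would be the usual LoRA targets.
class TinyVelocityNet(nn.Module):
    def __init__(self, dim=64):
        super().__init__()
        self.to_q = nn.Linear(dim, dim)
        self.to_v = nn.Linear(dim, dim)

    def forward(self, x):
        return self.to_v(self.to_q(x))

lora_config = LoraConfig(
    r=8,                              # adapter rank (hypothetical value)
    lora_alpha=16,                    # scaling factor (hypothetical value)
    target_modules=["to_q", "to_v"],  # layer names; backbone-dependent
)
v_theta = get_peft_model(TinyVelocityNet(), lora_config)
v_theta.print_trainable_parameters()  # only the LoRA weights require gradients
```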

Key Experimental Results

Main Results: Multi-Target Attack Success Rate (%) — Normally Trained Models

| Source Model | Method | Inc-v3* | Inc-v4 | Res-152 | DN-121 | VGG-16 | Black-box Avg. |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Inc-v3 | C-GSP | 93.40 | 66.90 | 41.60 | 46.40 | 45.00 | 51.08 |
| Inc-v3 | CGNC | 96.03 | 59.43 | 42.48 | 62.98 | 52.54 | 52.80 |
| Inc-v3 | Dual-Flow | 90.08 | 77.19 | 77.06 | 82.64 | 67.09 | 73.96 |

(* = white-box model.)

Attack Success Rate (%) on Defended Models — Source Model: Inc-v3

| Method | Inc-v3_adv | IR-v2_ens | Res50_SIN | Res50_Aug | Avg. |
| --- | --- | --- | --- | --- | --- |
| C-GSP | 20.41 | 18.04 | 6.96 | 21.95 | 24.28 |
| CGNC | 24.36 | 22.54 | 8.85 | 22.85 | 28.60 |
| Dual-Flow | 51.54 | 55.62 | 45.86 | 67.56 | 62.28 |

Key Findings

  • Black-box transfer success rates improve substantially: Inc-v3 → Res-152 rises from 42.48% (CGNC) to 77.06%, an absolute gain of 34.58 percentage points.
  • The advantage on defended models is even larger: an average success rate of 62.28% vs. 28.60% for CGNC (+33.68 points).
  • Compared to single-target attacks, the multi-target variant underperforms by only ~3%, while eliminating the need to train separate models for each target class.
  • Cascading ODE substantially outperforms cascading SDE and stochastic SDE, confirming the necessity of deterministic trajectories.

Highlights & Insights

  • This is the first work to apply flow-based ODE velocity training to adversarial attacks (as opposed to conventional score function training), opening a new direction for diffusion models in the security domain.
  • The cascading distribution shift training is elegantly designed — by first integrating forward then optimizing backward step by step, it ensures training-inference consistency with theoretical guarantees.
  • LoRA fine-tuning enables adversarial adaptation with minimal additional parameters, making deployment practical.

Limitations & Future Work

  • White-box access to the surrogate model is required during training; transferability to target models operates in the black-box setting.
  • Experiments are conducted solely on ImageNet classification; extension to downstream tasks such as detection and segmentation remains unexplored.
  • The perturbation constraint is fixed at \(\ell_\infty \leq 16/255\); alternative constraints or smaller perturbation budgets are not explored.
  • The selection of the forward flow timestep \(\tau\) may require careful tuning.

Comparison with Prior Methods

  • vs. CGNC (2024): Both are multi-target conditional generative attacks, but CGNC employs a UNet-GAN whereas Dual-Flow uses a diffusion ODE + LoRA; Dual-Flow's black-box transfer rates are higher by more than 20 percentage points on average.
  • vs. C-GSP: Also a generative method, but it achieves lower transfer rates than both CGNC and Dual-Flow.

Rating

  • Novelty: ⭐⭐⭐⭐ First application of flow-based velocity training to multi-target adversarial attacks; the cascading training approach is methodologically innovative.
  • Experimental Thoroughness: ⭐⭐⭐⭐ Covers normal and defended models, multi- and single-target settings, and ODE vs. SDE comparisons.
  • Writing Quality: ⭐⭐⭐⭐ Theory and experiments are well integrated, with clear intuitive explanations.
  • Value: ⭐⭐⭐⭐ Significantly advances the state of the art in multi-target transfer attacks, with important implications for model robustness evaluation.