# Revisiting Adversarial Patch Defenses on Object Detectors: Unified Evaluation, Large-Scale Dataset, and New Insights
- Conference: ICCV 2025
- arXiv: 2508.00649
- Code: https://github.com/Gandolfczjh/APDE
- Area: Adversarial Robustness / Object Detection
- Keywords: Adversarial Patch Defense, Object Detection, Benchmark Evaluation, Large-Scale Dataset, Adaptive Attack
## TL;DR
This paper systematically revisits 11 adversarial patch defense methods, establishes the first patch defense benchmark covering 13 attacks, 11 detectors, and 4 metrics, and constructs the large-scale APDE dataset of 94,000 images. It distills three key insights: the difficulty of defending against natural adversarial patches stems from data distribution rather than high-frequency components; patch detection accuracy is inconsistent with defense performance; and adaptive attacks can circumvent most existing defenses.
## Background & Motivation
Adversarial patch attacks pose a significant security threat to DNNs in the physical world, particularly in scenarios such as pedestrian detection and autonomous driving. Numerous defense methods have been proposed in recent years; however, existing evaluations suffer from four major issues:
Lack of a unified framework: Different papers employ inconsistent hyperparameters, attack methods, and patch placement strategies, preventing fair comparison.
Inappropriate metrics: Some works evaluate defense performance solely via patch detection accuracy, yet high detection accuracy does not imply effective defense.
Incomplete analysis: Key factors such as inference latency, the impact of varying patch sizes and types, and physical-world applicability are often neglected.
Insufficient datasets: Existing patch datasets are small in scale and lack coverage organized by attack method and target detector.
## Method
### Overall Architecture
A complete evaluation pipeline is established: attack generation → dataset construction → unified evaluation → comprehensive analysis. The core contributions are the APDE dataset and the benchmark evaluation framework.
### Key Designs
- APDE Dataset Construction:
- White-box attacks are performed using 13 attack methods against 11 detectors, generating 94 distinct patches.
- Patches are applied to the INRIA-Person and MS COCO test sets, yielding 94,000 images in total.
- The dataset is split into 56,400 training images and 37,600 test images (6:4 ratio).
- For comparison, existing datasets are far smaller: Apricot (60 patch types, 1,011 images) and GAP (25 types, 9,266 images).
- Advantages: large scale, diverse patch distribution, and white-box setting (worst-case evaluation).
- Evaluation Metric System (a minimal metric sketch follows this list):
- AP@0.5: Average precision on attacked targets, directly reflecting defense effectiveness.
- ASR (Attack Success Rate): Measures the residual effect of the attack.
- mIoU (SmIoU / NmIoU): A substitute for patch AP@0.5, applicable to irregularly shaped patches.
- Inference Time: Evaluates computational practicality.
- Core finding: AP on attacked targets reflects true defense capability more faithfully than patch detection accuracy.
- Defense Method Taxonomy and Evaluation:
- Patch detection/segmentation-based: SAC, PAD, Adyolo, NAPGuard.
- Patch prior knowledge-based: LGS, Zmask, Jedi.
- Generative model-based: DIFFender, NutNet.
- Certified defenses: DetectorGuard, ObjectSeeker.
- Both hiding attacks and appearance attacks are covered.
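The metric sketch referenced above: a minimal ASR computation for hiding attacks, assuming axis-aligned boxes in `[x1, y1, x2, y2]` format. Helper names and thresholds here are illustrative, not taken from the APDE codebase; AP@0.5 itself follows the standard detection AP protocol.

```python
import numpy as np

def iou(box_a: np.ndarray, box_b: np.ndarray) -> float:
    """IoU of two [x1, y1, x2, y2] boxes."""
    x1, y1 = np.maximum(box_a[:2], box_b[:2])
    x2, y2 = np.minimum(box_a[2:], box_b[2:])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return float(inter / (area_a + area_b - inter + 1e-9))

def attack_success_rate(gt_boxes, pred_boxes, pred_scores,
                        iou_thresh=0.5, score_thresh=0.5):
    """Fraction of attacked ground-truth persons the detector misses.

    A GT box counts as hidden (attack success) when no prediction with
    score >= score_thresh overlaps it with IoU >= iou_thresh.
    """
    kept = [b for b, s in zip(pred_boxes, pred_scores) if s >= score_thresh]
    hidden = sum(
        1 for gt in gt_boxes
        if not any(iou(gt, p) >= iou_thresh for p in kept)
    )
    return hidden / max(len(gt_boxes), 1)
```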
### Loss & Training
Adversarial patch generation follows the general objective: \(\delta^* = \arg\min_\delta \mathbb{E}_{x \sim X}[\mathcal{L}(f_i(\mathcal{A}(x, \delta, t)), y)] + \lambda L_{tv}(\delta)\), where \(f_i\) is the target detector, \(\mathcal{A}(x, \delta, t)\) pastes patch \(\delta\) onto image \(x\) under a random transform \(t\), \(\mathcal{L}\) is the attack objective (e.g., objectness suppression for hiding attacks), and \(L_{tv}\) denotes a total variation loss that promotes smoother, more printable patches.
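A minimal PyTorch sketch of this objective for a hiding attack: `detector_loss` and `apply_patch` are placeholder callables standing in for the attack loss \(\mathcal{L}\) and the patch applier \(\mathcal{A}\) with random transforms \(t\); the hyperparameters are illustrative, not the paper's.

```python
import torch

def optimize_patch(detector_loss, images, labels, apply_patch,
                   patch_size=(3, 300, 300), steps=1000, lr=0.03,
                   tv_weight=2.5):
    """EoT-style patch optimization for the objective above (a sketch).

    detector_loss(patched, labels) -> attack loss L (e.g., max objectness
    for a hiding attack); apply_patch(x, delta) -> images with the patch
    pasted under a random transform t (scale, rotation, placement).
    """
    delta = torch.rand(patch_size, requires_grad=True)  # patch in [0, 1]
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        patched = apply_patch(images, delta.clamp(0, 1))
        # Total variation term (L_tv) encourages smooth, printable patches.
        tv = (delta[:, :, 1:] - delta[:, :, :-1]).abs().mean() \
           + (delta[:, 1:, :] - delta[:, :-1, :]).abs().mean()
        loss = detector_loss(patched, labels) + tv_weight * tv
        opt.zero_grad()
        loss.backward()
        opt.step()
    return delta.detach().clamp(0, 1)
```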
## Key Experimental Results
### Main Results
Defense performance against hiding attacks across 11 detectors (Person AP@0.5):
| Defense Method | Type | Overall Mean | Overall Min | Inference Time (ms) |
|---|---|---|---|---|
| w/o defense | - | 30.74 | - | - |
| SAC | Patch Segmentation | 60.88 | 20.62 | 44 |
| PAD | Patch Segmentation | 76.12 | 40.99 | 32,100 |
| Adyolo | Patch Detection | 63.12 | 26.59 | 62 |
| NAPGuard | Patch Detection | 75.94 | 46.93 | 59 |
| DIFFender | Generative Model | 56.23 | 11.98 | 1,240 |
| NutNet | Generative Model | 76.53 | 55.79 | 71 |
| LGS | Prior Knowledge | 71.58 | 29.55 | 82 |
| Zmask | Prior Knowledge | 56.71 | 6.43 | 417 |
| Jedi | Prior Knowledge | 58.85 | 18.96 | 349 |
### Ablation Study
Comparison of defense performance before and after retraining with the APDE dataset (average AP@0.5 on YOLOv3 + FRCNN):
| Attack Method | SAC (Original) | SAC (Retrained) | NAPGuard (Original) | NAPGuard (Retrained) |
|---|---|---|---|---|
| T-SEA | 51.82 | 71.61 | 83.61 | 86.31 |
| TC-EGA | 58.16 | 71.36 | 68.51 | 85.30 |
| AdvPatch | 56.53 | 73.29 | 78.45 | 85.10 |
| GNAP (Natural Patch) | 70.03 | 76.86 | 78.96 | 85.42 |
| AdvCloak (Out-of-domain) | 4.17 | 71.29 | 52.21 | 73.16 |
| AdvTshirt (Out-of-domain) | 34.27 | 64.47 | 50.21 | 70.89 |
Retraining yields an average gain of 15.09 AP@0.5 points, with particularly large improvements on out-of-domain patches.
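As a rough illustration of what retraining on APDE involves, here is a generic fine-tuning loop for a patch segmentation model. The actual SAC and NAPGuard recipes differ; `model` (per-pixel patch logits) and `apde_loader` (image, patch-mask pairs) are assumed placeholders.

```python
import torch

def finetune_patch_segmenter(model, apde_loader, epochs=5, lr=1e-4):
    """Adapt a pretrained patch segmenter to the APDE patch distribution.

    BCE over per-pixel patch masks is one plausible choice; the defenses
    evaluated in the paper each use their own training objective.
    """
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    bce = torch.nn.BCEWithLogitsLoss()
    for _ in range(epochs):
        for images, masks in apde_loader:
            loss = bce(model(images), masks)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model
```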
### Key Findings
- The difficulty of defending against natural patches stems from data distribution, not high-frequency components: The high-frequency components of NAP (natural adversarial patches) and non-NAP patches differ only marginally, whereas their FID distances are substantially larger. Defense methods fundamentally rely on data distribution to decide whether a pixel belongs to a patch region (see the spectral-energy sketch after this list).
- Patch detection accuracy ≠ defense effectiveness: NAPGuard achieves the highest detection accuracy yet performs worse than NutNet in terms of defense; AP reflects defense performance more faithfully than mIoU.
- Adaptive attacks can bypass most defenses: PAD (utilizing the complex SAM model) and DIFFender (leveraging stochastic diffusion) exhibit greater robustness; Zmask and Jedi, which exploit universal patch properties (feature over-activation and high entropy), also demonstrate relative robustness.
- Physical-world defense is effective: Methods that perform well in the digital domain generally transfer to the physical world; increased distance and enhanced illumination tend to benefit defense.
- Multi-patch scenarios: The performance of certified defenses degrades more gradually as the number of patches increases, but computational cost grows exponentially.
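The first finding can be probed with a simple spectral measurement: the share of a patch's energy in high frequencies, which the paper reports to be similar for NAP and non-NAP patches even as their FID distances diverge. A minimal numpy sketch; the radial cutoff is an illustrative choice.

```python
import numpy as np

def high_freq_energy_ratio(img: np.ndarray, cutoff: float = 0.25) -> float:
    """Share of spectral energy above a radial frequency cutoff.

    Per the paper's finding, NAP and non-NAP patches score similarly on
    measures like this, while their distributional (FID) distances to
    natural images differ widely.
    """
    gray = img.mean(axis=-1) if img.ndim == 3 else img
    spec = np.abs(np.fft.fftshift(np.fft.fft2(gray))) ** 2
    h, w = spec.shape
    yy, xx = np.mgrid[:h, :w]
    radius = np.hypot(yy - h / 2, xx - w / 2) / (min(h, w) / 2)
    high = spec[radius > cutoff].sum()
    return float(high / (spec.sum() + 1e-12))
```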
## Highlights & Insights
- First systematic patch defense benchmark: Unifies the evaluation paradigm, resolving the long-standing issue of incomparable results across papers.
- Novel data-distribution perspective: Challenges the prevailing belief that high-frequency features are the primary reason natural adversarial patches are difficult to defend against.
- Practicality-oriented: Beyond evaluation, the APDE dataset can be directly leveraged to improve the performance of existing defenses.
- NutNet achieves the best overall performance: It offers the best combination of defense effectiveness, inference speed, and robustness.
## Limitations & Future Work
- The study primarily focuses on pedestrian detection; generalizability to other object categories remains to be verified.
- Certified defenses are constrained by strict threat model assumptions (patch count and size), limiting practical applicability.
- Physical-world experiments were conducted exclusively with an iPhone 16 Pro, offering limited sensor diversity.
- More complex scenarios such as 3D adversarial attacks and inter-frame consistency in video are not covered.
## Related Work & Insights
- The data distribution perspective can inspire the design of novel defense methods, such as using distributional distance to identify patch regions (a speculative sketch follows this list).
- The success of generative model-based defenses (NutNet, DIFFender) suggests that diffusion model-based image inpainting and denoising may constitute a promising defense paradigm.
- The dataset construction methodology of APDE can be generalized to other adversarial robustness research domains.
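As one concrete reading of the distributional-distance idea above, a defense could flag spatial cells whose features lie far from a natural-image feature distribution, e.g., via Mahalanobis distance. This is a speculative sketch, not a method from the paper; `mu` and `cov_inv` would be estimated offline from clean images.

```python
import numpy as np

def flag_patch_regions(features: np.ndarray, mu: np.ndarray,
                       cov_inv: np.ndarray, thresh: float) -> np.ndarray:
    """Flag (H, W) feature-map cells whose Mahalanobis distance to the
    natural-image feature distribution (mean mu, inverse covariance
    cov_inv, both over C channels) exceeds thresh."""
    diff = features - mu                       # (H, W, C) minus (C,)
    d2 = np.einsum('hwc,cd,hwd->hw', diff, cov_inv, diff)
    return d2 > thresh                         # boolean patch-region mask
```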
## Rating
- Novelty: ⭐⭐⭐⭐ First unified benchmark; the data distribution findings are of substantial value.
- Experimental Thoroughness: ⭐⭐⭐⭐⭐ 11 defenses × 13 attacks × 11 detectors — extremely comprehensive.
- Writing Quality: ⭐⭐⭐⭐ Clear structure, in-depth analysis, and convincing findings.
- Value: ⭐⭐⭐⭐⭐ The dataset and benchmark represent a significant contribution to the field.