Attribution as Retrieval: Model-Agnostic AI-Generated Image Attribution¶
Conference: CVPR 2026 arXiv: 2603.10583 Code: Available Area: Diffusion Models / Image Generation Keywords: AIGC Attribution, Instance Retrieval, Low-Bit Plane Fingerprint, Few-Shot Learning, Deepfake Detection
TL;DR¶
This paper reframes AI-generated image attribution from a classification paradigm to an instance retrieval paradigm, proposing the LIDA framework. It extracts generator-specific fingerprints from RGB low-bit planes as input, and achieves open-set attribution via unsupervised pre-training on real images followed by few-shot adaptation. LIDA achieves average Rank-1 accuracies of 40.4%/77.5% on GenImage and WildFake under the 1-shot setting, substantially outperforming existing methods.
Background & Motivation¶
The rapid development of AIGC technology has led to a proliferation of image generators (SD, FLUX, Midjourney, etc.), posing serious challenges to the authenticity and provenance of digital media.
Existing attribution methods fall into two categories:

- Generative watermarking: embedding invisible watermarks during image generation (e.g., Tree-Ring, Gaussian Shading). This achieves high accuracy but requires full access to and modification of the generative model, and cannot generalize across different generators.
- AI-generated image attribution: operating independently of the generation process. However, nearly all existing methods treat attribution as a classification problem: closed-set methods assume all generators are known at training time and cannot adapt to newly emerging models; open-set methods attempt to handle unknown generators but still rely on large amounts of labeled or unlabeled AI-generated images for training, limiting their flexibility.
- Background: Rapid iteration of generative models vs. the heavy retraining required by attribution methods.
- Key Challenge: A general framework is needed that is independent of generative models and can adapt to new generators with only a handful of registration samples. The key insight of this paper is to redefine attribution as a retrieval problem: only a strong feature encoder needs to be trained, and a new generator can be registered by simply adding a few exemplar images to the database.
Method¶
Overall Architecture¶
LIDA consists of three modules: (1) low-bit fingerprint generation → (2) unsupervised pre-training → (3) few-shot attribution adaptation. A registration database \(\mathcal{D}\) of AI-generated images is maintained; at inference time, attribution is performed by retrieving the nearest neighbor via feature similarity.
Key Designs¶
- Low-Bit Fingerprint Generation:
  - Function: Extracts generator-specific structured noise fingerprints from RGB images.
  - Mechanism: Bit-plane decomposition is applied to each channel, \(\mathbf{x}_c = \sum_{k=0}^{7} 2^k \cdot \mathbf{b}_c^k\); the lowest 3 bit planes are combined and binarized: \(\tilde{\mathbf{x}}_c = 255 \cdot \text{sgn}\left(\sum_{k=0}^{2} 2^k \cdot \mathbf{b}_c^k\right)\). This yields a fingerprint image that discards most image content while retaining generator-specific structural noise.
  - Design Motivation: PCA visualization shows that low-bit fingerprints lead to clear clustering separation across different generators, whereas features from raw RGB images are nearly indistinguishable across generators. Low-bit planes contain model-specific artifacts unintentionally embedded during the generation process.
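The binarization step above reduces to a couple of bitwise operations. A minimal NumPy sketch (the function name is illustrative, not from the paper's code):

```python
import numpy as np

def low_bit_fingerprint(img: np.ndarray) -> np.ndarray:
    """Per-channel low-bit fingerprint: keep the 3 lowest bit planes and
    binarize, i.e. x~_c = 255 * sgn(sum_{k=0}^{2} 2^k * b_c^k).

    img: uint8 array of shape (H, W, 3).
    """
    low = img & 0b111                      # combine the 3 lowest bit planes
    return np.where(low > 0, 255, 0).astype(np.uint8)  # binarize to {0, 255}
```

Because only masking and a comparison are involved, this matches the paper's observation that fingerprint extraction adds negligible inference overhead.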
- Unsupervised Pre-Training:
  - Function: Pre-trains an encoder on low-bit fingerprints of large-scale real images (ImageNet) to learn general noise structure representations.
  - Mechanism: A ResNet-50 backbone is used (with early-layer downsampling removed to preserve spatial information), and ImageNet classification serves as the proxy task: \(\mathcal{L}_P = -\sum_{b=1}^{B}\sum_{c=1}^{C} s_b^c \log q_b^c\)
  - Design Motivation: Pre-training on real image fingerprints provides robust weight initialization; the learned intrinsic noise structures transfer to AI-generated image forensics.
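The proxy objective \(\mathcal{L}_P\) is a standard softmax cross-entropy over one-hot labels \(s\). A minimal NumPy sketch of the summed form given above (batch averaging and the encoder itself are omitted):

```python
import numpy as np

def pretrain_loss(logits: np.ndarray, labels: np.ndarray) -> float:
    """L_P = -sum_b sum_c s_b^c log q_b^c, with q = softmax(logits)
    and s the one-hot encoding of labels.

    logits: (B, C) float array; labels: (B,) int array.
    """
    z = logits - logits.max(axis=1, keepdims=True)          # numerical stability
    log_q = z - np.log(np.exp(z).sum(axis=1, keepdims=True))  # log-softmax
    return float(-log_q[np.arange(len(labels)), labels].sum())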
- Few-Shot Attribution Adaptation:
  - Function: Fine-tunes the encoder using a small number of samples from the registration database to distinguish between different generators.
  - Mechanism: Two loss functions are jointly optimized:
    - Image attribution loss (center loss, to avoid disrupting the pre-trained feature space structure): \(\mathcal{L}_A = \sum_{i=1}^{m} \|x_i - c_{y_i}\|_2^2\). Class centers \(c_j\) are dynamically updated within each mini-batch.
    - Deepfake detection loss (contrastive loss based on real prototypes): \(\mathcal{L}_D = -\frac{1}{N_r}\sum_{i=1}^{N_r}\log\sigma\left(\frac{\text{sim}(x_i^r, p_r)}{\tau}\right) - \frac{1}{N_f}\sum_{j=1}^{N_f}\log\left(1-\sigma\left(\frac{\text{sim}(x_j^f, p_r)}{\tau}\right)\right)\). Total loss: \(\mathcal{L} = \mathcal{L}_A + \lambda \mathcal{L}_D\)
  - Design Motivation: Center loss is preferred over cross-entropy because CE disrupts the feature space structure learned during pre-training. The two-stage attribution strategy (first detecting whether an image is AI-generated, then attributing it to a specific generator) better reflects real-world workflows.
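The two adaptation losses can be sketched directly from their definitions. A minimal NumPy version, assuming cosine similarity for \(\text{sim}(\cdot,\cdot)\) and omitting the dynamic center updates and mini-batching (function names and the default \(\tau\) are illustrative):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cosine(a, b):
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)

def attribution_loss(feats, labels, centers):
    """Center loss L_A = sum_i ||x_i - c_{y_i}||_2^2 over a mini-batch."""
    return float(((feats - centers[labels]) ** 2).sum())

def detection_loss(real_feats, fake_feats, proto_r, tau=0.1):
    """Prototype contrastive loss L_D: pull real features toward the
    real prototype p_r, push fake features away from it."""
    lr = -np.mean([np.log(sigmoid(cosine(x, proto_r) / tau)) for x in real_feats])
    lf = -np.mean([np.log(1 - sigmoid(cosine(x, proto_r) / tau)) for x in fake_feats])
    return float(lr + lf)
```

The total objective is then `attribution_loss(...) + lam * detection_loss(...)`, matching \(\mathcal{L} = \mathcal{L}_A + \lambda \mathcal{L}_D\).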
Loss & Training¶
- Pre-training: Classification proxy task on ImageNet low-bit fingerprints.
- Adaptation: Center loss (attribution) + prototype contrastive loss (detection), requiring only a small number (1/5/10-shot) of AI-generated samples.
- Inference: Two-stage pipeline — Deepfake detection first, followed by retrieval-based attribution.
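The two-stage inference pipeline amounts to a prototype comparison followed by nearest-neighbor search over the registration database. A minimal NumPy sketch, where the similarity threshold and cosine normalization are illustrative assumptions rather than values from the paper:

```python
import numpy as np

def attribute(query, db_feats, db_labels, proto_r, thresh=0.5):
    """Two-stage inference sketch.

    Stage 1: compare the query's fingerprint feature to the real
    prototype p_r; above the threshold, declare the image real.
    Stage 2: otherwise, attribute it to the generator of its nearest
    neighbor in the registration database D.
    """
    def cos(a, B):
        a = a / (np.linalg.norm(a) + 1e-8)
        B = B / (np.linalg.norm(B, axis=-1, keepdims=True) + 1e-8)
        return B @ a

    if cos(query, proto_r[None])[0] >= thresh:
        return "real"
    return db_labels[int(np.argmax(cos(query, db_feats)))]
```

Registering a new generator then reduces to appending a few exemplar features and their label to `db_feats`/`db_labels`, with no retraining, which is the crux of the retrieval paradigm.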
Key Experimental Results¶
Main Results (GenImage Cross-Generator Attribution)¶
| Shot | Method | Avg Rank-1 | Avg mAP |
|---|---|---|---|
| 1-shot | ResNet | 17.4 | 37.5 |
| 1-shot | DIRE | 14.3 | 34.8 |
| 1-shot | ESSP | 17.0 | 36.0 |
| 1-shot | Ours | 40.4 | 61.5 |
| 10-shot | ResNet | 21.4 | 22.4 |
| 10-shot | DIRE | 17.2 | 28.8 |
| 10-shot | ESSP | 22.4 | 23.0 |
| 10-shot | Ours | 54.0 | 51.6 |
WildFake cross-architecture attribution (1-shot): Ours achieves an average Rank-1 of 77.5% and mAP of 87.7%, far surpassing the second-best method at 37.4%/60.4%.
Ablation Study¶
| Configuration | Key Metric | Description |
|---|---|---|
| RGB input vs. low-bit fingerprint | PCA visualization | Generator features are nearly inseparable in RGB space; low-bit fingerprints yield natural clustering |
| Unsupervised pre-training | Baseline | ImageNet classification proxy task provides robust initialization |
| Center loss vs. CE | Attribution quality | Center loss preserves the pre-trained feature space structure; CE disrupts it |
| Two-stage attribution | Detect then attribute | More reliable than end-to-end classification |
Key Findings¶
- BigGAN achieves 97–100% Rank-1 under the 1-shot setting, indicating its generation fingerprint is highly distinctive.
- Stable Diffusion v1.4/v1.5 versions share similar fingerprints, leading to mutual confusion.
- Low-bit plane fingerprint extraction is computationally trivial (bitwise operations only), adding negligible inference overhead.
- Usable accuracy is achieved with only 1 shot, demonstrating the practical value of the retrieval paradigm under extremely limited samples.
Highlights & Insights¶
- The paradigm shift from classification to retrieval is both natural and highly effective — a new generator requires only a few registered images, with no model retraining.
- The low-bit plane fingerprint is a concise yet powerful prior: it is extracted at near-zero cost yet significantly amplifies feature differences across generators.
- The insight of using center loss instead of CE to protect the pre-trained feature space is a noteworthy design principle.
- The two-stage attribution pipeline (detection before attribution) aligns with real-world workflows and allows each stage to be optimized independently.
Limitations & Future Work¶
- Distinguishing between models from the same family (e.g., SD v1.4/v1.5) remains limited and calls for more fine-grained fingerprinting.
- The robustness of low-bit planes to image compression (JPEG) is insufficiently discussed, as compression can corrupt low-bit information.
- The registration database requires manual maintenance; automatically discovering and registering new generators remains an open problem.
- Evaluation is conducted only at the image level; attribution for partially generated regions (e.g., inpainting) is not addressed.
Related Work & Insights¶
- Unlike watermarking methods such as Tree-Ring and Gaussian Shading, which require modifying the generative model, this method is entirely post hoc.
- The approach is conceptually analogous to camera fingerprinting (PRNU), effectively transferring the notion of a "camera fingerprint" to a "generator fingerprint."
- Low-bit plane analysis can be combined with frequency-domain methods to further enhance discriminability.
- The retrieval paradigm is extensible to provenance tracing for AI-generated video and audio.
Rating¶
- Novelty: ⭐⭐⭐⭐ The paradigm shift from classification to retrieval is clear and compelling, and the low-bit fingerprint is an effective prior; however, no single technical component is entirely novel.
- Experimental Thoroughness: ⭐⭐⭐⭐⭐ Comprehensive evaluation across two large-scale datasets, cross-generator and cross-architecture settings, and 1/5/10-shot configurations.
- Writing Quality: ⭐⭐⭐⭐ The problem formulation is clear and the pipeline design is concise.
- Value: ⭐⭐⭐⭐ The scalability of the retrieval paradigm and its low sample requirements offer practical value for AIGC security.