De-Anonymization at Scale via Tournament-Style Attribution¶
Conference: ACL 2026 · arXiv: 2601.12407 · Code: None · Area: AI Safety / Privacy · Keywords: Authorship Attribution, De-Anonymization, LLM Privacy Threat, Tournament-Style Matching, Peer Review
TL;DR¶
This paper proposes DAS (De-Anonymization at Scale), an LLM-based authorship de-anonymization method that combines dense retrieval pre-filtering, tournament-style elimination, and multi-round voting aggregation to match authors across candidate pools of tens of thousands of texts. The results highlight the privacy threat LLMs pose to anonymous platforms such as double-blind peer review.
Background & Motivation¶
Background: Traditional authorship attribution (AA) has been studied in closed-set, small-scale settings—given a small number of candidate authors and labeled samples, a classifier is trained for attribution. However, real-world anonymous systems (e.g., academic peer review) may involve tens of thousands of candidates with no labeled data.
Limitations of Prior Work: (1) Traditional methods are infeasible at large scale, as they require constructing an author profile for each candidate; (2) recent work employing GPT-3/4 for authorship attribution remains limited to small candidate sets. Meanwhile, the text-analysis capabilities of modern LLMs may render large-scale de-anonymization a realistic threat, one that prior work has not assessed.
Key Challenge: Anonymous systems (e.g., double-blind peer review, whistleblower forums) rely on identity concealment to protect fairness and safety, yet LLMs may identify anonymous authors by analyzing writing style, domain expertise, and other signals.
Goal: To develop an LLM-based author matching method that operates practically over candidate pools of tens of thousands of texts, and to assess the degree of threat it poses to anonymous systems.
Key Insight: Large-scale author matching is framed as a tournament-style elimination competition—candidates are randomly grouped, and the LLM selects the most probable match within each group; winners advance to the next round, ultimately producing a ranked list.
Core Idea: Progressive elimination + dense retrieval pre-filtering + multi-round voting aggregation = large-scale de-anonymization within a constrained token budget.
Method¶
Overall Architecture¶
DAS comprises three components: (1) dense retrieval pre-filtering—embedding-based retrieval reduces a candidate pool of \(10^5\) to \(10^3\); (2) tournament-style elimination—candidates are divided into fixed-size groups, the LLM selects the most probable match within each group, and winners are re-grouped and iteratively compared until a top-\(k\) ranking is produced; (3) multi-round voting aggregation—multiple independent runs (with different random groupings) accumulate scores for each candidate's wins, and the aggregated scores yield the final ranking.
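As a rough illustration of stage (1), the sketch below implements embedding-based pre-filtering with cosine similarity. The embedding model is a placeholder (the summary does not specify which model the paper uses), so `embed` here simply produces random vectors for demonstration.

```python
# Illustrative sketch of the dense-retrieval pre-filter (stage 1 of DAS).
# `embed` is a stand-in: a real system would call a sentence-embedding model.
import numpy as np

def embed(texts, dim=64, seed=0):
    # Placeholder embedding: random Gaussian vectors, one per text.
    rng = np.random.default_rng(seed)
    return rng.normal(size=(len(texts), dim))

def prefilter(query_vec, candidate_vecs, top_n=1000):
    """Return indices of the top-N candidates by cosine similarity."""
    q = query_vec / np.linalg.norm(query_vec)
    c = candidate_vecs / np.linalg.norm(candidate_vecs, axis=1, keepdims=True)
    sims = c @ q                       # cosine similarity to the query
    return np.argsort(sims)[::-1][:top_n]

# Shrink a mock pool of 10,000 candidates to 100 for the tournament stage.
cands = embed([f"candidate {i}" for i in range(10_000)])
query = embed(["anonymous query text"], seed=1)[0]
shortlist = prefilter(query, cands, top_n=100)
print(len(shortlist))  # 100
```

In a real deployment the top-\(N\) cutoff trades recall (the true author must survive pre-filtering) against the token budget of the tournament stage.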
Key Designs¶
- Tournament-Style Progressive Elimination:
- Function: Decomposes one-to-many matching into multiple rounds of small-scale comparisons.
- Mechanism: Candidates are randomly divided into fixed-size groups (e.g., groups of 5); the LLM compares the query text against all candidates within a group and selects the most probable match. Winners are re-grouped in the next round, and this process repeats until convergence to top-\(k\).
- Design Motivation: LLMs have limited context windows and cannot compare tens of thousands of candidates simultaneously; group-wise comparison keeps each prompt small and requires only a logarithmic number of elimination rounds.
- Dense Retrieval Pre-Filtering:
- Function: Reduces the search space to a scale manageable by LLMs.
- Mechanism: An embedding model encodes the query and all candidates; vector similarity retrieval selects the top-\(N\) (e.g., 1,000) candidates as input to the subsequent tournament.
- Design Motivation: Reduces the search space from \(10^5\) to \(10^3\), making subsequent LLM comparisons feasible within the token budget.
- Multi-Round Voting Aggregation:
- Function: Improves ranking stability and precision.
- Mechanism: The tournament is run multiple times independently (with different random groupings); each run assigns scores to winning candidates, and scores across all rounds are aggregated to produce the final ranking. Candidates that consistently win across different groupings receive higher rankings.
- Design Motivation: A single random grouping may introduce bias due to uneven within-group competition; multi-round aggregation increases robustness.
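The tournament and voting designs above can be sketched together as follows. This is an illustrative reconstruction, not the authors' code: `pick_winner` stands in for the LLM call that selects the most probable match within a group, and the scoring scheme (one point per surviving candidate per run) is an assumption.

```python
# Illustrative sketch of tournament elimination + multi-round voting.
import random
from collections import Counter

def pick_winner(query, group):
    # Placeholder for the LLM group-comparison prompt; here a cheap
    # deterministic proxy picks one candidate, purely for demonstration.
    return min(group, key=lambda c: abs(hash(c) - hash(query)) % 997)

def tournament(query, candidates, group_size=5, top_k=1, rng=None):
    """Run one elimination tournament until only top_k candidates remain."""
    rng = rng or random.Random()
    pool = list(candidates)
    while len(pool) > top_k:
        rng.shuffle(pool)  # fresh random grouping each round
        groups = [pool[i:i + group_size]
                  for i in range(0, len(pool), group_size)]
        pool = [pick_winner(query, g) for g in groups]  # winners advance
    return pool

def das_rank(query, candidates, runs=10, **kw):
    """Aggregate wins over independent runs with different groupings."""
    votes = Counter()
    for seed in range(runs):
        for w in tournament(query, candidates, rng=random.Random(seed), **kw):
            votes[w] += 1
    return [c for c, _ in votes.most_common()]  # final ranking

ranking = das_rank("query", [f"cand{i}" for i in range(200)], runs=5)
```

Candidates that win under many different random groupings accumulate votes, which is what makes the aggregated ranking more stable than any single tournament.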
Loss & Training¶
DAS is a training-free, inference-time method that relies solely on the text-analysis capabilities of pretrained LLMs; no loss function or fine-tuning is involved. The core computation consists of group-wise comparison prompts issued to the LLM.
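A hypothetical shape for such a group-comparison prompt is shown below; the paper's exact prompt wording is not reproduced here, and the instruction text is an assumption.

```python
# Hypothetical group-comparison prompt builder (not the paper's wording).
def build_prompt(query_text, group):
    numbered = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(group))
    return (
        "You are given an anonymous QUERY text and a numbered list of "
        "CANDIDATE texts. Based on writing style, vocabulary, and domain "
        "expertise, answer with the number of the single candidate most "
        "likely written by the same author as the query.\n\n"
        f"QUERY:\n{query_text}\n\nCANDIDATES:\n{numbered}\n\nAnswer:"
    )

p = build_prompt("anonymized excerpt ...", ["text A", "text B", "text C"])
```

The LLM's numeric answer selects the group winner, which then advances to the next tournament round.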
Key Experimental Results¶
Main Results¶
De-Anonymization Performance on Anonymous Review Data
| Setting | Candidate Pool Size | DAS Accuracy | Random Baseline |
|---|---|---|---|
| Peer Review | Thousands | Far above random | ~0.01% |
| Enron Email | Standard benchmark | Outperforms prior methods | - |
| Blog Posts | Large scale | Outperforms prior methods | - |
Ablation Study¶
| Component | Effect of Removal | Notes |
|---|---|---|
| Dense retrieval pre-filtering | System becomes infeasible | Candidate pool too large |
| Multi-round voting | Accuracy decreases | Single round is unstable |
| Tournament elimination | Accuracy decreases | Progressive comparison is necessary |
Key Findings¶
- DAS successfully identifies co-authored texts among thousands of candidates in anonymous review data, with accuracy far exceeding random chance.
- DAS outperforms prior direct LLM prompting methods on standard benchmarks (Enron, blogs).
- Multi-round voting substantially improves ranking precision and stability.
- Dense retrieval pre-filtering serves not only as an efficiency measure but also improves subsequent matching quality by narrowing the candidate pool.
Highlights & Insights¶
- The paper reveals a serious privacy threat—LLMs make large-scale de-anonymization practically feasible.
- The tournament-style design elegantly resolves the computational bottleneck of large-scale one-to-many matching.
- The methodology is generalizable and can be applied to any text attribution scenario requiring matching from a large candidate pool.
Limitations & Future Work¶
- Although accuracy is well above random, it remains limited and may be insufficient to constitute a practical threat in certain settings.
- The recall quality of dense retrieval may constrain final accuracy.
- As a potential privacy attack tool, corresponding defense mechanisms and ethical discussions are warranted.
- The method may have limited discriminative power among stylistically similar authors (e.g., members of the same research group).
Related Work & Insights¶
- vs. Huang et al. (2024a): Prior work uses GPT for small-scale attribution; DAS scales to tens of thousands of candidates.
- vs. Traditional AA: Traditional methods require labeled data and small candidate sets; DAS is fully zero-shot and large-scale.
- vs. Stylometry: DAS leverages the implicit stylistic analysis capabilities of LLMs without requiring explicit feature engineering.
Rating¶
- Novelty: ⭐⭐⭐⭐ The tournament-style large-scale attribution design is novel, and the privacy threat perspective is important.
- Experimental Thoroughness: ⭐⭐⭐⭐ Real review data and standard benchmarks are used, though the scale of the anonymous review experiments could be larger.
- Writing Quality: ⭐⭐⭐⭐ Motivation is clear and the method is described systematically.
- Value: ⭐⭐⭐⭐ Practically significant for evaluating the security of anonymous systems.