🔎 AIGC Detection
💬 ACL2026 · 17 paper notes
📌 Same area in other venues: 📷 CVPR2026 (7) · 🔬 ICLR2026 (30) · 🧪 ICML2026 (11) · 🤖 AAAI2026 (2) · 🧠 NeurIPS2025 (9)
🔥 Top topics: LLM ×8
- AEGIS: A Holistic Benchmark for Evaluating Forensic Analysis of AI-Generated Academic Images
-
AEGIS is the first comprehensive benchmark for academic image forgery forensics, covering 7 major academic image categories with 39 subcategories, 4 forgery strategies (entirely fabricated, reference-based rewriting, local inpainting, and local editing), and 25 generative models. It proposes four tasks: forgery scope discrimination, text artifact recognition, manipulation type classification, and tampered pixel localization. An evaluation of 25 MLLMs and 9 expert models reveals a structural complementarity between the two groups: even GPT-5.1 reaches an overall score of only 48.80%, and expert models reach a pixel IoU of only 30.09%. The results highlight that "generation evolves faster than forensics" and expose a trade-off between MLLM reasoning and expert-model sensitivity.
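For readers unfamiliar with the pixel IoU figure quoted above, the following is a minimal sketch of the standard intersection-over-union between a predicted and a ground-truth tamper mask. The note does not say exactly how AEGIS aggregates the metric, so the function name `pixel_iou` and the convention for two empty masks are assumptions made for illustration.

```python
def pixel_iou(pred, target):
    """IoU between two binary masks given as equally sized 2-D lists of 0/1.

    Assumption: two empty masks count as a perfect match (IoU = 1.0).
    """
    intersection = 0
    union = 0
    for row_pred, row_target in zip(pred, target):
        for p, t in zip(row_pred, row_target):
            if p and t:
                intersection += 1  # pixel flagged as tampered in both masks
            if p or t:
                union += 1         # pixel flagged as tampered in at least one mask
    return intersection / union if union else 1.0


# Toy 2x2 masks: one pixel overlaps, three pixels are flagged in total.
predicted = [[1, 1], [0, 0]]
ground_truth = [[1, 0], [1, 0]]
print(pixel_iou(predicted, ground_truth))
```

A score near 0.30 therefore means that less than a third of the pixels flagged by either mask are flagged by both.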
- Authorship Attribution in Multilingual Machine-Generated Texts
-
Existing research on machine-generated text authorship attribution (identifying which specific LLM or human produced a text) is almost entirely monolingual (primarily English). This paper is the first to formally define Multilingual Authorship Attribution (ML-MGT) and Cross-Lingual Authorship Attribution (CL-MGT). Through a systematic evaluation of 18 languages \(\times\) 8 generators (7 LLMs + human) using statistical methods, fine-tuned encoders, contrastive learning, and fine-tuned decoders, it finds that while fine-tuned/contrastive methods adapt well to multiple languages (best macro-F1 > 0.9), they degrade severely when transferring across different language families or writing systems, revealing the challenges of real-world multilingual scenarios.
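Since the attribution results above are reported as macro-F1, here is a small sketch of how that metric is usually computed for a multi-class setup such as "which generator wrote this text". It is the textbook definition (unweighted mean of per-class F1), not code from the paper, and the label names in the example are hypothetical.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    per_class = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        per_class.append(2 * tp / denom if denom else 0.0)
    return sum(per_class) / len(per_class)


# Hypothetical two-author example: one "gpt" text is misattributed to "human".
gold = ["gpt", "gpt", "human", "human"]
predicted = ["gpt", "human", "human", "human"]
print(macro_f1(gold, predicted))
```

Because every class contributes equally, a detector that fails on one rare generator is penalized as much as one that fails on a frequent one.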
- Beyond the Final Actor: Modeling the Dual Roles of Creator and Editor for Fine-Grained LLM-Generated Text Detection
-
This paper proposes RACE (Rhetorical Analysis for Creator-Editor Modeling), which uses Rhetorical Structure Theory (RST) to construct logic graphs that model the thought architecture of the "Creator," while extracting discourse-unit-level features to capture the linguistic style of the "Editor." This enables four-way fine-grained LLM-generated text detection (Human-written / LLM-generated / LLM-polished Human / Human-rewritten LLM).
- BIASEDTALES-ML: A Multilingual Dataset for Analyzing Narrative Attribute Distributions in LLM-Generated Stories
-
BiasedTales-ML constructs a corpus of approximately 350,000 LLM-generated children's stories across 8 languages. Through a factorial prompt design and a distributional analysis framework, it reveals that the distribution of social attributes in narratives varies significantly across different languages, and English-centric evaluations fail to reflect bias patterns in multilingual scenarios.
- C-ReD: A Comprehensive Chinese Benchmark for AI-Generated Text Detection Derived from Real-World Prompts
-
C-ReD constructs a Chinese AI-generated text detection benchmark covering five writing scenarios, nine LLM generators, and real-world prompts. It demonstrates that detection difficulty depends heavily on the domain, generator, and prompt, while fine-tuning on C-ReD significantly enhances generalization to unseen models and external Chinese data.
- Can AI-Generated Persuasion Be Detected? Persuaficial Benchmark and AI vs. Human Linguistic Differences
-
This paper introduces Persuaficial, a high-quality multilingual benchmark for AI-generated persuasive text covering six languages. It systematically compares how difficult LLM-generated and human-written persuasive texts are to detect automatically, finding that subtle AI persuasion is significantly harder to detect than human persuasion (\(F_1\) drops by approximately 20%), whereas overly intensified persuasion is actually easier to identify.
- DetectRL-X: Towards Reliable Multilingual and Real-World LLM-Generated Text Detection
-
DetectRL-X constructs a benchmark of 3.456 million samples spanning multiple languages, domains, attacks, and text lengths, with parallel binary and ternary classification settings. It shows that existing detectors still have significant robustness gaps in real-world multilingual and human-AI collaborative writing scenarios.
- ExaGPT: Example-Based Machine-Generated Text Detection for Human Interpretability
-
ExaGPT reframes the task of "determining whether a text is human-written or LLM-generated" as "identifying which side has more similar spans in a data store." By utilizing BERT embeddings, k-NN retrieval, and dynamic programming for optimal span segmentation, it provides interpretable evidence (most similar retrieved span examples) while improving accuracy by up to \(+37.0\) points over previous explainable detectors at 1% FPR.
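The span segmentation step mentioned above can be illustrated with a generic dynamic program. This is only a sketch under stated assumptions: the note does not give ExaGPT's actual objective, so the length-weighted sum used here, the `max_len` cap, and the `score(start, end)` callback (standing in for a k-NN similarity lookup) are all hypothetical.

```python
def best_segmentation(n_tokens, max_len, score):
    """Split tokens [0, n_tokens) into contiguous spans of length <= max_len.

    Maximizes the sum over spans of (span length * score(start, end)).
    `score` is a hypothetical callback, e.g. similarity to the nearest stored span.
    """
    best = [float("-inf")] * (n_tokens + 1)  # best[i]: best total for tokens [0, i)
    back = [0] * (n_tokens + 1)              # back[i]: start of the last span ending at i
    best[0] = 0.0
    for end in range(1, n_tokens + 1):
        for start in range(max(0, end - max_len), end):
            candidate = best[start] + (end - start) * score(start, end)
            if candidate > best[end]:
                best[end] = candidate
                back[end] = start
    # Walk the back-pointers to recover the chosen spans in order.
    spans = []
    end = n_tokens
    while end > 0:
        spans.append((back[end], end))
        end = back[end]
    return spans[::-1], best[n_tokens]


# Toy similarity table for a 3-token text; the values are made up.
toy_scores = {(0, 1): 0.5, (1, 2): 0.5, (2, 3): 0.9, (0, 2): 0.8, (1, 3): 0.4}
spans, total = best_segmentation(3, 2, lambda s, e: toy_scores[(s, e)])
print(spans, total)
```

In a detector of this kind, each selected span would then be shown alongside its most similar retrieved examples, which is where the interpretability comes from.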
- Frame In, Frame Out: Measuring Framing Bias in LLM-Generated News Summaries
-
This paper proposes FIFO, a method that uses an expert-calibrated LLM jury to measure, at scale, whether LLM-generated news summaries on XSum introduce framing bias. It finds that several high-capacity models produce a higher proportion of framed expressions than the human-written summary baselines.
- From Scoring to Explanations: Evaluating SHAP and LLM Rationales for Rubric-based Teaching Quality Assessment
-
This paper proposes a sentence-level explanation evaluation framework for automated rubric scoring. Comparing fine-tuned PLMs, prompted LLMs, SHAP attribution, and LLM rationales on a classroom feedback quality scoring task, the study finds that fine-tuned PLMs are more accurate, while SHAP provides more faithful and transferable explanations than LLM-generated ones.
- GigaCheck: Detecting LLM-generated Content via Object-Centric Span Localization
-
GigaCheck is a dual-strategy framework: it uses a fine-tuned LLM for document-level classification, and it treats AI-generated text spans as "objects," employing a DETR-like architecture for end-to-end, character-level localization.
- MASH: Evading Black-Box AI-Generated Text Detectors via Style Humanization
-
This paper proposes MASH (Multi-Stage Style Humanization Alignment), which utilizes a three-stage pipeline consisting of style-injection SFT → DPO alignment → inference-time refinement. By training a rewriter with only 0.1B parameters, it evades AI text detectors with an average attack success rate of 92% in black-box settings while maintaining excellent linguistic quality.
- mdok-style at SemEval-2026 Task 10: Finetuning LLMs for Conspiracy Detection
-
The authors port the finetuning paradigm of their PAN@CLEF2025-winning machine-generated text (MGT) detector, mdok, to conspiracy detection: four types of data augmentation (anonymization, case variation, homoglyphs, and deduplication) are used to expand the training set, followed by a round of self-training (retaining only high-confidence pseudo-labels where \(p \ge 0.99\) or \(p \le 0.01\)). Qwen3-32B is then finetuned using QLoRA 4-bit PEFT, ultimately achieving a Macro F1 = 0.78 and ranking 8/52 (85th percentile) in SemEval-2026 Task 10 subtask 2.
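The self-training filter described above can be sketched in a few lines. The thresholds \(p \ge 0.99\) and \(p \le 0.01\) come from the summary; the function name `select_pseudo_labels`, the input format, and the label encoding (1 for the positive class, 0 for the negative class) are assumptions for illustration, not the authors' code.

```python
def select_pseudo_labels(texts, probs, high=0.99, low=0.01):
    """Keep only unlabeled examples whose predicted probability is very confident.

    probs[i] is the model's probability that texts[i] belongs to the positive class.
    Returns (text, pseudo_label) pairs; everything in between is discarded.
    """
    selected = []
    for text, p in zip(texts, probs):
        if p >= high:
            selected.append((text, 1))  # confident positive
        elif p <= low:
            selected.append((text, 0))  # confident negative
    return selected


# Hypothetical model outputs on four unlabeled posts.
unlabeled = ["post a", "post b", "post c", "post d"]
positive_probs = [0.995, 0.50, 0.01, 0.98]
print(select_pseudo_labels(unlabeled, positive_probs))
```

The retained pairs would be appended to the training set for the next finetuning round. Strict thresholds like these trade pseudo-label coverage for lower label noise.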
- REFLEX: Self-Refining Explainable Fact-Checking via Verdict-Anchored Style Control
-
REFLEX binds verdict prediction and explanation generation in fact-checking by constructing internal steering vectors from self-disagreement samples between the backbone and fine-tuned models. Without relying on search APIs or closed-source teacher models, it improves the verdict Macro-F1 and produces shorter, more consistent, and less misleading explanations.
- Temporal Flattening in LLM-Generated Text: Comparing Human and LLM Writing Trajectories
-
By constructing a longitudinal writing dataset spanning 12 years, this study identifies a "temporal flattening" phenomenon in LLM-generated text: although lexical diversity is high, temporal drift in semantic and cognitive-emotional dimensions is significantly lower than in human writing. Human and LLM texts can be distinguished with 94% accuracy based solely on temporal variation patterns.
- When Personalization Tricks Detectors: The Feature-Inversion Trap in Machine-Generated Text Detection
-
This work reveals the "Feature-Inversion Trap" of MGT detectors in personalized scenarios: features that distinguish human-written text (HWT) from machine-generated text (MGT) in general domains invert in personalized domains, causing detector performance to collapse or even flip. The authors propose the StyloCheck framework, which predicts cross-domain performance changes by quantifying a detector's reliance on inverted features and achieves a prediction correlation of over 0.85.
- Who Wrote This Line? Evaluating the Detection of LLM-Generated Classical Chinese Poetry
-
This paper constructs ChangAn (containing 30,664 poems), the first detection benchmark for LLM-generated classical Chinese poetry. It systematically evaluates 12 AI detection methods across various text granularities and generation strategies, revealing the severe limitations of current Chinese text detectors in the classical poetry domain.