🔎 AIGC Detection

💬 ACL2025 · 15 paper notes

📌 Same area in other venues: 📷 CVPR2026 (10) · 🔬 ICLR2026 (30) · 💬 ACL2026 (17) · 🧪 ICML2026 (11) · 🤖 AAAI2026 (2) · 🧠 NeurIPS2025 (9)

🔥 Top topics: LLM ×6 · Adversarial Robustness ×2

A Rose by Any Other Name: LLM-Generated Explanations Are Good Proxies for Human Explanations to Collect Label Distributions on NLI: This paper proposes using LLM-generated NLI explanations to substitute expensive human explanations for approximating Human Judgment Distributions (HJD). Experiments demonstrate that with the guidance of human label distributions, LLM-generated explanations achieve comparable performance to human explanations across metrics like KL divergence and JSD. Furthermore, the approach generalizes well to datasets without human explanations (MNLI) and out-of-domain test sets (ANLI).
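The two metrics named above compare a predicted label distribution with the human one. The following is a minimal sketch of both for a three-way NLI label set; the example distributions are made up for illustration, and "JSD" is computed here as the Jensen-Shannon divergence (some papers report its square root, the Jensen-Shannon distance, and this note does not say which convention the paper follows).

```python
import math

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats for two discrete distributions over the same labels."""
    # Terms with p_i == 0 contribute nothing; eps guards against q_i == 0.
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric and bounded by ln 2."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

# Hypothetical distributions over (entailment, neutral, contradiction).
human = [0.6, 0.3, 0.1]
model = [0.5, 0.4, 0.1]

kl = kl_divergence(human, model)
jsd = js_divergence(human, model)
jsd_distance = math.sqrt(jsd)  # the "distance" variant, if that convention is needed
```

Lower values mean the model's distribution is closer to the human judgment distribution.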
Are We in the AI-Generated Text World Already? Quantifying and Monitoring AIGT on Social Media: This work offers the first large-scale quantification of how the proportion of AI-Generated Text (AIGT) on social media has changed over time. The authors collect 2.4 million posts from Medium, Quora, and Reddit, construct the AIGTBench dataset, and train their best-performing detector, OSM-Det. They find that the AIGT proportion on Medium and Quora rose from ~2% to ~37-39% between 2022 and 2024, whereas Reddit's rose only from 1.3% to 2.5%.
An Empirical Study on Detecting AI-Generated Text in Financial Reports: Focusing on the highly regulated domain of financial reports, this paper systematically evaluates the performance of various AI-generated text detection methods (statistical features, neural network classifiers, watermark detection, etc.) in identifying AI-generated content in financial documents, revealing the significant impact of domain specificity on detection effectiveness.
People who frequently use ChatGPT for writing tasks are accurate and robust detectors of AI-generated text: In an experiment with 1,740 annotations, human annotators who frequently use LLMs for writing tasks detect AI-generated text with extremely high accuracy: a 5-person majority vote makes only 1 error in 300. Even against paraphrasing and humanization evasion strategies, they perform significantly better than most automated detectors.
ChemActor: Enhancing Automated Extraction of Chemical Synthesis Actions with LLM-Generated Data: This paper proposes ChemActor, a fully fine-tuned LLM chemical executor, which addresses the data scarcity issue in chemical synthesis action extraction through a sequential LLM-generated data framework and a distribution divergence-based data selection module, outperforming baseline models by 10% on R2D and D2A tasks.
Cognitive Framework for Detecting AI-Generated Fiction: This paper proposes an AI-generated novel/fiction detection framework based on cognitive linguistic features. By modeling cognitive patterns in human creative writing (such as narrative rhythm, emotional arc, and metaphor density), the framework distinguishes between human-written and AI-generated fictional texts, significantly outperforming existing detection methods in long-text scenarios.
Iron Sharpens Iron: Defending Against Attacks in Machine-Generated Text Detection with Adversarial Training: This paper proposes the GREATER adversarial training framework, which simultaneously trains an adversarial attacker (Greater-A) and an MGT detector (Greater-D). The attacker identifies critical tokens through surrogate model gradients and perturbs them in the embedding space to generate adversarial samples. The detector learns a generalized defense from these curriculum-style adversarial samples. Under 16 attacks, the attack success rate (ASR) drops to 5.53% (compared to SOTA's 6.20%), while the attacker runs 4 times faster than the SOTA attack.
HACo-Det: A Study Towards Fine-Grained Machine-Generated Text Detection under Human-AI Coauthoring: This study proposes HACo-Det, a fine-grained machine-generated text (MGT) detection benchmark tailored for human-AI collaborative writing. By employing a multi-round local rewriting pipeline, it automatically constructs 11,200 human-AI coauthored texts with word-level attribution labels. It adapts seven mainstream detectors into a word-level sequence labeling formulation for systematic evaluation, revealing significant room for improvement in current fine-grained detection methods.
KatFishNet: Detecting LLM-Generated Korean Text through Linguistic Feature Analysis: This paper constructs the first Korean LLM-generated text detection benchmark, KatFish (covering three genres and four LLMs). By analyzing three types of Korean linguistic features—word spacing, POS diversity, and comma usage—the authors propose the KatFishNet detection method, achieving an average AUROC 19.78% higher than the best baseline under the OOD (unseen LLM) setting.
Learning to Rewrite: Generalized LLM-Generated Text Detection: The Learning2Rewrite (L2R) framework is proposed, which fine-tunes an LLM-based rewriting model to amplify the difference in rewrite edit distance between human-written and AI-generated text, thereby achieving highly generalized AI text detection across domains. L2R achieves an average AUROC of 0.9009 across 21 independent domains, outperforming RAIDAR by 4.67% and direct classification fine-tuning by 51.35% in out-of-distribution tests.
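The detection signal described above is the edit distance between a text and its LLM rewrite. The sketch below shows only that scoring step, assuming the rewrite has already been produced by some rewriting model (no model is called here). The function names, the normalization by the longer length, and the threshold value are illustrative choices, and the decision direction (small edits suggest machine-generated input) is an assumption based on the rewriting-based detection idea rather than a detail stated in this note.

```python
def levenshtein(a, b):
    """Edit distance between two sequences (strings or token lists)."""
    prev = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        curr = [i]
        for j, item_b in enumerate(b, start=1):
            cost = 0 if item_a == item_b else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def rewrite_score(original, rewritten):
    """Edit distance normalized by the longer sequence, in [0, 1]."""
    longest = max(len(original), len(rewritten))
    return levenshtein(original, rewritten) / longest if longest else 0.0

def looks_machine_generated(original, rewritten, threshold=0.2):
    """Assumed rule: texts the rewriter barely changes are flagged as AI-generated."""
    return rewrite_score(original, rewritten) < threshold
```

In practice the threshold would be tuned on held-out data, and word-level tokens can be passed as lists instead of raw strings.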
Comparing LLM-generated and human-authored news text using formal syntactic theory: This study is the first to systematically investigate the grammatical differences between 6 LLMs and human NYT news writing across three levels—syntactic constructions (298 types), lexical types (1,398 types), and morphological rules (100 types)—using HPSG formal syntactic theory (via the English Resource Grammar, ERG). The findings reveal that LLMs represent an "averaged" projection of human authors in terms of grammatical features: grammatical differences among individual human authors are actually greater than any difference between humans and LLMs, while LLMs exhibit almost no differences among themselves.
Low-Perplexity LLM-Generated Sequences and Where To Find Them: This paper proposes a systematic pipeline to analyze low-perplexity sequences (token prediction probability \(\ge 0.9\)) generated by LLMs and trace them back to training data sources. It is found that 30-60% of low-perplexity segments cannot be matched to the training data, and the matchable segments are categorized into four types of memorization behaviors.
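As a rough illustration of the first step, low-perplexity segments can be located by scanning per-token probabilities for runs at or above the 0.9 threshold. The sketch assumes the probabilities have already been obtained from a model; the `min_len` parameter and the function name are hypothetical and are not taken from the paper.

```python
def low_perplexity_spans(token_probs, threshold=0.9, min_len=3):
    """Return (start, end) index pairs, end exclusive, of maximal runs
    in which every token probability is >= threshold."""
    spans = []
    start = None
    for i, prob in enumerate(token_probs):
        if prob >= threshold:
            if start is None:
                start = i  # a new run begins here
        else:
            if start is not None and i - start >= min_len:
                spans.append((start, i))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(token_probs) - start >= min_len:
        spans.append((start, len(token_probs)))
    return spans

# Hypothetical per-token probabilities for a generated sequence.
probs = [0.5, 0.95, 0.99, 0.92, 0.3, 0.91, 0.97]
spans = low_perplexity_spans(probs)
```

Each returned span could then be looked up in a training-data index, which is the tracing step the summary describes.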
MultiSocial: Multilingual Benchmark of Machine-Generated Text Detection of Social-Media Texts: This paper constructs MultiSocial (472k texts), the first large-scale machine-generated text detection benchmark covering 22 languages, 5 social media platforms, and 7 LLM generators. Experimental results show that fine-tuned detectors (Llama-3-8B/Mistral-7B, AUC ROC 0.977) perform exceptionally well on social media texts, and that the choice of training platform significantly impacts cross-platform generalization.
Reliably Bounding False Positives: A Zero-Shot Machine-Generated Text Detection Framework via Multiscaled Conformal Prediction: This paper proposes a zero-shot machine-generated text detection framework based on Multiscaled Conformal Prediction (MCP). By computing text-length-aware grouped quantiles, it significantly improves detection performance while strictly bounding the False Positive Rate (FPR). The paper also constructs RealDet, a large-scale bilingual benchmark dataset covering 15 domains and 22 LLMs.
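To make the idea of length-aware grouped quantiles concrete, here is a minimal sketch using standard split conformal calibration, not the paper's exact MCP procedure. It assumes a zero-shot detector that assigns higher scores to more machine-like text and a calibration set of human-written texts; the bin edges, function names, and data layout are all hypothetical.

```python
import math
from bisect import bisect_right

def conformal_threshold(human_scores, alpha):
    """Split-conformal threshold: flagging scores above it yields an FPR of at
    most alpha on exchangeable human-written text."""
    n = len(human_scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return math.inf  # too few calibration points: never flag
    return sorted(human_scores)[k - 1]

def fit_group_thresholds(calibration, bin_edges, alpha):
    """calibration: iterable of (text_length, detector_score) for human texts.
    Texts are grouped into length bins and each bin gets its own threshold."""
    groups = {}
    for length, score in calibration:
        groups.setdefault(bisect_right(bin_edges, length), []).append(score)
    return {g: conformal_threshold(s, alpha) for g, s in groups.items()}

def is_machine(length, score, thresholds, bin_edges):
    """Flag a text only if its score exceeds the threshold of its length bin."""
    group = bisect_right(bin_edges, length)
    return score > thresholds.get(group, math.inf)
```

Calibrating per length bin lets short and long texts, whose detector scores tend to be distributed differently, each get a threshold fitted to their own score distribution.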
Who Writes What: Unveiling the Impact of Author Roles on AI-generated Text Detection: This paper reveals that authors' sociolinguistic attributes (gender, CEFR level, academic discipline, language environment) systematically affect the accuracy of AI-generated text detectors, with language proficiency and environment causing the most prominent and consistent biases. It also proposes a multi-factor WLS+ANOVA framework for quantifying these biases.