👥 Social Computing

💬 ACL2026 · 45 paper notes

🔥 Top topics: LLM ×15 · Speech & Audio ×3 · RAG ×2 · Agents ×2 · Multimodal/VLM ×2

Among Us: Language of Conspiracy Theorists on Mainstream Reddit

Analyzing 10 years of longitudinal data from 510 million Reddit comments, the study finds that users active in conspiracy communities exhibit distinctive, detectable linguistic patterns even in mainstream communities (average 87% classification accuracy). However, these patterns depend heavily on community context: community-specific models outperform global models by up to 17 percentage points.

Bayesian Social Deduction with Graph-Informed Language Models

This paper proposes GRAIL (Graph Reasoning Agent Informed through Language), a hybrid reasoning framework that externalizes probabilistic reasoning to a factor graph model while utilizing LLMs for language understanding and interaction. GRAIL defeated human players for the first time in the social deduction game Avalon (67% win rate), with resource consumption significantly lower than large-scale reasoning models.

Beyond the Crowd: LLM-Augmented Community Notes for Governing Health Misinformation

The authors perform an empirical analysis of 30.8K health-related Community Notes from X, revealing systematic slow-response issues: a median delay of 17.6 hours before the first helpful verdict, and 87.9% of notes remaining unrated. They propose the CrowdNotes+ framework, which generates notes with LLMs in two modes, (1) Evidence Augmentation and (2) Utility-Guided Automation, and pairs them with a three-stage "Relevance → Correctness → Helpfulness" evaluation. On the new HealthNotes benchmark, the 15 evaluated LLMs significantly outperform the 73.19% helpfulness of human notes, with the o3 model reaching 81.15%.

BITS Pilani at SemEval-2026 Task 9: Structured Supervised Fine-Tuning with DPO Refinement for Polarization Detection

This paper proposes a two-stage pipeline consisting of "structured slot-filling SFT + DPO preference optimization" for the SemEval-2026 POLAR polarization detection task (English subset). The Qwen2.5-7B system submitted during the competition achieved a Macro-F1 of 0.7664. Post-competition, replacing the base model with Mistral-Nemo-12B and using preference pairs filtered by an LLM-judge improved the Macro-F1 to 0.8162, surpassing the organiser baseline (0.7802).
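Macro-F1, the metric reported here and in several other entries on this page, is the unweighted mean of per-class F1 scores, so a rare class counts as much as a frequent one. The following is a minimal sketch of the standard definition, not the task's official scorer; the function name and the toy labels are illustrative assumptions.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    f1_scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1_scores.append(2 * tp / denom if denom else 0.0)
    return sum(f1_scores) / len(f1_scores)


# Toy example (hypothetical labels): 1 = polarized, 0 = not polarized.
gold = [1, 1, 1, 0]
pred = [1, 1, 0, 0]
print(round(macro_f1(gold, pred), 4))
```

Because each class contributes equally, a system that ignores a minority class is penalized even if its overall accuracy is high.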

Building Arabic NLP from the Ground Up: Twenty Years of Lessons, Failures, and Open Problems

This is a reflective paper rather than an experimental one. The authors review twenty years of Arabic NLP construction, pointing out that the most difficult problems in low-resource languages are often not linguistics or model technology, but community, institutions, deployment governance, and modes of knowledge production.

ClaimDB: A Fact Verification Benchmark over Large Structured Data

ClaimDB is the first fact-verification benchmark to scale evidence to 80 real-world databases, averaging 11 tables / 4.6 million rows / 110 million tokens per claim. This forces methods to utilize executable programs (SQL) for compositional reasoning. Evaluations of tool-calling agents across 30 SOTA LLMs reveal that over half have an accuracy below 55%; closed models rarely "abstain," while open-source models over-abstain, identifying NEI handling as the primary weakness.

Confident, Calibrated, or Complicit: Safety Alignment and Ideological Bias in LLM Hate Speech Detection

The authors evaluated 5 LLMs (strongly aligned vs. weakly aligned) under 4 political personas on the Latent Hatred benchmark using zero-shot classification. They found that strongly aligned models achieved higher strict accuracy (69.0%) compared to weakly aligned ones (64.1%) and were nearly immune to persona manipulation. However, all models exhibited systematic failures in handling irony, target group fairness, and confidence calibration.

Content Fuzzing for Escaping Information Cocoons on Social Media

This paper proposes ContentFuzz, a confidence-guided fuzzing framework that takes the content creator's perspective. It uses LLMs to rewrite posts so that machine-inferred stance labels flip while the human-interpreted meaning stays unchanged, thereby breaking social media information cocoons.

Decide less, communicate more: On the construct validity of end-to-end fact-checking in medicine

The authors conducted an annotation study in which 5 clinical experts annotated 1,000 instances of authentic claims from RedHOT (Reddit Health Discussions). They found that end-to-end medical fact-checking lacks construct validity due to three insurmountable barriers: difficulties in linking evidence, underspecified claims, and subjective severity judgments. Consequently, the paper proposes reframing medical fact-checking as an "interactive clinician-patient communication model" rather than a "classification-then-verdict" pipeline.

DIA-HARM: Dialectal Disparities in Harmful Content Detection Across 50 English Dialects

This paper constructs DIA-HARM, the first benchmark to evaluate the robustness of misinformation detection across 50 English dialects. It reveals that human-written dialectal content leads to a performance drop of 1.4-3.6% F1, while fine-tuned Transformers significantly outperform zero-shot LLMs (96.6% vs 78.3%). Furthermore, some models exhibit catastrophic degradation exceeding 33% on mixed content.

Diagnosing LLM Arbitration Behavior over Pre-evidence Epistemic States in RAG-based Fact-Checking

To be added after further reading.

Dynamics of Cognitive Heterogeneity: Investigating Behavioral Biases in Multi-Stage Supply Chains with LLM-Based Simulation

This study utilizes LLM agents (DeepSeek/GPT series) to simulate multi-stage supply chains in the classic Beer Distribution Game. It systematically investigates the impact of cognitive heterogeneity (differences in reasoning capabilities) on system behavior, finding that LLM agents replicate human-like bullwhip effects and myopic behavior, while information sharing effectively mitigates these negative effects.

Estimating the Black-box LLM Uncertainty with Distribution-Aligned Adversarial Distillation

This paper proposes DisAAD, in which a small proxy model (only 1% of the target model's size) learns whether a black-box LLM "knows the answer" through distribution alignment and adversarial distillation. Evidential Deep Learning (EDL) then decomposes the proxy's logits into epistemic and aleatoric uncertainty, so that uncertainty for closed-source models such as GPT-4 and Claude can be estimated in real time from a single response. The method achieves average improvements of 18.2% in AUROC and 22.9% in AUPR over black-box baselines.

Explain the Flag: Contextualizing Hate Speech Beyond Censorship

This paper proposes a hybrid approach combining LLMs with manually curated vocabularies in three languages (English, French, and Greek) to detect and explain hate speech. The "term pipeline" identifies inherently derogatory terms through vocabulary matching and LLM semantic disambiguation, while the "no-term pipeline" employs LLMs to detect group-targeted content; both are integrated to generate evidence-based explanations.

FigSIM: A Dataset for Fine-grained Suicide Severity and Figurative Language in Suicide Memes

FigSIM constructs the first fine-grained multimodal dataset for suicide-related memes, annotating figurative phenomena, suicide severity, and suicide-related content. Experiments with 16 types of models verify that current models systematically underestimate high-severity risks involving metaphors, irony, and sarcasm.

GKnow: Measuring the Entanglement of Gender Bias and Factual Gender

This paper introduces the GKnow benchmark and a suite of mechanistic analyses at both the circuit and neuron levels. It demonstrates that "stereotypical gender" and "factual gender" in LLMs rely on substantially overlapping circuits (high IoU) with high cross-task faithfulness, and that they share the same high-IG neurons at the neuron level. Consequently, simply ablating "bias neurons" also weakens factual gender capabilities. Because this side effect looks like "successful debiasing" on bias-only benchmarks, the paper warns that such debiasing is unreliable.

Imperfectly Cooperative Human-AI Interactions: Comparing the Impacts of Human and AI Attributes in Simulated and User Studies

Through a dual-framework experiment involving 2000 LLM simulations and a 290-person user study, this paper compares the impacts of human personality traits and AI design attributes in imperfectly cooperative scenarios (recruitment negotiation, partially honest transactions). Findings indicate that while personality traits dominate in simulations, AI transparency is the key driver in real-world human experiments.

Inertia in Moral and Value Judgments of Large Language Models

This paper systematically measures "Value Inertia" across 7 mainstream LLMs using a "Large-scale random persona × Moral/Value questionnaire" paradigm. It finds highly stable inertia in the Harm/Fairness dimensions—where personas struggle to shift the model's response direction—and introduces two quantifiable metrics, Inertia Index and Steerability, to reveal that these preferences are unevenly distributed and aligned with safety training objectives.

Investigating Counterfactual Unfairness in LLMs towards Identities through Humor

This paper systematically investigates counterfactual unfairness in LLMs within humor scenarios by observing changes in model behavior after swapping speaker/listener identities. Results show that jokes told by privileged groups have a refusal rate as high as 67.5%, are 64.7% more likely to be judged as malicious, and receive social harm scores up to 1.5 (on a 5-point scale). This reveals that models internalize fixed social privilege hierarchies rather than performing genuine social reasoning.

Is this chart lying to me? Automating the detection of misleading visualizations

This paper proposes the Misviz (2,604 real-world misleading visualizations) and Misviz-synth (57,665 synthetic visualizations) benchmarks, which cover 12 types of misleading visualization. It systematically evaluates MLLMs, rule checkers, and image classifiers on detecting misleading charts, revealing that the task remains highly challenging.

Justice in Judgment: Unveiling (Hidden) Bias in LLM-assisted Peer Reviews

The authors systematically audit LLM peer review bias across 9 LLMs using a counterfactual evaluation that "modifies author metadata without changing paper content." They find all models exhibit significant favoritism toward Ranked-Stronger (RS) institutions and higher tolerance for senior PIs and prolific authors. Crucially, even when models appear neutral in hard ratings, soft ratings (expected scores based on token probabilities) reveal much stronger hidden bias, highlighting an alignment failure where "alignment masks rather than eliminates preferences."
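The contrast between hard and soft ratings can be sketched as follows. This is a minimal illustration, assuming the model exposes a probability for each candidate rating token; the function names and the probabilities are hypothetical and are not taken from the paper.

```python
def hard_rating(token_probs):
    """The single most likely rating, i.e. what the model would output."""
    return max(token_probs, key=token_probs.get)


def soft_rating(token_probs):
    """Expected rating under the model's distribution over rating tokens.

    Probabilities are renormalized in case the rating tokens do not
    cover the full vocabulary mass.
    """
    total = sum(token_probs.values())
    if total <= 0:
        raise ValueError("token probabilities must have positive mass")
    return sum(score * p for score, p in token_probs.items()) / total


# Hypothetical distributions for the same paper under two author-metadata variants.
probs_variant_a = {5: 0.2, 6: 0.5, 7: 0.3}
probs_variant_b = {5: 0.4, 6: 0.5, 7: 0.1}

# Both variants receive the same hard rating (6), but the soft ratings differ
# (about 6.1 versus 5.7), exposing a preference the hard rating hides.
print(hard_rating(probs_variant_a), hard_rating(probs_variant_b))
print(soft_rating(probs_variant_a), soft_rating(probs_variant_b))
```

The design point is that an argmax discards everything except the top token, so shifts in probability mass between neighboring scores stay invisible unless the expectation is inspected.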

LiveFact: A Dynamic, Time-Aware Benchmark for LLM-Driven Fake News Detection

LiveFact upgrades "fake news detection" from static binary classification to a dynamic reasoning benchmark updated monthly with time-sliced evidence. It evaluates both factual judgment and cognitive humility ("knowing when to say I don't know") via a dual Classification + Inference mode, while explicitly monitoring benchmark contamination through SSA entity substitution.

mdok-style at SemEval-2026 Task 9: Finetuning LLMs for Multilingual Polarization Detection

This paper transfers the mdok system (originally designed for multilingual machine-generated text detection, using QLoRA-finetuned Qwen3-32B / Gemma-3-27B) to SemEval-2026 Task 9 for multilingual polarization detection. By incorporating four types of "dual" data augmentation (including anonymization, casing, and homoglyphs), the system achieves an average Macro-F1 score 3–4% higher than the official baseline across 22 languages.

MM-StanceDet: Retrieval-Augmented Multi-modal Multi-agent Stance Detection

The authors recast multimodal stance detection as a 4-stage multi-agent pipeline: CLIP retrieval of similar samples to provide few-shot CoT examples, analysis by three expert agents (text, image, and cross-modal conflict), a debate among three debater agents (pro, con, and neutral), and a final adjudicator agent that self-reflects and assigns the label. Across five datasets, it outperforms strong baselines including GPT-4V, TMPT, and MV-Debate in both in-target and zero-shot settings.

Persona-E2: A Human-Grounded Dataset for Personality-Shaped Emotional Responses to Textual Events

The authors constructed Persona-E2, the first large-scale dataset linking personality traits (MBTI + Big Five) with reader emotional responses. It contains 112,000 annotations from 3,111 events × 36 annotators, revealing "personality illusion" in LLMs during simulated emotional responses and demonstrating that Big Five traits mitigate this issue more effectively than MBTI.

Phase Transitions in Affective Meaning Divergence: The Hidden Drift Before the Break

This paper formalizes the phenomenon of "same word, different affective understanding" before dialogue breakdown as Affective Meaning Divergence (AMD). Using entropy-regularized games, it proves that the probability of repair undergoes a saddle-node bifurcation. Empirically, early warning signals of critical slowing down, such as rising variance, are observed in the Conversations Gone Awry (CGA) dataset.

Point of Order: Action-Aware LLM Persona Modeling for Realistic Civic Simulation

This paper converts public Zoom meeting videos into a government deliberation corpus with cross-video traceable speakers, action tags, and persona metadata. By fine-tuning LLMs with QLoRA to generate specific participant utterances, perplexity was reduced by up to 67%, and humans found it difficult to distinguish simulated dialogues from real meeting segments in Turing-style tests.

Probing Multimodal Large Language Models on Cognitive Biases in Chinese Short-Video Misinformation

This paper constructs a high-quality evaluation set of 200 Chinese health short-video rumors (Fine-VDK). By systematically evaluating 8 cutting-edge MLLMs using evidence chains, error types, and social cues, the study finds that Gemini-2.5-Pro is the most stable. However, most models remain susceptible to label bias, authoritative accounts, and traffic metrics in multimodal rumor judgment.

Prompt-Level Distillation: A Non-Parametric Alternative to Model Fine-Tuning for Efficient Reasoning

This paper proposes Prompt-Level Distillation (PLD), which extracts, clusters, and de-conflicts reasoning patterns from a teacher model on training samples to construct a system prompt for the student model. This significantly enhances the reasoning and classification capabilities of small models without updating parameters.

PSK@EEUCA 2026: Fine-Tuning Large Language Models with Synthetic Data Augmentation for Multi-Class Toxicity Detection in Gaming Chat

This system paper for the EEUCA 2026 gaming chat toxicity identification task fine-tunes Llama 3.1 8B with LoRA, adding 5% strictly filtered, minority-class synthetic paraphrase data. It achieves a macro-F1 of 0.6234 across six classes and reveals a "validation trap" in which high validation scores fail to transfer to the test set.

Reheat Nachos for Dinner? Evaluating AI Support for Cross-Cultural Communication of Neologisms

Through human experiments involving 234 non-native speakers and 144 native evaluators, this paper compares four types of AI and non-AI support. It finds that AI Explanation with contextual information most effectively improves native speaker ratings of neologisms used by non-native speakers, though a significant misalignment remains between learners' confidence and their actual communicative competence.

RV-HATE: Reinforced Multi-Module Voting for Implicit Hate Speech Detection

RV-HATE decomposes implicit hate speech detection into four BERT contrastive learning modules targeting different data characteristics and uses PPO to learn dataset-specific soft voting weights. It achieves an average macro-F1 of 84.47% across five benchmarks, outperforming SharedCon by an average of 1.8 percentage points.
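The soft-voting step can be sketched as a weighted average of per-module class probabilities. This is only an illustration of the voting arithmetic: the weights below are fixed, hypothetical numbers, whereas the paper learns dataset-specific weights with PPO, which is not shown here.

```python
def soft_vote(module_probs, weights):
    """Combine per-module class probabilities with a weighted average.

    module_probs: one probability vector per module, all of equal length.
    weights: one non-negative weight per module.
    Returns (predicted_class_index, combined_probabilities).
    """
    if len(module_probs) != len(weights):
        raise ValueError("need exactly one weight per module")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have positive sum")
    n_classes = len(module_probs[0])
    combined = [
        sum(w * probs[c] for w, probs in zip(weights, module_probs)) / total
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=combined.__getitem__)
    return label, combined


# Four hypothetical modules, two classes: index 0 = not hate, index 1 = hate.
module_probs = [[0.6, 0.4], [0.3, 0.7], [0.55, 0.45], [0.2, 0.8]]
weights = [0.1, 0.4, 0.1, 0.4]  # assumed values standing in for learned weights
label, combined = soft_vote(module_probs, weights)
print(label, combined)
```

Unlike hard majority voting, soft voting lets a confident module outweigh several uncertain ones, which is why per-dataset weights can matter.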

SMARTER: A Data-efficient Framework to Improve Toxicity Detection with Explanation via Self-augmenting Large Language Models

SMARTER utilizes a small number of labeled samples to prompt LLMs to generate explanations for both correct and incorrect labels. It then enhances explainable toxicity detection through preference optimization and cross-model training. On three datasets, it achieves 86%-100% of the performance of full-data training using only 6%-57% of the training data.

SPAGBias: Uncovering and Tracing Structured Spatial Gender Bias in Large Language Models

This paper proposes the SPAGBias framework, which for the first time systematically evaluates gender bias in LLMs within urban micro-spatial contexts. Through three diagnostic layers—explicit, probabilistic, and constructive—it reveals structured spatial-gender association patterns in LLMs and traces the embedding and amplification of bias throughout the entire model development lifecycle.

Splits! Flexible Sociocultural Linguistic Investigation at Scale

The paper proposes a method to construct a sociolinguistic "sandbox." It introduces Splits!, a dataset of 9.7 million Reddit posts dual-segmented by demographic groups and discussion topics. A two-stage filtering process based on lift and triviality is designed to efficiently screen noteworthy sociocultural linguistic phenomena from 23,000 LLM-generated candidate hypotheses.
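As a rough illustration of a lift-based filter, the sketch below uses the standard association-rule definition of lift, P(feature | group) / P(feature). The paper's exact formulation, and its triviality filter, may differ; the data layout, function name, and toy posts are assumptions made for this example.

```python
def lift(posts, group, feature):
    """Lift of a linguistic feature for one group.

    posts: list of (group_label, set_of_features) pairs.
    A value above 1 means the feature is over-represented in the group
    relative to the whole corpus.
    """
    in_group = [feats for g, feats in posts if g == group]
    if not posts or not in_group:
        raise ValueError("corpus and group must be non-empty")
    p_feature = sum(1 for _, feats in posts if feature in feats) / len(posts)
    if p_feature == 0:
        raise ValueError("feature never occurs in the corpus")
    p_feature_given_group = sum(1 for feats in in_group if feature in feats) / len(in_group)
    return p_feature_given_group / p_feature


# Toy corpus with hypothetical group labels and features.
posts = [
    ("group_a", {"y'all", "fixin"}),
    ("group_a", {"y'all"}),
    ("group_b", {"wicked"}),
    ("group_b", set()),
]
print(lift(posts, "group_a", "y'all"))
```

A hypothesis would pass this stage only if its lift clears some threshold; the threshold itself is a tuning choice not specified in this summary.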

Synthia: Scalable Grounded Persona Generation from Social Media Data

This paper proposes the Synthia framework, which generates grounded LLM persona narratives from real social media posts (Bluesky). It improves social survey alignment by up to 11.6% over the SOTA while using smaller models, and it preserves social network topology to support network-aware analysis.

The Proxy Presumption: From Semantic Embeddings to Valid Social Measures

This paper identifies "Proxy Presumption" in NLP—the practice of naming geometric distances in embeddings as social constructs like "creativity" or "bias"—and proposes a Construct Validity Protocol and Counterfactual Neutralization to transform heuristic proxies into verifiable measurement instruments.

To Lie or Not to Lie? Investigating The Biased Spread of Global Lies by LLMs

This paper proposes GlobalLies—a multilingual parallel dataset containing 440 misinformation generation templates and 6,867 entities (spanning 8 languages and 195 countries). It reveals systematic national and linguistic biases in LLM misinformation propagation: misinformation generation rates are significantly higher for low HDI countries (statistical correlation \(\rho=-0.355\), \(p=5\times10^{-7}\)), compliance rates for low-resource languages are over 30% higher than for English, and existing safety classifiers and RAG safeguards provide uneven protection.

ToxiTrace: Gradient-Aligned Training for Explainable Chinese Toxicity Detection

ToxiTrace proposes an explainable Chinese toxicity detection method for BERT-like encoders. Through three components—CuSA (LLM-guided weak labeling), GCLoss (Gradient Constraint Loss), and ARCL (Adversarial Reasoning Contrastive Learning)—it achieves dual improvements in sentence-level classification accuracy and continuous toxic span extraction while maintaining efficient encoder inference.

Understanding the Sociocultural Dimensions of Mental Health Discourse in Arabic-Language X Communities

This paper employs a GPT-4.1 self-disclosure identification pipeline to filter 8,147 tweets from "lived-experience" authors across three Arabic X (formerly Twitter) mental health communities. Utilizing weighted log-odds, NMF topic modeling, and a six-domain cultural keyword framework, the study characterizes discursive differences in Borderline Personality Disorder (BPD), Bipolar Disorder, and ADHD communities across dimensions such as religion, medicine, relationships, and identity, explicitly positioning all conclusions as "hypothesis generation" rather than "confirmatory results."
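One common formulation of weighted log-odds compares word usage between two corpora using a Dirichlet prior and reports a z-score per word. The sketch below assumes that formulation, with the prior taken from the pooled counts; the paper's exact variant may differ, and the community names and word counts are invented for illustration.

```python
import math
from collections import Counter


def weighted_log_odds(counts_a, counts_b, prior_scale=1.0):
    """Z-scored log-odds ratio per word, smoothed by a pooled-count prior.

    Positive scores mark words over-represented in corpus A,
    negative scores mark words over-represented in corpus B.
    """
    pooled = Counter(counts_a) + Counter(counts_b)
    prior = {w: prior_scale * c for w, c in pooled.items()}
    prior_total = sum(prior.values())
    n_a, n_b = sum(counts_a.values()), sum(counts_b.values())
    scores = {}
    for word, alpha in prior.items():
        y_a = counts_a.get(word, 0)
        y_b = counts_b.get(word, 0)
        # Smoothed log-odds of the word in each corpus.
        log_odds_a = math.log((y_a + alpha) / (n_a + prior_total - y_a - alpha))
        log_odds_b = math.log((y_b + alpha) / (n_b + prior_total - y_b - alpha))
        # Approximate variance of the difference, used to down-weight rare words.
        variance = 1.0 / (y_a + alpha) + 1.0 / (y_b + alpha)
        scores[word] = (log_odds_a - log_odds_b) / math.sqrt(variance)
    return scores


# Invented counts for two hypothetical communities.
community_a = {"prayer": 8, "doctor": 2}
community_b = {"prayer": 2, "doctor": 8}
for word, z in sorted(weighted_log_odds(community_a, community_b).items()):
    print(word, round(z, 3))
```

Dividing by the estimated standard deviation is what makes the score "weighted": a large raw log-odds ratio built on a handful of occurrences receives a smaller z-score than the same ratio built on many.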

VeriTaS: The First Dynamic Benchmark for Multimodal Automated Fact-Checking

VeriTaS utilizes a quarterly updated seven-stage automated pipeline to transform real-world multilingual image-text-video claims from professional fact-checking organizations into a standardized, interpretable, and evaluable multimodal fact-checking benchmark. It demonstrates that the strongest current multimodal models still fall significantly short of reliable AFC.

When Bigger Isn't Better: A Comprehensive Fairness Evaluation of Political Bias in Multi-News Summarisation

This study constructs FairNews, the first multi-document news summarization dataset with political leaning labels, and evaluates 13 LLMs using a five-dimensional fairness framework. It finds that medium-scale models outperform larger models in both fairness and efficiency, and that entity sentiment similarity is the dimension most resistant to prompting-based debiasing.

Who Gets Which Message? Auditing Demographic Bias in LLM-Generated Targeted Text

This paper provides the first systematic analysis of bias in LLMs when generating targeted messages under demographic conditions. Introducing the Persuasion Bias Index (PBI), the study finds that GPT-4o, Llama, and Mistral employ more aggressive persuasion strategies for men and younger audiences in climate communication, with contextual prompts systematically amplifying these disparities.

Why Are We Moral? An LLM-based Agent Simulation Approach to Study Moral Evolution

This paper constructs a prehistoric hunter-gatherer society simulation platform using LLM agents, incorporating moral types, memory, judgment, cooperation, and reproduction into evolutionary experiments. It finds that cooperation and mutual aid generally enhance survival stability, while the cognitive cost of judging others' moral types dictates which moral strategy prevails.

YEZE at SemEval-2026 Task 9: Detecting Multilingual, Multicultural and Multievent Online Polarization via Heterogeneous Ensembling

The YEZE system decomposes the online polarization recognition of 22 languages in SemEval-2026 Task 9 into independent subtasks. By fine-tuning XLM-RoBERTa-large and mDeBERTa-v3-base separately and utilizing weighted probability averaging alongside weighted BCE to alleviate multi-label sparsity, the system achieved stable official Top-10 rankings in fine-grained polarization type and manifestation prediction.