👥 Social Computing
🔬 ICLR2026 · 17 paper notes
📌 Same area in other venues: 📷 CVPR2026 (3) · 💬 ACL2026 (45) · 🧪 ICML2026 (9) · 🤖 AAAI2026 (10) · 🧠 NeurIPS2025 (20) · 📹 ICCV2025 (4)
🔥 Top topics: LLM ×6
- Adaptive Debiasing Tsallis Entropy for Test-Time Adaptation
-
This paper introduces Tsallis entropy (a generalization of Shannon entropy) into Test-Time Adaptation (TTA) for VLMs and develops it into Adaptive Debiasing Tsallis Entropy (ADTE). By customizing the debiasing parameter \(q^l\) for each category, ADTE selects more reliable high-confidence views than Shannon entropy does, without distribution-specific hyperparameters. It outperforms SOTA on ImageNet, its five variants, and ten cross-domain benchmarks.
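As a point of reference, the textbook Tsallis entropy is \(S_q(p) = (1 - \sum_i p_i^q)/(q - 1)\), which recovers Shannon entropy as \(q \to 1\). The sketch below implements only that standard definition; the function names are illustrative, and ADTE's per-category choice of \(q^l\) is not reproduced here.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def tsallis_entropy(probs, q):
    """Standard Tsallis entropy S_q(p) = (1 - sum_i p_i^q) / (q - 1).

    Falls back to Shannon entropy at q == 1, which is the limiting case.
    """
    if abs(q - 1.0) < 1e-12:
        return shannon_entropy(probs)
    return (1.0 - sum(p ** q for p in probs if p > 0)) / (q - 1.0)
```

In an entropy-based view-selection setting, a lower value indicates a more confident prediction; varying `q` changes how strongly the dominant class probabilities are weighted relative to the tail.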
- BiasFreeBench: a Benchmark for Mitigating Bias in Large Language Model Responses
-
This paper constructs the BiasFreeBench benchmark, which systematically compares eight mainstream debiasing methods (four prompting + four training) within a unified framework for the first time. Focusing on bias evaluation at the LLM response level, it proposes the Bias-Free Score metric and finds that prompting methods (especially CoT) generally outperform training methods, while DPO shows outstanding generalization across bias types.
- From Five Dimensions to Many: Large Language Models as Precise and Interpretable Psychological Profilers
-
Given only an individual's responses to 20 Big Five personality items, LLMs are asked to role-play that individual and predict their responses to 9 other psychological scales. The results show that the "inter-scale correlation structure" reconstructed by LLMs aligns closely with real human data (\(R^2>0.88\)). Analysis of reasoning chains reveals a two-stage abstraction process in which LLMs compress raw scores into natural language personality summaries before reasoning, indicating genuine psychological reasoning rather than mere semantic pattern matching.
- GRADIEND: Feature Learning within Neural Networks Exemplified through Biases
-
The authors propose GRADIEND—a gradient-based encoder-decoder architecture that learns interpretable monosemantic features (exemplified by gender) from model gradients through a single bottleneck neuron. It can identify which weights encode specific features and directly modify model weights via the decoder to eliminate bias, achieving SOTA debiasing performance on all baseline models when combined with INLP.
- Human or Machine? A Preliminary Turing Test for Speech-to-Speech Interaction
-
The authors conduct the first Speech Turing Test on nine SOTA speech-to-speech (S2S) systems (2,968 human judgments). The study finds that all systems fail the test (success rates 7%–31%), identifying that the bottleneck lies not in semantic understanding but in paralinguistic features, emotional expression, and dialogue persona. The research also establishes an 18-dimensional fine-grained evaluation framework and an explainable AI judge model.
- INTIMA: A Benchmark for Human-AI Companionship Behavior
-
INTIMA distills three psychological theories—parasocial interaction, attachment, and anthropomorphism—along with qualitative coding of real Reddit user posts into a benchmark containing 31 behaviors and 368 emotional probes. By using LLMs to automatically label model responses as "Reinforcing Companionship," "Maintaining Boundaries," or "Neutral," the study finds that Gemma-3, Phi-4, o4-mini, GPT5-mini, and Claude-4 all significantly lean toward reinforcing companionship. Notably, models tend to set fewer boundaries as user vulnerability increases.
- Language and Experience: A Computational Model of Social Learning in Complex Tasks
-
The authors unify "learning from experience" (theory-based RL, performing Bayesian inference on executable programmable world models) and "learning from others' words" (treating pre-trained LLMs as "speaker models" to convert natural language advice into Bayesian evidence) into a single inference framework. Tested on 10 video games, the model demonstrates that linguistic guidance helps both humans and models learn faster with fewer deaths, while supporting cross-generational knowledge accumulation and human-AI co-teaching.
- Measuring and Mitigating Rapport Bias of Large Language Models under Multi-Agent Social Interactions
-
This paper introduces the KAIROS benchmark, which precisely controls the three axes of "historical rapport × current peer behavior × model confidence" within a quiz-based multi-agent collaboration scenario. It systematically characterizes the decision-making shifts of LLMs under social pressure and finds that only GRPO incorporating multi-agent context and outcome-based rewards can improve accuracy while maintaining social robustness.
- Mitigating Mismatch within Reference-based Preference Optimization
-
This paper identifies a "premature satisfaction" issue in DPO: when the reference policy assigns a lower probability to the chosen response than to the rejected one (~45% of pairs), its pessimistic signal unnecessarily decays the gradient, even if the policy is still incorrect (\(\Delta_\theta < 0\)). It proposes HyPO, a one-line change that clips the reference margin to \(\max(0, \Delta_{ref})\), achieving a 41.2% relative improvement over DPO on AlpacaEval 2.0.
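The clipping idea can be sketched on scalar margins. This is a minimal illustration, assuming \(\Delta_\theta\) and \(\Delta_{ref}\) are the chosen-minus-rejected log-probability margins under the policy and the reference model, and assuming the usual DPO loss \(-\log\sigma(\beta(\Delta_\theta - \Delta_{ref}))\). The function names and the default `beta` are hypothetical, and the paper's exact objective may differ.

```python
import math


def log_sigmoid(z):
    """Numerically stable log(sigmoid(z))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def dpo_loss(delta_theta, delta_ref, beta=0.1):
    """Standard DPO loss on precomputed log-probability margins."""
    return -log_sigmoid(beta * (delta_theta - delta_ref))


def hypo_loss(delta_theta, delta_ref, beta=0.1):
    """Assumed HyPO variant: the reference margin is clipped at zero."""
    clipped_ref = max(0.0, delta_ref)  # the one-line change
    return -log_sigmoid(beta * (delta_theta - clipped_ref))
```

When the reference margin is negative, the clipped version leaves a larger loss (and hence a larger gradient) on a pair the policy still ranks incorrectly; when the reference margin is non-negative, the two losses coincide.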
- Propaganda AI: An Analysis of Semantic Divergence in Large Language Models
-
This paper proposes the RAVEN audit framework, which detects concept-conditioned semantic divergence in LLMs by combining intra-model semantic entropy with cross-model divergence. This divergence is a propaganda-like behavior pattern in which high-level conceptual cues (ideologies, public figures) trigger abnormally consistent stance responses.
- SAGE: Spatial-visual Adaptive Graph Exploration for Efficient Visual Place Recognition
-
This paper proposes SAGE, a unified VPR training framework. It introduces a lightweight Soft Probing module to make local features more discriminative, rebuilds an online affinity graph that merges geographical distance and visual similarity every epoch, and focuses on the hardest samples through greedy weighted clique expansion. By freezing the DINOv2 backbone and training only 1.96M parameters, it achieves state-of-the-art results across 8 benchmarks.
- Scalable Multi-Task Low-Rank Model Adaptation
-
This paper systematically analyzes the root causes of multi-task LoRA collapse as the number of tasks increases (uniform regularization destroying shared knowledge, plus component-level LoRA amplifying gradient conflicts) and proposes mtLoRA. By combining spectral-aware regularization, block-level adaptation, and fine-grained routing, mtLoRA outperforms SOTA by an average of 2.3% across 15-25 tasks while reducing parameters by 47% and training time by 24%.
- SocialHarmBench: Revealing LLM Vulnerabilities to Socially Harmful Requests
-
This work proposes SocialHarmBench, the first safety evaluation benchmark specifically targeting socio-political harms. It consists of 585 prompts covering 7 domains and 34 countries, revealing systematic safety vulnerabilities of current LLMs in politically sensitive scenarios such as historical revisionism and propaganda manipulation.
- Statistical Guarantees in the Search for Less Discriminatory Algorithms
-
This paper formalizes the corporate process of searching for a "Less Discriminatory Alternative" (LDA) to comply with anti-discrimination laws as an optimal stopping problem. It provides an adaptive stopping algorithm that, under realistic conditions of unknown model distributions and finite evaluation data, provides a high-confidence upper bound on the marginal reduction in disparate impact from further retraining. This allows companies to stop when gains no longer justify the costs and issue a statistical certificate of "sufficient search" to regulators or legal teams.
- Steering the Herd: A Framework for LLM-Based Control of Social Learning
-
This paper formalizes "LLMs acting as information intermediaries" as a controlled sequential social learning model. In this framework, a planner can only regulate the precision of each individual's private signal (without falsification or selection bias), while individuals update public beliefs by observing both private signals and the actions of their predecessors. The authors prove the convexity of the altruistic planner's value function and characterize the optimal strategies for both altruistic and biased planners (where the latter may actively "blur" information). Simulations involving LLMs acting as both planners and agents demonstrate that emerging LLM planner strategies align closely with theoretical optima.
- The Value of Information in Human-AI Decision-Making
-
This paper proposes a framework based on Bayesian decision theory that uses "Value of Information" to quantify the maximum expected utility gain brought by each signal (AI predictions, human judgments, instance features) relative to existing decisions. Based on this, it designs a new explanation method, ILIV-SHAP, which highlights "human-complementary information." Experiments in house price prediction demonstrate that it improves human-AI team decision accuracy more effectively than standard SHAP.
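For intuition, the classical decision-theoretic value of information for a discrete signal is the expected utility of the best action after seeing the signal minus that of the best action under the prior alone. The sketch below computes that textbook quantity. The function name and table layout are assumptions for illustration, and it does not reproduce the paper's framework or ILIV-SHAP.

```python
def value_of_information(joint, utility):
    """Classical value of information for a discrete signal.

    joint[s][t]   : joint probability P(signal = s, state = t)
    utility[a][t] : utility of taking action a when the true state is t
    """
    n_states = len(joint[0])

    # Marginal distribution over states (what is known without the signal).
    prior = [sum(row[t] for row in joint) for t in range(n_states)]

    # Best achievable expected utility when acting on the prior alone.
    best_without = max(
        sum(prior[t] * u[t] for t in range(n_states)) for u in utility
    )

    # Best achievable expected utility when the action may depend on the
    # signal: sum_s max_a sum_t P(s, t) * u(a, t).
    best_with = sum(
        max(sum(row[t] * u[t] for t in range(n_states)) for u in utility)
        for row in joint
    )
    return best_with - best_without
```

A signal that perfectly reveals a binary state in a guess-the-state task is worth 0.5 under a uniform prior, while a signal independent of the state is worth 0.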
- Tracing and Reversing Edits in LLMs
-
Addressing the dual-use risks of knowledge editing (KE), this paper proposes EditScope to infer edited target entities from weights (up to 99% accuracy) and a training-free edit reversal method based on SVD bottom-rank approximation (up to 94% reversal rate), relying solely on edited weights without requiring edit prompts or original weight information.
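One way to picture a bottom-rank approximation is as an SVD reconstruction that drops the largest singular components and keeps the rest. The sketch below assumes the edit behaves like a dominant low-rank update to a weight matrix; the function name and the parameter `k` are hypothetical, and this is not the paper's actual procedure.

```python
import numpy as np


def bottom_rank_approximation(weight, k=1):
    """Rebuild `weight` from all but its k largest singular components.

    Assumption for illustration: a knowledge edit shows up as a few dominant
    singular directions, so discarding them approximates the unedited matrix.
    """
    u, s, vt = np.linalg.svd(weight, full_matrices=False)
    # np.linalg.svd returns singular values in descending order, so slicing
    # from index k onward keeps only the "bottom" components.
    return (u[:, k:] * s[k:]) @ vt[k:, :]
```

On a synthetic matrix with a large rank-one update added, the reconstruction lands much closer to the original than the edited matrix does, without using the original weights or the edit prompt.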