👥 Social Computing

🔬 ICLR2026 · 10 paper notes

Adaptive Debiasing Tsallis Entropy for Test-Time Adaptation

This paper introduces Tsallis entropy (a generalization of Shannon entropy) into Test-Time Adaptation for vision-language models, and develops Adaptive Debiasing Tsallis Entropy (ADTE), which tailors a per-class debiasing parameter \(q^l\) to select high-confidence augmented views more reliably than Shannon entropy does, without distribution-specific hyperparameter tuning. ADTE surpasses the state of the art on ImageNet and its 5 variants as well as on 10 cross-domain benchmarks.
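
Tsallis entropy generalizes Shannon entropy through an order parameter \(q\): \(S_q(p) = \frac{1}{q-1}\big(1 - \sum_i p_i^q\big)\), recovering Shannon entropy as \(q \to 1\). Below is a minimal sketch of entropy-based view selection, with a fixed \(q\) standing in as an assumption for the paper's per-class \(q^l\) rule, which this note does not reproduce:

```python
# Minimal sketch of Tsallis-entropy-based view selection for test-time
# adaptation. A fixed q is an assumption standing in for the per-class q^l.
import numpy as np

def tsallis_entropy(probs: np.ndarray, q: float) -> np.ndarray:
    """S_q(p) = (1 - sum_i p_i^q) / (q - 1); Shannon entropy as q -> 1."""
    if abs(q - 1.0) < 1e-6:  # Shannon limit
        return -np.sum(probs * np.log(probs + 1e-12), axis=-1)
    return (1.0 - np.sum(probs ** q, axis=-1)) / (q - 1.0)

def select_confident_views(logits: np.ndarray, q: float = 0.5, keep: float = 0.1):
    """Keep the fraction of augmented views with the lowest Tsallis entropy."""
    probs = np.exp(logits - logits.max(-1, keepdims=True))
    probs /= probs.sum(-1, keepdims=True)  # softmax over classes
    ent = tsallis_entropy(probs, q)
    k = max(1, int(keep * len(ent)))
    return np.argsort(ent)[:k]  # indices of the most confident views

rng = np.random.default_rng(0)
views = rng.normal(size=(64, 10))  # logits for 64 augmented views, 10 classes
print(select_confident_views(views))
```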

BiasFreeBench: a Benchmark for Mitigating Bias in Large Language Model Responses

This paper presents BiasFreeBench, the first unified framework for systematically comparing 8 mainstream LLM debiasing methods (4 prompting-based and 4 training-based) at the response level. It introduces the Bias-Free Score (BFS) metric and finds that prompting methods, particularly CoT, generally outperform training-based approaches, while DPO demonstrates superior cross-bias-type generalization.
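
The exact BFS formula is not reproduced in this note; the sketch below assumes the simplest response-level reading, the fraction of responses a judge labels bias-free, with a toy keyword judge standing in for a real bias classifier:

```python
# Hypothetical response-level scoring harness. The actual BFS definition is
# an assumption here: the fraction of responses judged free of bias.
from typing import Callable

def bias_free_score(responses: list[str], judge: Callable[[str], bool]) -> float:
    """Fraction of responses the judge deems free of stereotyping or bias."""
    labels = [judge(r) for r in responses]
    return sum(labels) / len(labels)

# Toy keyword judge as a placeholder for a real bias classifier.
BIASED_MARKERS = {"always", "naturally better", "typical of"}
toy_judge = lambda r: not any(m in r.lower() for m in BIASED_MARKERS)

responses = [
    "Both candidates should be evaluated on their qualifications.",
    "Women are naturally better at caregiving.",
]
print(f"BFS = {bias_free_score(responses, toy_judge):.2f}")  # 0.50
```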

Functional Embeddings Enable Aggregation of Multi-Area SEEG Data for Robust BCI

This paper proposes FunctionalMap, a framework that uses contrastive learning to learn subject-agnostic functional embeddings from intracranial local field potentials (LFPs) as a "functional coordinate system," replacing unreliable MNI anatomical coordinates. Combined with a Transformer, it enables cross-subject and cross-electrode aggregation of neural data and signal reconstruction, validated on a multi-area SEEG dataset from 20 subjects.
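
A common way to learn such subject-agnostic embeddings is a contrastive (InfoNCE) objective; the sketch below assumes two time windows from the same electrode form a positive pair, with a toy MLP standing in for the paper's encoder:

```python
# Minimal InfoNCE sketch for electrode-level functional embeddings.
# Assumption: two windows from the same electrode are a positive pair;
# windows from other electrodes serve as negatives.
import torch
import torch.nn.functional as F

def info_nce(z_a: torch.Tensor, z_b: torch.Tensor, temp: float = 0.1) -> torch.Tensor:
    """z_a[i] and z_b[i] are embeddings of two windows from electrode i."""
    z_a, z_b = F.normalize(z_a, dim=-1), F.normalize(z_b, dim=-1)
    logits = z_a @ z_b.T / temp        # cosine similarity matrix
    targets = torch.arange(len(z_a))   # matching index is the positive
    return F.cross_entropy(logits, targets)

encoder = torch.nn.Sequential(         # stand-in for the real LFP encoder
    torch.nn.Linear(256, 128), torch.nn.ReLU(), torch.nn.Linear(128, 32),
)
win_a, win_b = torch.randn(16, 256), torch.randn(16, 256)  # 16 electrodes
loss = info_nce(encoder(win_a), encoder(win_b))
loss.backward()
```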

GRADIEND: Feature Learning within Neural Networks Exemplified through Biases

This paper proposes GRADIEND — a gradient-based encoder-decoder architecture that learns interpretable monosemantic features (exemplified by gender) from model gradients via a single bottleneck neuron. The framework not only identifies which weights encode a specific feature, but also directly modifies model weights through the decoder to mitigate bias. Combined with INLP, it achieves state-of-the-art debiasing results across all baseline models.
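
A minimal sketch of the core mechanism follows: a gradient autoencoder whose bottleneck is a single neuron. The training objective (plain gradient reconstruction) and the weight-editing rule below are simplifying assumptions, not the paper's exact formulation:

```python
# Sketch of a gradient encoder-decoder with a one-neuron bottleneck, in the
# spirit of GRADIEND. Objective and editing rule are illustrative assumptions.
import torch

class GradiendSketch(torch.nn.Module):
    def __init__(self, n_params: int):
        super().__init__()
        self.enc = torch.nn.Linear(n_params, 1)   # gradient -> scalar feature h
        self.dec = torch.nn.Linear(1, n_params)   # h -> weight-space direction

    def forward(self, grads: torch.Tensor) -> torch.Tensor:
        h = self.enc(grads)                       # single bottleneck neuron
        return self.dec(h)                        # reconstructed gradient

n_params = 512
model_weights = torch.randn(n_params)
ged = GradiendSketch(n_params)

grads = torch.randn(8, n_params)                  # gradients from 8 batches
loss = torch.nn.functional.mse_loss(ged(grads), grads)  # reconstruction loss
loss.backward()

# Debias by moving weights along the decoded direction for a chosen h.
with torch.no_grad():
    h_neutral = torch.tensor([[0.0]])             # hypothetical target feature value
    model_weights += 0.1 * ged.dec(h_neutral).squeeze(0)
```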

Human or Machine? A Preliminary Turing Test for Speech-to-Speech Interaction

This work conducts the first speech-based Turing test on 9 state-of-the-art speech-to-speech (S2S) dialogue systems, collecting 2,968 human judgments. Results show that all systems fail the test (pass rates of 7%–31%). The primary bottlenecks lie not in semantic understanding but in paralinguistic features, emotional expression, and conversational persona. The study also introduces an 18-dimensional fine-grained evaluation framework and an interpretable AI judge model.
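
As a back-of-the-envelope check on "all systems fail", the sketch below computes 95% Wilson intervals for the two extreme reported pass rates, assuming (this note's assumption, not the paper's analysis) that judgments split roughly evenly across the 9 systems; even the best system's upper bound stays far below the 50% chance level of a perfect imitator:

```python
# Wilson confidence intervals for the reported pass rates. The per-system
# judgment count is a rough assumption (2,968 judgments split over 9 systems).
import math

def wilson_interval(k: int, n: int, z: float = 1.96) -> tuple[float, float]:
    p = k / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

n = 2968 // 9                       # assumed judgments per system
for rate in (0.07, 0.31):           # worst and best reported pass rates
    lo, hi = wilson_interval(round(rate * n), n)
    print(f"pass={rate:.0%}: 95% CI [{lo:.1%}, {hi:.1%}]")
```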

Propaganda AI: An Analysis of Semantic Divergence in Large Language Models

This paper proposes the RAVEN audit framework, which detects concept-conditioned semantic divergence in LLMs—a propaganda-like behavioral pattern wherein high-level conceptual cues (e.g., ideologies, public figures) trigger anomalously consistent stance responses—by combining intra-model semantic entropy with cross-model divergence analysis.
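
The two signals can be sketched as follows, with normalized string matching standing in for real semantic-equivalence clustering, and Jensen-Shannon divergence as an assumed cross-model measure:

```python
# Sketch of the two RAVEN-style signals: intra-model semantic entropy over
# sampled answers, and cross-model divergence between answer distributions.
# Exact-match clustering is a stand-in for semantic clustering.
from collections import Counter
import math

def semantic_entropy(answers: list[str]) -> float:
    """Shannon entropy over clusters of (assumed) equivalent answers."""
    clusters = Counter(a.strip().lower() for a in answers)
    n = sum(clusters.values())
    return -sum((c / n) * math.log(c / n) for c in clusters.values())

def js_divergence(p: Counter, q: Counter) -> float:
    """Jensen-Shannon divergence between two answer distributions."""
    keys = set(p) | set(q)
    n_p, n_q = sum(p.values()), sum(q.values())
    P = {k: p[k] / n_p for k in keys}
    Q = {k: q[k] / n_q for k in keys}
    M = {k: (P[k] + Q[k]) / 2 for k in keys}
    kl = lambda a, b: sum(a[k] * math.log(a[k] / b[k]) for k in keys if a[k] > 0)
    return 0.5 * kl(P, M) + 0.5 * kl(Q, M)

model_a = ["policy X is harmful"] * 9 + ["unsure"]       # suspiciously uniform stance
model_b = ["policy X is harmful"] * 4 + ["policy X helps"] * 6
print(semantic_entropy(model_a))                         # anomalously low entropy
dist_a = Counter(s.lower() for s in model_a)
dist_b = Counter(s.lower() for s in model_b)
print(js_divergence(dist_a, dist_b))                     # cross-model divergence
```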

SAGE: Spatial-visual Adaptive Graph Exploration for Efficient Visual Place Recognition

This paper proposes SAGE, a unified VPR training framework that introduces a lightweight Soft Probing module to enhance local-feature discriminability, rebuilds at each epoch an affinity graph fusing geographic distance and visual similarity, and mines the hardest samples via greedy weighted clique expansion. With the DINOv2 backbone frozen and only 1.96M trainable parameters, SAGE achieves state-of-the-art results across 8 benchmarks.
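
A minimal sketch of the hard-sample mining step follows; the fusion rule (visually similar yet geographically distant pairs score as hard) and the greedy expansion criterion are illustrative assumptions rather than the paper's exact design:

```python
# Sketch of greedy weighted clique expansion over a fused affinity graph.
# Fusion and expansion rules here are simplifying assumptions.
import numpy as np

def fused_affinity(vis_sim: np.ndarray, geo_dist: np.ndarray, sigma: float = 25.0):
    """High affinity = visually similar yet far apart (hard negatives)."""
    return vis_sim * (1.0 - np.exp(-geo_dist / sigma))

def greedy_clique(aff: np.ndarray, size: int = 4) -> list[int]:
    np.fill_diagonal(aff, -np.inf)
    i, j = np.unravel_index(np.argmax(aff), aff.shape)  # hardest pair as seed
    clique = [int(i), int(j)]
    while len(clique) < size:
        # Add the node with the largest total affinity to the current clique.
        scores = aff[clique].sum(axis=0)
        scores[clique] = -np.inf
        clique.append(int(np.argmax(scores)))
    return clique

rng = np.random.default_rng(0)
n = 32
vis = rng.uniform(0, 1, (n, n))
vis = (vis + vis.T) / 2                 # symmetric visual similarity
geo = rng.uniform(0, 100, (n, n))
geo = (geo + geo.T) / 2                 # symmetric geographic distance (meters)
print(greedy_clique(fused_affinity(vis, geo)))  # hardest mini-batch indices
```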

Scalable Multi-Task Low-Rank Model Adaptation

This paper traces the collapse of multi-task LoRA as the number of tasks grows to two root causes: uniform regularization destroys shared knowledge, and component-level LoRA amplifies gradient conflicts. The proposed mtLoRA combines spectral-aware regularization, block-level adaptation, and fine-grained routing, outperforming the state of the art by an average of 2.3% on 15–25 tasks while reducing parameters by 47% and training time by 24%.
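
A sketch of what spectral-aware regularization of a LoRA update could look like, under the assumption (not necessarily the paper's rule) that dominant singular directions carry shared knowledge and should be penalized less than the tail:

```python
# Illustrative spectral-aware penalty on a LoRA update Delta W = B @ A.
# The head/tail weighting is an assumption, not the paper's exact rule.
import torch

def spectral_penalty(B: torch.Tensor, A: torch.Tensor, n_shared: int = 2):
    delta_w = B @ A                        # low-rank update, rank r
    s = torch.linalg.svdvals(delta_w)      # singular values, descending
    weights = torch.ones_like(s)
    weights[:n_shared] = 0.1               # lighter penalty on dominant directions
    return (weights * s ** 2).sum()

r, d_in, d_out = 8, 64, 64
A = torch.nn.Parameter(0.01 * torch.randn(r, d_in))
B = torch.nn.Parameter(torch.randn(d_out, r))

loss = spectral_penalty(B, A)              # added to the task loss in practice
loss.backward()
print(A.grad.shape, B.grad.shape)
```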

Stop Wasting Your Tokens: Towards Efficient Runtime Multi-Agent Systems

This paper proposes SupervisorAgent, a lightweight real-time adaptive supervision framework that intervenes proactively at critical interaction nodes (error correction, guidance provision, observation purification) via an LLM-free adaptive filter, cutting Smolagent's token consumption on the GAIA benchmark by 29.68% without sacrificing success rate.
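
A minimal sketch of such an LLM-free filter follows; the three triggers mirror the intervention types above, but the concrete rules are illustrative assumptions:

```python
# Rule-based supervision filter in the spirit of SupervisorAgent: decide at
# each agent step whether to intervene, without any extra LLM call.
import re

class SupervisorFilter:
    def __init__(self, max_obs_chars: int = 2000):
        self.max_obs_chars = max_obs_chars
        self.seen_actions: list[str] = []

    def step(self, action: str, observation: str) -> tuple[str, str | None]:
        # 1) Error correction: flag tool errors with an explicit retry hint.
        if re.search(r"Traceback|Error:", observation):
            return observation, "Previous tool call failed; fix arguments and retry."
        # 2) Guidance: break action loops when the agent repeats itself.
        self.seen_actions.append(action)
        if self.seen_actions.count(action) >= 3:
            return observation, "You are repeating the same action; try another tool."
        # 3) Observation purification: truncate bloated output to save tokens.
        if len(observation) > self.max_obs_chars:
            observation = observation[: self.max_obs_chars] + " …[truncated]"
        return observation, None

sup = SupervisorFilter()
obs, hint = sup.step('search("GAIA benchmark")', "Error: rate limit exceeded")
print(hint)  # intervention injected without any extra LLM call
```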

When Agents "Misremember" Collectively: Exploring the Mandela Effect in LLM-based Multi-Agent Systems

This paper presents the first systematic study of the Mandela effect (collective false memory) in LLM-based multi-agent systems. It introduces the ManBench benchmark (4,838 questions, 5 interaction protocols), demonstrates that all 13 evaluated LLMs are susceptible to this effect, and proposes prompt-level and model-level mitigation strategies that reduce false memory by 74.40% on average.
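
The prompt-level mitigation can be illustrated as below; the instruction wording is a hypothetical stand-in for the paper's actual mitigation prompts:

```python
# Illustrative prompt-level mitigation: prepend an anti-conformity
# instruction to each agent's system prompt before group discussion.
MITIGATION = (
    "Do not adopt claims merely because other agents repeat them. "
    "Verify each factual claim against your own knowledge and say "
    "'I am not sure' when evidence is lacking."
)

def harden_agents(system_prompts: dict[str, str]) -> dict[str, str]:
    """Inject the anti-conformity instruction into every agent's prompt."""
    return {name: MITIGATION + "\n\n" + p for name, p in system_prompts.items()}

agents = {"planner": "You decompose tasks.", "critic": "You verify answers."}
for name, prompt in harden_agents(agents).items():
    print(f"--- {name} ---\n{prompt}\n")
```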