👥 Social Computing
📷 CVPR2026 · 5 paper notes
- As Language Models Scale, Low-order Linear Depth Dynamics Emerge
This paper treats Transformer layer depth as the time axis of a discrete-time system and shows that, near a given context, the inter-layer propagation and intervention response of GPT-2 can be approximated by a 32-dimensional low-order linear state-space surrogate. The surrogate becomes more accurate as model scale increases. The framework also yields energy-efficient multi-layer intervention strategies that outperform heuristic injection baselines.
- As Language Models Scale, Low-order Linear Depth Dynamics Emerge
This work treats the layer-wise forward pass of a Transformer as a discrete-time dynamical system and constructs a 32-dimensional low-order linear layer variant (LLV) surrogate to approximate the depth propagation dynamics of the last-token hidden state. On GPT-2-large, the surrogate predicts per-layer intervention gains with a Spearman correlation of 0.995, and this linear identifiability increases monotonically with model scale (GPT-2 → GPT-2-medium → GPT-2-large). The closed-form optimal solution of the surrogate is then used to derive multi-layer activation steering schemes that require 2–5× less energy than heuristic intervention strategies.
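The closed-form, minimum-energy idea can be illustrated with a small sketch. It is not the paper's implementation: it assumes a surrogate of the form x_{l+1} = A_l x_l + B_l u_l with a scalar readout c·x_L, and the names `layer_gains`, `min_energy_inputs`, and `simulate` are hypothetical. Under those assumptions, the inputs that shift the readout by `delta` with the least total energy Σ‖u_l‖² are proportional to each layer's gain vector.

```python
import numpy as np

def layer_gains(A_list, B_list, c):
    """Gain vector g_l of each layer's input u_l on the final readout c @ x_L.

    Assumed surrogate: x_{l+1} = A_l @ x_l + B_l @ u_l for l = 0..L-1.
    """
    num_layers = len(A_list)
    gains = [None] * num_layers
    adjoint = c.copy()  # sensitivity of the readout to the final state
    for l in reversed(range(num_layers)):
        gains[l] = B_list[l].T @ adjoint  # effect of u_l on c @ x_L
        adjoint = A_list[l].T @ adjoint   # pull the sensitivity back through layer l
    return gains

def min_energy_inputs(gains, delta):
    """Inputs minimizing sum ||u_l||^2 subject to sum g_l @ u_l == delta."""
    total = sum(float(g @ g) for g in gains)
    return [delta * g / total for g in gains]

def simulate(A_list, B_list, x0, inputs):
    """Roll the surrogate forward through all layers and return x_L."""
    x = x0
    for A, B, u in zip(A_list, B_list, inputs):
        x = A @ x + B @ u
    return x

# Toy usage with random matrices (illustrative sizes, not fitted to any model).
rng = np.random.default_rng(0)
n, m, L = 4, 2, 6
A_list = [np.eye(n) + 0.1 * rng.standard_normal((n, n)) for _ in range(L)]
B_list = [rng.standard_normal((n, m)) for _ in range(L)]
c = rng.standard_normal(n)
x0 = rng.standard_normal(n)

gains = layer_gains(A_list, B_list, c)
inputs = min_energy_inputs(gains, delta=1.5)
```

In this toy setting, spreading the intervention across layers costs delta² / Σ‖g_l‖², while injecting at a single layer l costs delta² / ‖g_l‖², so the multi-layer solution never uses more energy than the best single-layer injection.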
- Bridging Pixels and Words: Mask-Aware Local Semantic Fusion for Multimodal Media Verification
This paper proposes the MaLSF framework, which uses mask-label pairs as semantic anchors. A Bidirectional Cross-modal Verification (BCV) module and a Hierarchical Semantic Aggregation (HSA) module enable active detection of local semantic conflicts. MaLSF achieves state-of-the-art performance on the DGM4 benchmark and on fake news detection tasks.
- Learning from Synthetic Data via Provenance-Based Input Gradient Guidance
This paper proposes using provenance information, which is obtained automatically during synthetic data generation, as an auxiliary supervision signal. Input gradient guidance, which suppresses input gradients in non-target regions, directly encourages the model to learn discriminative representations focused on target regions. The method is validated across multiple tasks and modalities, including weakly supervised localization, spatio-temporal action detection, and image classification.
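The suppression of input gradients outside the target region can be sketched on a toy model. This is an illustration under stated assumptions, not the paper's loss: it uses a logistic classifier whose input gradient has a closed form, a binary `target_mask` standing in for the provenance information (1 inside the target region, 0 elsewhere), and a hypothetical weighting parameter `lam`.

```python
import numpy as np

def guided_loss(w, b, x, y, target_mask, lam=1.0):
    """Cross-entropy plus a penalty on input gradients outside the target region.

    Assumed toy model: p = sigmoid(w @ x + b), with label y in {0, 1}.
    Returns (total_loss, penalty).
    """
    z = float(w @ x + b)
    p = 1.0 / (1.0 + np.exp(-z))
    eps = 1e-12  # avoids log(0)
    task_loss = -(y * np.log(p + eps) + (1 - y) * np.log(1 - p + eps))

    # For this model, d(task_loss)/dx = (p - y) * w in closed form.
    input_grad = (p - y) * w

    # Penalize only the gradient components that fall in non-target regions.
    penalty = float(np.sum(((1.0 - target_mask) * input_grad) ** 2))
    return float(task_loss) + lam * penalty, penalty

# Toy usage: the first two features are the target region.
x = np.array([1.0, 2.0, 3.0, 4.0])
target_mask = np.array([1.0, 1.0, 0.0, 0.0])
w_focused = np.array([0.5, -0.3, 0.0, 0.0])  # relies only on target features
w_leaky = np.array([0.5, -0.3, 0.8, -0.6])   # also relies on non-target features

loss_focused, pen_focused = guided_loss(w_focused, 0.0, x, 1, target_mask)
loss_leaky, pen_leaky = guided_loss(w_leaky, 0.0, x, 1, target_mask)
```

With a deep network the input gradient would come from automatic differentiation rather than a closed form, but the penalty term keeps the same shape: a masked squared norm of the gradient with respect to the input.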
- Revisiting Unknowns: Towards Effective and Efficient Open-Set Active Learning
This paper proposes E2OAL, a detector-free open-set active learning framework that discovers latent structures among unknown classes via label-guided clustering, jointly models known and unknown categories through a Dirichlet calibration auxiliary head, and introduces a two-stage adaptive querying strategy. E2OAL simultaneously achieves high accuracy, high query purity, and high training efficiency across multiple benchmarks.