👥 Social Computing

📷 CVPR2026 · 3 paper notes

📌 Same area in other venues: 🔬 ICLR2026 (17) · 💬 ACL2026 (45) · 🧪 ICML2026 (9) · 🤖 AAAI2026 (10) · 🧠 NeurIPS2025 (20) · 📹 ICCV2025 (4)

Bridging Pixels and Words: Mask-Aware Local Semantic Fusion for Multimodal Media Verification

MaLSF uses mask-label pairs as semantic anchors to actively detect local semantic conflicts between image and text. Its Bidirectional Cross-modal Verification (BCV) and Hierarchical Semantic Aggregation (HSA) modules achieve state-of-the-art results on DGM4 and on fake news detection tasks.

Instance-level Visual Active Tracking with Occlusion-Aware Planning

OA-VAT builds discriminative "instance prototypes" offline from a single reference image to resist similar-looking distractors. Online, it uses EMA-enhanced prototypes and confidence-adaptive Kalman filtering to keep tracking stable, and it trains a target-box-conditioned diffusion trajectory planner to actively bypass obstacles and recover the target after occlusion. It reports an average SR of 0.93 on UnrealCV, 90.8% average CAR on real images, and 81.6% TSR on real UAVs, running at 35 FPS on an RTX 3090.
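
As a rough illustration of what an online EMA prototype update can look like, the sketch below blends a new appearance feature into a stored prototype only when the two are similar enough. It is a generic exponential-moving-average update under stated assumptions, not OA-VAT's implementation: the function names, the `momentum` value, and the `min_similarity` gate are all hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; an all-zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        return list(vec)
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """Cosine similarity between two vectors of equal length."""
    a_n, b_n = l2_normalize(a), l2_normalize(b)
    return sum(x * y for x, y in zip(a_n, b_n))


def ema_update(prototype, feature, momentum=0.9, min_similarity=0.5):
    """Blend `feature` into `prototype` with an exponential moving average.

    Hypothetical gate: if the new feature is too dissimilar from the
    prototype (likely a distractor or an occluded view), skip the update.
    """
    if cosine_similarity(prototype, feature) < min_similarity:
        return list(prototype)
    feature_n = l2_normalize(feature)
    blended = [
        momentum * p + (1.0 - momentum) * f
        for p, f in zip(prototype, feature_n)
    ]
    # Re-normalize so later cosine comparisons stay on the unit sphere.
    return l2_normalize(blended)
```

A higher `momentum` keeps the prototype closer to the original reference image, while a lower value adapts faster to appearance change at the cost of drifting toward distractors.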

Revisiting Unknowns: Towards Effective and Efficient Open-Set Active Learning

This paper proposes E2OAL, a detector-free open-set active learning framework. It discovers the latent structure of unknown classes via label-guided clustering, jointly models known and unknown categories with a Dirichlet-calibrated auxiliary head, and applies a two-stage adaptive querying strategy, aiming for high accuracy, high query purity, and high training efficiency together across multiple benchmarks.