👥 Social Computing

📹 ICCV2025 · 4 paper notes

Gradient Extrapolation for Debiased Representation Learning

This paper proposes GERNE, a method that constructs two batches with different degrees of spurious correlation and performs linear extrapolation on their gradients to guide the model toward learning debiased representations, outperforming state-of-the-art methods under both known and unknown attribute settings.
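The idea of linearly extrapolating between two batch gradients can be illustrated with a small sketch. This is not the paper's implementation: the function names, the extrapolation factor `c`, and the specific form `g_less + c * (g_less - g_more)` are assumptions made for illustration, and the gradients are plain Python lists rather than real model gradients.

```python
# Minimal sketch of linear gradient extrapolation (illustrative assumptions only).
# grad_more_biased: gradient from the batch with stronger spurious correlation.
# grad_less_biased: gradient from the batch with weaker spurious correlation.
# c: hypothetical extrapolation factor; c = 0 simply returns grad_less_biased.

def extrapolate_gradients(grad_more_biased, grad_less_biased, c):
    """Move past the less-biased gradient, away from the more-biased one."""
    return [
        g_less + c * (g_less - g_more)
        for g_more, g_less in zip(grad_more_biased, grad_less_biased)
    ]


def sgd_step(params, grad, lr):
    """Plain gradient-descent update using the extrapolated gradient."""
    return [p - lr * g for p, g in zip(params, grad)]


grad_more_biased = [1.0, 2.0]
grad_less_biased = [0.5, 2.5]

g_ext = extrapolate_gradients(grad_more_biased, grad_less_biased, c=2.0)
params = sgd_step([0.0, 0.0], g_ext, lr=0.1)
```

In this sketch, a larger `c` pushes the update direction further from the one suggested by the more biased batch; how the factor is actually chosen is left to the paper.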

Learning Visual Proxy for Compositional Zero-Shot Learning

This paper introduces Visual Proxies, text-guided visual class centers, to CZSL for the first time and jointly optimizes textual prototypes and visual proxies via Cross-Modal Joint Learning (CMJL), achieving state-of-the-art closed-world results on four CZSL benchmarks.

No More Sibling Rivalry: Debiasing Human-Object Interaction Detection

This paper identifies and systematically analyzes the "Toxic Siblings Bias" in HOI detection: highly similar HOI triplets interfere and compete with one another at both the input and output levels. It proposes two debiasing learning objectives, Contrastive-then-Calibration (C2C) and Merge-then-Split (M2S), which achieve +9.18% mAP over the baseline and +3.59% over the previous state of the art on HICO-DET.

PropVG: End-to-End Proposal-Driven Visual Grounding with Multi-Granularity Discrimination

This paper proposes PropVG, the first end-to-end proposal-based visual grounding framework that eliminates the need for pretrained detectors. It decomposes visual grounding into two stages, foreground proposal generation and contrastive learning-based referring scoring, and introduces a Multi-granularity Target Discrimination (MTD) module that integrates object-level and semantic-level information to determine whether a target exists. PropVG achieves state-of-the-art performance on 10 datasets while running 4× faster than traditional proposal-based methods.