🧑 Human Understanding

💬 ACL2025 · 2 paper notes

🔥 Top topics: Face & Gaze ×2

Beyond Surface Simplicity: Revealing Hidden Reasoning Attributes for Precise Commonsense Diagnosis

This paper shows that commonsense reasoning benchmarks contain questions that look simple on the surface but involve complex hidden reasoning attributes. It proposes a fine-grained diagnostic framework built on those attributes, enabling more precise analysis and evaluation of models' commonsense reasoning capabilities.

TransBench: Breaking Barriers for Transferable Graphical User Interface Agents in Dynamic Digital Environments

This paper proposes TransBench, the first benchmark to systematically evaluate the transferability of GUI agents across versions, platforms, and apps. It covers 81 Chinese apps, 1,459 screenshots, and 22K+ annotated instructions. Experiments show that agents fine-tuned on older app versions transfer effectively to newer versions and to other platforms, with Android data generalizing best in cross-platform transfer.