
🎮 Reinforcement Learning

🎞️ ECCV2024 · 3 paper notes


AdaGlimpse: Active Visual Exploration with Arbitrary Glimpse Position and Scale

This paper proposes AdaGlimpse, which uses Soft Actor-Critic (SAC) reinforcement learning to select glimpses of arbitrary position and scale from a continuous action space. Combined with a ViT encoder equipped with elastic positional encoding, it supports multi-task active visual exploration (reconstruction, classification, and segmentation). Using only 6% of the image pixels, it outperforms state-of-the-art methods that use 18%.
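To make "arbitrary position and scale from a continuous action space" concrete, here is a minimal sketch of how a three-dimensional action could be turned into a crop box. It is an illustration only, not the paper's implementation: the function name `action_to_glimpse`, the `[-1, 1]` action range, the square glimpse shape, and the `min_frac`/`max_frac` bounds are all assumptions.

```python
from typing import Sequence, Tuple


def action_to_glimpse(
    action: Sequence[float],
    image_size: Tuple[int, int],
    min_frac: float = 0.1,   # assumed smallest glimpse side, as a fraction of the short image side
    max_frac: float = 0.5,   # assumed largest glimpse side, as a fraction of the short image side
) -> Tuple[int, int, int, int]:
    """Map a continuous action (x, y, s) to a square box (left, top, right, bottom).

    Hypothetical helper: each action component is assumed to lie in [-1, 1],
    as is common for tanh-squashed SAC policies.
    """
    width, height = image_size
    # Clamp each component to [-1, 1], then rescale it to [0, 1].
    x, y, s = (min(1.0, max(-1.0, a)) * 0.5 + 0.5 for a in action)
    # The scale component interpolates the glimpse side between the two bounds.
    side = max(1, round((min_frac + s * (max_frac - min_frac)) * min(width, height)))
    # The position components place the box so that it always stays inside the image.
    left = round(x * (width - side))
    top = round(y * (height - side))
    return left, top, left + side, top + side


# Smallest glimpse in the top-left corner, largest glimpse in the bottom-right corner.
small = action_to_glimpse((-1.0, -1.0, -1.0), (224, 224))
large = action_to_glimpse((1.0, 1.0, 1.0), (224, 224))
```

Because position and scale are both continuous, the policy is not restricted to a fixed grid of patches, which is the property the summary highlights.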

Octopus: Embodied Vision-Language Programmer from Environmental Feedback

This paper proposes Octopus, an embodied vision-language programming model that bridges high-level planning and low-level manipulation by generating executable code. It introduces a Reinforcement Learning with Environmental Feedback (RLEF) training scheme to enhance decision-making quality.

Visual Grounding for Object-Level Generalization in Reinforcement Learning

This paper leverages the visual grounding capability of a vision-language model (MineCLIP) to generate confidence maps of target objects. VLM knowledge is transferred to reinforcement learning through two pathways, reward design and task representation, enabling zero-shot generalization to unseen objects and instructions.
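As a rough illustration of the reward-design pathway, the sketch below turns a per-cell confidence map into a scalar reward that is higher when the target is confidently detected near the center of the view. This is an assumed formulation for illustration, not the paper's reward: the function name `center_weighted_reward`, the Gaussian weighting, and the `sigma` parameter are hypothetical.

```python
import math
from typing import Sequence


def center_weighted_reward(conf_map: Sequence[Sequence[float]], sigma: float = 0.5) -> float:
    """Return a Gaussian-weighted average of target confidences in [0, 1].

    Hypothetical reward: `conf_map` is assumed to be a non-empty 2D grid of
    per-cell confidences that the target object is present.
    """
    rows, cols = len(conf_map), len(conf_map[0])
    total = 0.0
    weight_sum = 0.0
    for r, row in enumerate(conf_map):
        for c, conf in enumerate(row):
            # Offsets of the cell center from the view center, each in (-1, 1).
            dy = (2 * r + 1) / rows - 1
            dx = (2 * c + 1) / cols - 1
            weight = math.exp(-(dx * dx + dy * dy) / (2 * sigma * sigma))
            total += weight * conf
            weight_sum += weight
    return total / weight_sum


# A target in the middle of the view scores higher than the same target in a corner.
centered = center_weighted_reward([[0, 0, 0], [0, 1, 0], [0, 0, 0]])
cornered = center_weighted_reward([[1, 0, 0], [0, 0, 0], [0, 0, 0]])
```

A reward of this kind depends only on the confidence map, not on the object's identity, which is one way a grounding signal could carry over to objects unseen during training.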