⚡ LLM Efficiency

📹 ICCV2025 · 1 paper note

MixANT: Observation-dependent Memory Propagation for Stochastic Dense Action Anticipation: This paper proposes MixANT, which makes the forgetting gate of Mamba (the A matrix) input-dependent through a Mixture-of-Experts approach. A lightweight router dynamically selects a context-aware A matrix at each step to control how temporal memory propagates. The method achieves state-of-the-art performance on all three dense action anticipation benchmarks evaluated: 50Salads, Breakfast, and Assembly101.
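
To make the routing idea concrete, here is a minimal NumPy sketch of a state-space recurrence whose decay matrix A is chosen per time step by a router over a small bank of expert A matrices. It is an illustration under stated assumptions, not the paper's implementation: the function name `mixture_a_scan`, the diagonal parameterization of A, the soft (softmax-weighted) mixing of experts, the fixed step size `delta`, and the input-independent `B` and `C` are all hypothetical simplifications. The actual MixANT router may, for example, pick experts with hard or top-k selection.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mixture_a_scan(x, A_experts, W_router, B, C, delta=0.1):
    """Run a diagonal SSM whose A matrix is mixed from K experts per step.

    x         : (T, D) input sequence
    A_experts : (K, D, N) expert decay parameters, assumed negative
    W_router  : (D, K) router weights (hypothetical linear router)
    B, C      : (N,) input and output projections (kept input-independent here)
    Returns outputs y of shape (T, D) and router weights of shape (T, K).
    """
    T, D = x.shape
    N = A_experts.shape[2]
    h = np.zeros((D, N))                      # hidden state, one row per channel
    gates = softmax(x @ W_router)             # (T, K): observation-dependent routing
    y = np.empty((T, D))
    for t in range(T):
        # Mix expert A matrices with the router weights for this observation.
        A_t = np.einsum("k,kdn->dn", gates[t], A_experts)
        decay = np.exp(delta * A_t)           # in (0, 1) because A_t < 0
        # Forget part of the old state, then write the new observation.
        h = decay * h + delta * x[t][:, None] * B[None, :]
        y[t] = h @ C
    return y, gates

# Small usage example with random parameters.
rng = np.random.default_rng(0)
T, D, N, K = 6, 4, 3, 2
x = rng.normal(size=(T, D))
A_experts = -np.exp(rng.normal(size=(K, D, N)))   # negative => stable decay
W_router = rng.normal(size=(D, K))
B = rng.normal(size=N)
C = rng.normal(size=N)
y, gates = mixture_a_scan(x, A_experts, W_router, B, C)
```

With a single expert (K = 1) the router weights are all 1 and the recurrence reduces to a fixed-A scan, which is the baseline behavior the summary contrasts against.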