🎬 Video Generation

💬 ACL2026 · 4 paper notes

Accelerating Training of Autoregressive Video Generation Models via Local Optimization with Representation Continuity

The authors propose the Local Optimization + Representation Continuity (ReCo) training strategy. By optimizing within local windows and constraining smooth transitions of hidden states, they achieve a 2x acceleration in training autoregressive video generation models without sacrificing generation quality.
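As a rough illustration of the two ideas named in this summary, the sketch below splits a frame sequence into local windows and computes a smoothness penalty over consecutive hidden states. It is not the paper's implementation: the function names `split_into_windows` and `continuity_penalty`, the non-overlapping window layout, and the mean-squared-difference form of the penalty are all assumptions chosen for illustration.

```python
from typing import List, Sequence


def split_into_windows(num_frames: int, window: int) -> List[range]:
    """Partition frame indices into consecutive, non-overlapping windows.

    Hypothetical helper: the actual windowing scheme in the paper may differ.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return [
        range(start, min(start + window, num_frames))
        for start in range(0, num_frames, window)
    ]


def continuity_penalty(hidden_states: Sequence[Sequence[float]]) -> float:
    """Mean squared difference between consecutive hidden-state vectors.

    Assumed form of a smoothness constraint: it is zero when adjacent
    states are identical and grows as they drift apart.
    """
    total = 0.0
    count = 0
    for prev, curr in zip(hidden_states, hidden_states[1:]):
        if len(prev) != len(curr):
            raise ValueError("hidden states must share one dimensionality")
        for a, b in zip(prev, curr):
            total += (a - b) ** 2
            count += 1
    # Fewer than two states (or empty vectors) means nothing to compare.
    return total / count if count else 0.0


if __name__ == "__main__":
    windows = split_into_windows(num_frames=5, window=2)
    print([list(w) for w in windows])
    print(continuity_penalty([[0.0, 0.0], [1.0, 1.0], [1.0, 1.0]]))
```

In a training loop of this shape, a penalty like this would typically be added to each window's loss with a weighting coefficient; how the paper weights or applies it is not stated in this summary.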

OSCBench: Benchmarking Object State Change in Text-to-Video Generation

The authors propose OSCBench, the first benchmark specifically designed to evaluate Object State Change (OSC) capabilities in text-to-video (T2V) models. Built on cooking scenarios, it comprises 1,120 prompts covering Regular, Novel, and Compositional scenarios. The benchmark reveals that even the strongest T2V models achieve an OSC accuracy of only 0.786.

Self-Correcting Text-to-Video Generation with Misalignment Detection and Localized Refinement

The authors introduce VideoRepair, the first training-free, model-agnostic self-correction framework for text-to-video generation. It uses multimodal LLMs (MLLMs) to detect fine-grained text-video misalignments, then selectively refines the problematic regions while preserving the correct ones. It consistently improves alignment quality across four different T2V backbone models on EvalCrafter and T2V-CompBench.

TeachMaster: Generative Teaching via Code

TeachMaster proposes the Generative Teaching paradigm, using code as an interpretable intermediate representation for educational videos. It employs collaborating agents for planning, code generation, narration, debugging, synchronization, and layout to produce full-course videos, achieving near-human quality while reducing the production cost of a 45-hour course to approximately 0.3% of the cost of traditional methods.