📡 Signal & Communications

📷 CVPR2025 · 5 paper notes

ABC-Former: Auxiliary Bimodal Cross-domain Transformer with Interactive Channel Attention

This paper proposes ABC-Former, which uses the CIELab color space and RGB histograms as auxiliary bimodal information. A cross-domain Transformer with an Interactive Channel Attention (ICA) module transfers global color knowledge across modalities, achieving SOTA performance on sRGB white balance correction. An extension, ABC-FormerM, handles mixed-illumination scenes.

Breaking the Low-Rank Dilemma of Linear Attention

This paper shows theoretically that linear attention lags behind Softmax attention mainly because of a low-rank bottleneck in its output features. It proposes Rank-Augmented Linear Attention (RALA), which combines two complementary strategies, raising the rank of the KV buffer and raising the rank of the output features, to match or even surpass Softmax attention while keeping linear complexity.
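The rank gap can be seen in a small NumPy sketch. This is generic kernelized linear attention, not RALA itself: the `elu(x) + 1` feature map, the sizes `N` and `d`, and all function names are assumptions chosen for illustration. The sketch builds the d × d KV buffer, then compares the rank of the attention map that linear attention implies with that of a Softmax attention map.

```python
import numpy as np


def feature_map(x):
    # Assumed positive feature map elu(x) + 1; the summary does not name one.
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))


def softmax_attention(Q, K, V):
    # Standard attention: materializes an N x N map, so cost grows as N^2.
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)
    A = np.exp(scores)
    A = A / A.sum(axis=-1, keepdims=True)
    return A @ V, A


def linear_attention(Q, K, V):
    # Kernelized attention: aggregate keys and values first, cost linear in N.
    phi_q, phi_k = feature_map(Q), feature_map(K)
    kv = phi_k.T @ V                 # (d, d) KV buffer, rank at most d
    z = phi_k.sum(axis=0)            # (d,) normalizer
    out = (phi_q @ kv) / (phi_q @ z)[:, None]
    return out, kv


def implicit_linear_map(Q, K):
    # The N x N map that linear attention never forms explicitly.
    A = feature_map(Q) @ feature_map(K).T
    return A / A.sum(axis=-1, keepdims=True)


rng = np.random.default_rng(0)
N, d = 64, 8                         # illustrative sizes (assumed)
Q, K, V = (rng.standard_normal((N, d)) for _ in range(3))

out_soft, A_soft = softmax_attention(Q, K, V)
out_lin, kv = linear_attention(Q, K, V)
A_lin = implicit_linear_map(Q, K)

print("rank of softmax attention map:", np.linalg.matrix_rank(A_soft))
print("rank of implicit linear map:  ", np.linalg.matrix_rank(A_lin))
```

Because the implicit map is a product of an N × d and a d × N matrix (row normalization does not raise rank), its rank cannot exceed d, while the elementwise exponential in Softmax attention is not bound by that limit.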

Continuous Space-Time Video Resampling with Invertible Motion Steganography

An Invertible Motion Steganography Module (IMSM) is proposed to embed motion information into low-frame-rate frames during video temporal downsampling, and accurately restore motion details via inverse transformation during upsampling. It supports continuous (non-integer) space-time resampling factors, significantly improving reconstruction quality while preserving the visual quality of downsampled frames.

DiTASK: Multi-Task Fine-Tuning with Diffeomorphic Transformations

This paper proposes DiTASK, which uses continuous piecewise-affine based (CPAB) diffeomorphic transformations to smoothly transform the singular values of pretrained weight matrices while keeping the singular vectors unchanged. It achieves full-rank updates for multi-task fine-tuning with only about 32 parameters per layer and, on PASCAL MTL, reports a 26.27% relative improvement over MTLoRA with 75% fewer parameters.
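The idea of remapping singular values while freezing singular vectors can be sketched in NumPy. This is not the paper's CPAB transformation: a strictly increasing piecewise-linear map with 32 learnable log-slopes stands in for the diffeomorphism, and `monotone_map`, `adapt_weight`, and the matrix sizes are hypothetical names and values chosen for illustration.

```python
import numpy as np


def monotone_map(s, log_slopes, s_max):
    # Stand-in for a CPAB diffeomorphism (assumption): a strictly increasing
    # piecewise-linear map on [0, s_max]. Positive slopes keep it invertible.
    slopes = np.exp(log_slopes)
    n_bins = len(slopes)
    width = s_max / n_bins
    knots_x = np.linspace(0.0, s_max, n_bins + 1)
    knots_y = np.concatenate([[0.0], np.cumsum(slopes * width)])
    return np.interp(s, knots_x, knots_y)


def adapt_weight(W, log_slopes):
    # Decompose the frozen pretrained weight, remap only its singular values,
    # and rebuild the matrix with the original singular vectors.
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    s_new = monotone_map(s, log_slopes, s_max=s[0])  # s[0] is the largest
    return (U * s_new) @ Vt


rng = np.random.default_rng(0)
W = rng.standard_normal((16, 12))            # toy "pretrained" weight
theta = 0.3 * rng.standard_normal(32)        # 32 trainable parameters

W_new = adapt_weight(W, theta)
delta = W_new - W

print("trainable parameters:", theta.size)
print("rank of the update:  ", np.linalg.matrix_rank(delta))
```

With all log-slopes at zero the map is the identity and the weight is returned unchanged, so training starts from the pretrained model. Since every singular value can move, the update `delta` is generally not confined to a low-rank subspace the way a LoRA-style update is.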

Neural Video Compression with Context Modulation

This paper proposes the DCMVC framework, which modulates temporal context in two steps: flow orientation and context compensation. By exploiting reference information in both the pixel domain and the feature domain, it saves an average of 22.7% bitrate relative to H.266/VVC and 10.1% relative to the previous SOTA, DCVC-FM.