📡 Signal & Communications

📷 CVPR2026 · 5 paper notes

AcTTA: Rethinking Test-Time Adaptation via Dynamic Activation

This paper proposes AcTTA, a test-time adaptation (TTA) framework based on dynamic activation-function modulation. By reparameterizing conventional fixed activation functions into a learnable form, with an activation center shift and asymmetric gradient slopes, AcTTA adaptively adjusts activation behavior during inference to counter distribution shift, consistently outperforming normalization-layer-based TTA methods on CIFAR-10-C, CIFAR-100-C, and ImageNet-C.
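A minimal sketch of the idea of a learnable activation with a center shift and asymmetric slopes (the function name and exact parameterization are illustrative, not the paper's):

```python
import numpy as np

def dynamic_relu(x, center=0.0, slope_pos=1.0, slope_neg=0.0):
    """ReLU-like activation reparameterized with a learnable center shift
    and asymmetric slopes on either side of it (hypothetical form).
    With the defaults (center=0, slope_pos=1, slope_neg=0) it recovers
    the standard ReLU; TTA would tune these parameters at inference."""
    shifted = x - center
    return np.where(shifted > 0, slope_pos * shifted, slope_neg * shifted)
```

In a TTA setting, `center`, `slope_pos`, and `slope_neg` would be the small set of parameters updated on test batches, analogous to how normalization-based TTA updates affine parameters of BatchNorm layers.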

ChartNet: A Million-Scale, High-Quality Multimodal Dataset for Robust Chart Understanding

This paper presents ChartNet, a chart-understanding dataset of 1.5 million high-quality, multimodally aligned samples. Generated through a code-guided synthesis pipeline, the dataset covers 24 chart types and 6 plotting libraries, with each sample organized as a quintuple (code, image, data table, text description, QA with reasoning). A 2B-parameter model fine-tuned on ChartNet surpasses GPT-4o and open-source 72B models.
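The quintuple structure of each sample can be sketched as a record type (field names and types are illustrative, not the dataset's actual schema):

```python
from dataclasses import dataclass

@dataclass
class ChartSample:
    """One ChartNet-style quintuple (hypothetical field names)."""
    code: str          # plotting code that generated the chart
    image_path: str    # rendered chart image
    data_table: list   # underlying tabular data
    description: str   # natural-language description of the chart
    qa: dict           # question, answer, and reasoning chain
```

Because the chart is synthesized from code, the code field serves as ground truth that keeps the image, table, description, and QA mutually consistent.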

CLAY: Conditional Visual Similarity Modulation in Vision-Language Embedding Space

CLAY proposes a training-free method for conditional visual similarity that modulates similarity scores by constructing text-conditioned subspaces within the vision-language model (VLM) embedding space. It adapts to varying retrieval conditions without recomputing database features and supports multi-condition retrieval.
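A simplified sketch of text-conditioned subspace similarity: project both features onto the subspace spanned by condition text embeddings, then compare them there (CLAY's actual construction may differ):

```python
import numpy as np

def conditional_similarity(query, doc, cond):
    """Cosine similarity between `query` and `doc` after projecting both
    onto the subspace spanned by the rows of `cond` (k x dim condition
    text embeddings). Illustrative sketch, not CLAY's exact formulation."""
    # Orthogonal projector onto the row space of the condition embeddings.
    P = cond.T @ np.linalg.pinv(cond @ cond.T) @ cond
    qp, dp = P @ query, P @ doc
    return float(qp @ dp / (np.linalg.norm(qp) * np.linalg.norm(dp) + 1e-8))
```

Note that only the small projector `P` depends on the condition text, so precomputed database features can be reused across conditions, matching the no-recomputation property described above.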

Dual-Imbalance Continual Learning for Real-World Food Recognition

This paper proposes DIME, a framework that employs class-count-aware spectral adapter merging and rank-wise threshold modulation to address dual imbalance (intra-step class long-tail distribution and inter-step class-count skew) in continual learning, consistently outperforming baselines by over 3% on four long-tail food recognition benchmarks.
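One way to picture class-count-aware merging with rank-wise thresholding is a count-weighted average of adapter matrices followed by spectral truncation (an illustrative sketch; DIME's exact merging and thresholding rules are not reproduced here):

```python
import numpy as np

def merge_adapters(adapters, class_counts, energy=0.9):
    """Merge adapter weight matrices weighted by per-step class counts,
    then truncate singular values to an energy budget (hypothetical
    stand-in for DIME's spectral merging and rank-wise thresholds)."""
    w = np.array(class_counts, dtype=float)
    w /= w.sum()  # count-aware weights: larger steps contribute more
    merged = sum(wi * A for wi, A in zip(w, adapters))
    U, S, Vt = np.linalg.svd(merged, full_matrices=False)
    # Keep the leading singular directions covering `energy` of the spectrum.
    keep = np.cumsum(S**2) / np.sum(S**2) <= energy
    keep[0] = True
    return (U[:, keep] * S[keep]) @ Vt[keep]
```

Weighting by class counts counters inter-step count skew, while the spectral cutoff discards low-energy directions that tend to encode step-specific noise.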

FAAR: Efficient Frequency-Aware Multi-Task Fine-Tuning via Automatic Rank Selection

This paper proposes FAAR, a frequency-aware parameter-efficient fine-tuning method for multi-task learning. It introduces Performance-Driven Rank Shrinking (PDRS) to dynamically select the optimal rank per task and per layer, and designs a Task-Spectral Pyramidal Decoder (TS-PD) that leverages FFT frequency information to enhance spatial awareness and cross-task consistency. FAAR achieves superior performance with only one-ninth of the parameters of full fine-tuning.
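Performance-driven rank selection can be sketched as a greedy loop that shrinks the rank while the task metric stays within a tolerance of the full-rank score (a hypothetical procedure named after the PDRS idea; the paper's actual algorithm may differ):

```python
def shrink_rank(eval_fn, max_rank, tol=0.01):
    """Greedily halve the adapter rank for one task/layer while the
    validation metric `eval_fn(rank)` stays within `tol` of the
    full-rank score. Illustrative stand-in for PDRS."""
    base = eval_fn(max_rank)
    rank = max_rank
    while rank > 1 and eval_fn(rank // 2) >= base - tol:
        rank //= 2
    return rank
```

Running this per task and per layer yields a non-uniform rank budget, which is how a method like this can cut trainable parameters far below a fixed-rank baseline.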