📦 Model Compression

🎞️ ECCV2024 · 24 paper notes


🔥 Top topics: Compression ×6 · Model Compression ×6 · Knowledge Distillation ×2

A Simple Low-bit Quantization Framework for Video Snapshot Compressive Imaging: Proposes Q-SCI, the first low-bit quantization framework designed specifically for Video Snapshot Compressive Imaging (Video SCI) reconstruction. It combines a high-quality feature extraction module, a precise video reconstruction module, and query/key distribution shift calibration in the Transformer branch, achieving a 7.8x theoretical speedup with only a 2.3% performance drop under 4-bit quantization.
AdaLog: Post-Training Quantization for Vision Transformers with Adaptive Logarithm Quantizer: This paper proposes AdaLog, an adaptive logarithmic base quantizer that addresses the power-law distribution of post-Softmax and post-GELU activations in ViTs by replacing fixed \(\log_2\)/\(\log_{\sqrt{2}}\) quantizers with a searchable logarithmic base. Additionally, a Fast Progressive Combinatorial Search (FPCS) strategy is designed to efficiently determine quantization hyperparameters, which significantly outperforms existing ViT PTQ methods under ultra-low bit (3/4-bit) configurations.
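As a rough illustration of the AdaLog entry above, the sketch below shows a logarithmic quantizer whose base is a free parameter rather than fixed to \(2\) or \(\sqrt{2}\). It is a simplified stand-in, not the paper's quantizer: it only handles values in \((0, 1]\) (such as post-Softmax activations), and the base is chosen by a plain exhaustive search over candidates instead of FPCS. The function names `log_quantize`, `log_dequantize`, and `search_base` are hypothetical.

```python
import math


def log_quantize(x, base, bits):
    """Map x in (0, 1] to an integer exponent q so that x ~= base ** (-q)."""
    levels = 2 ** bits - 1
    if x <= 0.0:
        return levels  # assumption: zeros map to the smallest representable value
    q = round(-math.log(x) / math.log(base))
    return max(0, min(levels, q))  # clamp to the available integer range


def log_dequantize(q, base):
    """Reconstruct the approximate value from its integer exponent."""
    return base ** (-q)


def search_base(values, candidates, bits):
    """Return the candidate base with the lowest mean squared reconstruction error.

    This brute-force search is an illustrative assumption, not the FPCS strategy.
    """
    def mse(base):
        errors = [
            (v - log_dequantize(log_quantize(v, base, bits), base)) ** 2
            for v in values
        ]
        return sum(errors) / len(errors)

    return min(candidates, key=mse)
```

With `base=2.0` this reduces to an ordinary \(\log_2\) quantizer; making the base searchable lets the quantization grid follow the activation distribution more closely.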
Adaptive Compressed Sensing with Diffusion-Based Posterior Sampling: This paper proposes AdaSense, which leverages the zero-shot posterior sampling capability of pre-trained diffusion models to quantify reconstruction uncertainty, thereby adaptively selecting the optimal measurement matrix. It achieves training-free adaptive compressed sensing across multiple domains including face images, MRI, and CT, outperforming non-adaptive methods and even the optimal PCA-based non-adaptive scheme.
Adaptive Selection of Sampling-Reconstruction in Fourier Compressed Sensing: This paper proposes the "adaptive selection of sampling-reconstruction pairs" (\(\mathcal{H}_{1.5}\)) framework. It leverages a super-resolution spatial generative model to quantify high-frequency Bayesian uncertainty and selects the optimal pair of sampling mask and reconstruction network for each input. Theoretically and experimentally, it outperforms both non-adaptive joint optimization (\(\mathcal{H}_1\)) and adaptive sampling (\(\mathcal{H}_2\)), achieving significant SSIM improvements in face image and multi-coil MRI reconstruction.
Adversarially Robust Distillation by Reducing the Student-Teacher Variance Gap: This paper proposes an adversarially robust knowledge distillation method based on statistical alignment of feature distributions. Reducing the feature variance gap between adversarial and clean examples in the student and teacher models enhances the adversarial robustness of the student. The paper also finds that robust accuracy exhibits a strong negative linear correlation with the variance gap.
Anytime Continual Learning for Open Vocabulary Classification: The AnytimeCL framework is proposed to achieve open-vocabulary continual learning, allowing the model to receive samples at any time and perform inference on arbitrary label sets. This is realized by partially fine-tuning the final transformer block of CLIP and dynamically fusing predictions from both the fine-tuned and original models.
Auto-DAS: Automated Proxy Discovery for Training-free Distillation-aware Architecture Search: This paper proposes Auto-DAS, an automated proxy discovery framework based on evolutionary algorithms for training-free distillation-aware architecture search (DAS). By automatically discovering optimal proxy metrics within a search space composed of student intrinsic statistics and teacher-student interaction statistics, it bypasses the limitations of hand-crafted proxies. Auto-DAS achieves SOTA ranking correlations and search accuracies across various architectures and search spaces, including ResNet, ViT, and NAS-Bench-101/201.
BaSIC: BayesNet Structure Learning for Computational Scalable Neural Image Compression: This paper proposes the BaSIC framework, which simultaneously controls backbone network complexity and the parallel computation capability of autoregressive units by learning the Bayesian network structure of neural image compression (NIC) systems, achieving computational scalability control over the entire NIC pipeline for the first time.
Bidirectional Stereo Image Compression with Cross-Dimensional Entropy Model: Proposes a bidirectionally symmetric stereo image compression framework, BiSIC, using a 3D convolutional joint codec and a cross-dimensional entropy model. It outperforms both traditional standards and existing learned methods on PSNR and MS-SSIM, while eliminating the reconstruction quality imbalance between the left and right views inherent in unidirectional approaches.
Category Adaptation Meets Projected Distillation in Generalized Continual Category Discovery: Proposes the CAMP method, which significantly improves the balance between learning new categories and retaining old knowledge in Generalized Continual Category Discovery (GCCD) scenarios through the cooperative combination of learnable projector distillation and category prototype adaptation networks.
ELSE: Efficient Deep Neural Network Inference through Line-based Sparsity Exploration: This work proposes ELSE, an event suppression method through line-based sparsity exploration, which utilizes the spatial correlation of adjacent lines in feature maps to reduce the count of non-zero activations (events), achieving \(3.14\sim6.49\times\) computational savings on object detection and pose estimation tasks while staying complementary to existing event suppression methods.
Improving Knowledge Distillation via Regularizing Feature Direction and Norm: A novel ND loss function is proposed. By simultaneously aligning the feature direction of the student to the class mean direction of the teacher and encouraging the student to generate large-norm features, it significantly improves the performance of existing knowledge distillation methods on ImageNet, CIFAR100, and COCO.
Improving Zero-Shot Generalization for CLIP with Variational Adapter: Proposes a Prompt-based Variational Adapter (PVA) that separates base and novel category samples in the latent space. A divide-and-conquer strategy then processes the two groups separately, combined with residual connections to enhance transfer to novel categories, achieving state-of-the-art performance on generalized zero-shot learning and cross-dataset transfer learning benchmarks.
Is Retain Set All You Need in Machine Unlearning? Restoring Performance of Unlearned Models with Out-Of-Distribution Images: Proposes SCAR (Selective-distillation for Class and Architecture-agnostic unleaRning), a retain-set-free approximate unlearning algorithm that guides the feature vectors of forgotten samples toward the nearest incorrect class distribution via Mahalanobis distance, and utilizes OOD image distillation to maintain model performance.
Isomorphic Pruning for Vision Models: This paper proposes Isomorphic Pruning, which models network sub-structures as graphs and groups them by graph isomorphism. By ranking and pruning independently within each isomorphic group, it addresses the problem of incomparable importance among heterogeneous sub-structures, outperforming specially designed pruning methods on both ViTs and CNNs.
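The effect of ranking within groups, as described in the Isomorphic Pruning entry above, can be sketched with a toy example. This is only an illustration under simplifying assumptions: each unit carries a precomputed `group_key` that stands in for its graph-isomorphism class (the paper derives groups from the network graph), and the importance scores are made-up numbers. Both function names are hypothetical.

```python
from collections import defaultdict


def prune_globally(units, ratio):
    """Baseline: rank all units together and prune the lowest-scoring fraction."""
    ranked = sorted(units, key=lambda u: u[2])
    k = int(len(ranked) * ratio)
    return {name for name, _, _ in ranked[:k]}


def prune_within_groups(units, ratio):
    """Rank units only against members of the same group, then prune per group.

    units: list of (name, group_key, importance) tuples.
    """
    groups = defaultdict(list)
    for name, group_key, score in units:
        groups[group_key].append((score, name))

    pruned = set()
    for members in groups.values():
        members.sort()  # ascending importance within this group only
        k = int(len(members) * ratio)
        pruned.update(name for _, name in members[:k])
    return pruned
```

When importance scores live on different scales across structure types (for example attention heads versus convolution channels), the global baseline can remove one type entirely, while the grouped version prunes the same fraction from each type.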
Leveraging Hierarchical Feature Sharing for Efficient Dataset Condensation: This paper proposes a Hierarchical Memory Network (HMN) that stores synthetic data for dataset distillation in a three-tier structure (dataset-level, class-level, and instance-level memory). It improves storage efficiency through hierarchical feature sharing and further removes redundancy via instance-level pruning, surpassing all baseline methods using only a low-GPU-memory batch-based loss.
MetaAug: Meta-Data Augmentation for Post-Training Quantization: This paper proposes MetaAug, a meta-learning-based post-training quantization (PTQ) method. It employs a learnable transformation network to augment calibration data and concurrently optimizes both the transformation network and the quantized model within a bi-level optimization framework, thereby effectively mitigating the overfitting of PTQ on small calibration sets.
PaPr: Training-Free One-Step Patch Pruning with Lightweight ConvNets for Faster Inference: This paper proposes PaPr, which leverages convolutional feature maps from lightweight ConvNets to generate Patch Significance Maps (PSMs). It performs one-step patch pruning on ViT/ConvNet/hybrid architectures without any retraining, achieving significant computation reductions (up to 3.7× FLOPs reduction in video scenarios) with minimal loss in accuracy.
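The one-step pruning in the PaPr entry above amounts to keeping the patches with the highest significance scores. The sketch below assumes the Patch Significance Map has already been computed and flattened into a list with one score per patch; how the lightweight ConvNet produces those scores is not reproduced here, and `select_patches` is a hypothetical helper name.

```python
def select_patches(significance, keep_ratio):
    """Return the indices of the patches to keep, in their original order.

    significance: one score per patch (assumed to come from a precomputed PSM).
    keep_ratio: fraction of patches to retain; at least one patch is always kept.
    """
    n_keep = max(1, round(len(significance) * keep_ratio))
    ranked = sorted(
        range(len(significance)),
        key=lambda i: significance[i],
        reverse=True,
    )
    return sorted(ranked[:n_keep])
```

For example, `select_patches([0.1, 0.9, 0.3, 0.7], 0.5)` keeps patches 1 and 3; only those tokens would then be passed to the larger model.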
PQ-SAM: Post-training Quantization for Segment Anything Model: This paper proposes PQ-SAM, the first post-training quantization method tailored for the Segment Anything Model. It addresses SAM's highly asymmetric activation distributions and detrimental outliers through Grouped Activation Distribution Transformation (GADT) and a two-stage Outlier Hierarchical Clustering (OHC) scheme, pushing 4-bit quantized SAM to a practical level.
Simple Unsupervised Knowledge Distillation With Space Similarity: CoSS adds a space-dimension cosine similarity (Space Similarity) loss to the conventional feature-dimension cosine similarity used in unsupervised knowledge distillation (UKD). By transposing the feature matrix and aligning it along the dimension direction, this loss compensates for the manifold structure information lost to \(L_2\) normalization, achieving SOTA on multiple UKD benchmarks with a minimal design.
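A minimal sketch of the two similarity terms in the CoSS entry above, assuming student and teacher features are given as equally shaped batch-by-dimension matrices (lists of rows). The exact weighting and normalization used in the paper are not reproduced; `space_weight` and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def mean_row_cosine(student, teacher):
    """Average cosine similarity between corresponding rows."""
    return sum(cosine(s, t) for s, t in zip(student, teacher)) / len(student)


def transpose(matrix):
    """Swap the batch and feature axes."""
    return [list(column) for column in zip(*matrix)]


def coss_style_loss(student, teacher, space_weight=1.0):
    """Feature-wise term (rows = samples) plus space-wise term (rows = dimensions)."""
    feature_sim = mean_row_cosine(student, teacher)
    space_sim = mean_row_cosine(transpose(student), transpose(teacher))
    return (1.0 - feature_sim) + space_weight * (1.0 - space_sim)
```

If each student row is merely a rescaled copy of the teacher row, the feature-wise term is already zero, but the space-wise term still penalizes the mismatch across the batch. That is the information per-sample normalization discards.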
SpaceJAM: a Lightweight and Regularization-free Method for Fast Joint Alignment of Images: SpaceJAM is proposed as an unsupervised joint image alignment method with only approximately 16K trainable parameters. It requires neither regularization terms nor atlas maintenance, matching the alignment capabilities of existing methods on the SPair-71K and CUB datasets while achieving a speedup of over 10x.
Token Compensator: Altering Inference Cost of Vision Transformer without Re-Tuning: This paper proposes ToCom (Token Compensator), a lightweight plug-in for model arithmetic frameworks. Acquired via rapid parameter-efficient self-distillation, ToCom can be inserted directly into any pre-trained downstream model at inference time to compensate for the performance loss caused by a mismatch in token compression rates, without requiring re-training.
Uncertainty-Driven Spectral Compressive Imaging with Spatial-Frequency Transformer: This paper proposes Specformer, which fully captures the spatial sparsity and inter-spectral similarity priors of hyperspectral images (HSIs) through parallel local window self-attention (LWSA) and frequency-domain self-attention (FWSA) modules. It also introduces an uncertainty-driven loss function to enhance the network's reconstruction capability for texture-rich and boundary regions, outperforming the state-of-the-art (SOTA) with lower computational cost on both simulated and real-world HSI datasets.
UNIC: Universal Classification Models via Multi-teacher Distillation: This paper proposes the UNIC framework, which integrates knowledge from multiple complementary pre-trained models into a single student model through improved multi-teacher distillation strategies (including a ladder of projectors and teacher dropping techniques), achieving cross-task universal classification.