SINR: Sparsity Driven Compressed Implicit Neural Representations

Conference: CVPR 2025
arXiv: 2503.19576
Code: https://dsgrad.github.io/SINR
Area: 3D Vision
Keywords: Implicit Neural Representation, signal compression, sparse coding, compressed sensing, INR

TL;DR

Observing that the weights of trained INRs approximately follow a Gaussian distribution, this work uses a random sensing matrix, in the spirit of compressed sensing, to re-express each weight vector as a sparse code in a higher-dimensional space. The result is a fundamental INR compression step that does not rely on a particular quantization scheme and can be combined with existing INR compression methods.

Background & Motivation

Background: Implicit Neural Representations (INRs) use MLPs to map coordinates to signal values (such as pixel colors or occupancy values), which makes them a unified representation across data modalities. Existing INR compression methods, such as COIN, COIN++, and INRIC, primarily adopt one of two strategies: applying quantization and entropy coding directly to a trained INR, or deriving and compressing latent codes on top of the INR via learnable transformations.

Limitations of Prior Work: The performance of existing methods heavily relies on the choice of quantization schemes and entropy coding, and none of them attempt to fundamentally compress the INR itself before quantization and entropy coding by discovering and exploiting the inherent compressibility in the INR weight space.

Key Challenge: Natural signals are inherently sparse in transform domains (such as the DCT domain), and INRs are essentially domain transforms that encode signals into MLP weights. Given that signals are compressible, the corresponding weight space should also possess exploitable compressibility. However, how can this hidden sparsity be discovered?

Goal: To find a fundamental INR compression method prior to quantization and entropy coding that can be used independently of specific compression pipelines.

Key Insight: The authors observe that the weight vectors of trained INRs approximately follow a Gaussian distribution, a pattern that holds across modalities such as images, occupancy fields, and NeRFs. By the Central Limit Theorem (CLT), a linear combination of many independent random variables with finite variance is approximately Gaussian; read in reverse, a Gaussian-distributed weight can be modeled as a linear combination of random variables. This suggests that a random matrix can serve as the sensing matrix for discovering a sparse representation.

Core Idea: Each weight vector \(\mathbf{w}\) is decomposed as \(\mathbf{w} = \mathbf{A}\mathbf{x}\), where \(\mathbf{A}\) is a random Gaussian sensing matrix (controlled by a seed) and \(\mathbf{x}\) is a high-dimensional sparse vector. Since \(\mathbf{A}\) can be regenerated from the seed, only the non-zero values and indices of \(\mathbf{x}\) need to be transmitted to restore the original weights.
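The seed-sharing idea can be illustrated with a small standard-library sketch. It is not the authors' code: the function names `sensing_matrix` and `reconstruct`, the seed, the sizes, and the payload values are all hypothetical, and Python's `random.Random` stands in for whatever generator the actual implementation uses.

```python
import random

def sensing_matrix(seed, k1, k2):
    """Regenerate the k1 x k2 Gaussian matrix A from a shared seed."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(k2)] for _ in range(k1)]

def reconstruct(seed, k1, k2, indices, values):
    """Recover w = A x using only the non-zero entries of x."""
    A = sensing_matrix(seed, k1, k2)
    return [sum(A[i][j] * v for j, v in zip(indices, values)) for i in range(k1)]

# Hypothetical payload: a seed plus s = 2 (index, value) pairs of x.
seed, k1, k2 = 42, 4, 8
indices, values = [1, 6], [0.5, -1.25]

# Sender and receiver each rebuild A locally; A itself is never transmitted.
w_sender = reconstruct(seed, k1, k2, indices, values)
w_receiver = reconstruct(seed, k1, k2, indices, values)
print(w_sender == w_receiver)
```

Because both sides seed the generator identically, they obtain the same matrix, so the transmitted data reduces to the seed, the indices, and the values.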

Method

Overall Architecture

SINR is inserted as a preprocessing module into the INR compression pipeline: after training the INR, SINR first converts the weight vectors into sparse codes (storing only non-zero values and indices), followed by quantization and entropy coding. On the decoder side, the sensing matrix is reconstructed using the seed, original weights are recovered from the sparse codes, and normal inference is conducted.

Key Designs

  1. Weight Space Gaussian Distribution Discovery:

    • Function: Providing a theoretical foundation for using compressed sensing.
    • Mechanism: Through experimental validation, the weight distributions of INR hidden layers converge to a Gaussian distribution across various modalities like images, occupancy fields, and NeRFs. This does not depend on specific activation functions (Sinusoidal, Gaussian, WIRE, etc.) and is an inherent characteristic of INRs.
    • Design Motivation: The Gaussian property is what motivates the CLT argument. Since each \(w_i = \sum_j A_{ij} x_j\) is a linear combination of random entries of \(\mathbf{A}\), it can be Gaussian-distributed, which suggests that a sparse \(\mathbf{x}\) satisfying \(\mathbf{w} = \mathbf{A}\mathbf{x}\) exists.
  2. Sparse Coding via Random Sensing Matrix:

    • Function: Compressing a \(k_1\)-dimensional weight vector into \(2s\) elements (\(s\) non-zero values + \(s\) indices).
    • Mechanism: Given a weight vector \(\mathbf{w} \in \mathbb{R}^{k_1}\), a random sensing matrix \(\mathbf{A} \in \mathbb{R}^{k_1 \times k_2}\) with \(k_2 > k_1\) is constructed, so the code \(\mathbf{x}\) lives in \(\mathbb{R}^{k_2}\). The sparse code is obtained from the \(L_1\)-minimization problem \(\min \|\mathbf{x}\|_1\) s.t. \(\mathbf{w} = \mathbf{A}\mathbf{x}\), which is solved with the OMP algorithm (a greedy pursuit that selects one column of \(\mathbf{A}\) per iteration). The constraint \(2s < k_1\) guarantees compression. The sensing matrix \(\mathbf{A}\) is controlled by a random seed; sender and receiver only need to share the seed, which avoids transmitting the matrix itself.
    • Design Motivation: Traditional dictionary learning requires learning/transmitting dictionary \(\mathbf{A}\), whereas the CLT demonstrates that a random matrix is sufficient. This completely eliminates dependency on dictionaries, acting as the core theoretical contribution of this work.
  3. Special Treatment for Tiny INRs:

    • Function: Handling small-scale INRs with hidden dimension \(k < 50\).
    • Mechanism: When \(k\) is extremely small, achieving an effective compression that satisfies \(2s < k\) becomes challenging. The solution is to flatten the \(k \times k\) weight matrix into a \(k^2 \times 1\) vector. Since \(k^2 \gg k\), carrying out sparse coding in this higher-dimensional space yields better performance.
    • Design Motivation: Ensuring that SINR is applicable to INRs of various scales, including lightweight models with very few parameters.
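The sparse-coding step in design 2 and the flattening trick in design 3 can be sketched with NumPy. This is a minimal greedy OMP written for illustration, not the authors' implementation: the function names `sensing_matrix`, `omp_encode`, and `decode`, the column normalization, and all sizes and seeds are assumptions, and random Gaussian weights stand in for a trained layer.

```python
import numpy as np

def sensing_matrix(seed, k1, k2):
    """Seeded Gaussian matrix with unit-norm columns (normalization is an assumption)."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((k1, k2))
    return A / np.linalg.norm(A, axis=0)

def omp_encode(A, w, s):
    """Greedy OMP: pick s columns of A and least-squares fit w on them."""
    residual = w.copy()
    support = []
    coef = np.zeros(0)
    for _ in range(s):
        corr = np.abs(A.T @ residual)
        if support:
            corr[support] = 0.0  # never select the same column twice
        support.append(int(np.argmax(corr)))
        coef, *_ = np.linalg.lstsq(A[:, support], w, rcond=None)
        residual = w - A[:, support] @ coef
    return np.array(support), coef

def decode(seed, k1, k2, idx, vals):
    """Rebuild A from the seed and recover the approximate weights."""
    A = sensing_matrix(seed, k1, k2)
    return A[:, idx] @ vals

# Tiny-INR case: flatten a k x k layer into a k^2-dimensional vector.
W = np.random.default_rng(1).standard_normal((16, 16))  # stand-in weights
w = W.reshape(-1)
k1, k2, s = w.size, 512, 40  # 2s = 80 < k1 = 256

A = sensing_matrix(7, k1, k2)
idx, vals = omp_encode(A, w, s)
W_hat = decode(7, k1, k2, idx, vals).reshape(W.shape)

rel_err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
print(f"stored {2 * s} numbers for {k1} weights, relative error {rel_err:.3f}")
```

The error printed for these random stand-in weights says nothing about the paper's results; the sketch only shows the data flow, in which the encoder keeps `idx` and `vals` and the decoder needs nothing else beyond the seed and the two dimensions.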

Loss & Training

SINR itself involves no training and is applied to an already trained INR. INR training uses a standard signal reconstruction loss (such as MSE). After training, the OMP (Orthogonal Matching Pursuit) algorithm is used to solve the \(L_1\)-minimization problem. Quantization uses a 16-bit uniform quantizer (65,536 levels), and entropy coding uses Brotli. The choice of sparsity \(s\) depends only on the number of hidden-layer neurons, not on the specific signal content.
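The quantization and entropy-coding stage can be sketched as follows. The summary does not specify how the quantizer range is chosen or how the data is serialized, so the min/max range, the byte layout, and the helper names are assumptions; `zlib` is used only as a standard-library stand-in for Brotli.

```python
import struct
import zlib

def quantize_16bit(values):
    """Uniformly map floats onto 65,536 integer levels over [min, max]."""
    lo, hi = min(values), max(values)
    scale = (hi - lo) / 65535 if hi > lo else 1.0
    levels = [min(65535, max(0, round((v - lo) / scale))) for v in values]
    return levels, lo, scale

def dequantize_16bit(levels, lo, scale):
    """Map integer levels back to approximate float values."""
    return [lo + q * scale for q in levels]

def pack(levels, lo, scale):
    """Serialize (lo, scale, count, levels) and compress the bytes.
    zlib stands in for Brotli so the sketch needs only the stdlib."""
    raw = struct.pack(f"<ddI{len(levels)}H", lo, scale, len(levels), *levels)
    return zlib.compress(raw, 9)

def unpack(blob):
    """Invert pack(): decompress and read the header, then the levels."""
    raw = zlib.decompress(blob)
    lo, scale, n = struct.unpack_from("<ddI", raw)
    levels = list(struct.unpack_from(f"<{n}H", raw, struct.calcsize("<ddI")))
    return levels, lo, scale

# Hypothetical non-zero values of a sparse code.
values = [0.013 * i - 0.4 for i in range(64)]
levels, lo, scale = quantize_16bit(values)
blob = pack(levels, lo, scale)
restored = dequantize_16bit(*unpack(blob))

max_err = max(abs(a - b) for a, b in zip(values, restored))
print(f"{len(blob)} bytes, max abs error {max_err:.2e}")
```

With a uniform quantizer the round-trip error per value is bounded by about half a quantization step, which is why 16 bits is usually enough to leave reconstruction quality essentially unchanged.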

Key Experimental Results

Main Results

Image coding (KODAK dataset):

| Config (h,m) | Method | bpp required for ~30dB PSNR |
|---|---|---|
| (3,128) | COIN (baseline) | ~3.7 bpp |
| (3,128) | INRIC | ~2.0 bpp |
| (3,128) | SINR | ~1.7 bpp |

Integration with COIN++: To reach ~24.2dB PSNR, COIN++ requires >1.5 bpp, whereas adding SINR reduces this to <1.0 bpp.

Ablation Study

| Experiment | Key Findings |
|---|---|
| C1: Varying number of neurons (32 \(\to\) 128) | More neurons lead to larger compression gains with SINR |
| C2: Varying number of layers (3 \(\to\) 7), width fixed at 64 | Compression gains are relatively limited as the number of layers increases |
| C3/C4: Meta-learning | SINR is equally effective under meta-learning frameworks |
| C5: COIN++ | SINR can also compress the modulation parameters |
| Occupancy fields | SINR achieves the smallest file size and the highest IoU |

Key Findings

  • The more hidden layer neurons there are, the more robust the signal representation learned by the INR, and the stronger the hidden compressibility in the weight space—meaning larger models paradoxically achieve higher compression ratios with SINR.
  • The optimal sparsity \(s\) for SINR does not depend on the specific signal content, but only on the dimension of the hidden layers, making parameter selection straightforward.
  • The occupancy field data modality is more compressible than images, showing that redundant patterns of different signal modalities within the INR weight space vary.
  • SINR is compatible with COIN++, showing the capability to compress both base network parameters and modulation parameters simultaneously.

Highlights & Insights

  • The observation that INR weights are approximately Gaussian is a "first-principles" insight. Tying that observation to the CLT gives the whole compression paradigm a theoretical motivation rather than a purely empirical one.
  • A sensing matrix that needs neither learning nor transmission is an elegant design. The CLT argument motivates using a random matrix, so sender and receiver only need to share a seed, which eliminates the overhead of transmitting a dictionary.
  • Modality-agnostic general compression: The same method is effective across images, 3D occupancy fields, and NeRFs, showcasing the true strength of INRs as unified data representations.

Limitations & Future Work

  • The OMP algorithm has a relatively high computational overhead at high dimensions, which limits real-time compression for large-scale INRs.
  • The possibility of incorporating sparsity constraints during training is not discussed; joint training and compression might yield better rate-distortion performance.
  • Simple schemes are still used for quantization and entropy coding; integrating more advanced learnable quantization could potentially further improve results.
  • Bias vectors are left uncompressed (owing to their small size), but they may be worth considering in extreme compression scenarios.

Comparison with Related Work

  • vs COIN/COIN++: These methods directly quantize INR parameters or their modulations without exploring the internal compressibility of the weight space. SINR inserts a fundamental compression step prior to them, making them complementary.
  • vs INRIC: INRIC leverages meta-learning to improve INR generalization and thereby indirectly enhance compression, whereas SINR directly targets and exposes weight redundancy. SINR can be directly stacked with INRIC to achieve compounding benefits.
  • vs SHACIRA: SHACIRA uses feature grid reparameterization for compression, which is an architecture-level design. SINR serves as a general weight-level post-processing method, independent of the network architecture.

Rating

  • Novelty: ⭐⭐⭐⭐⭐ The chain of reasoning from Gaussian weight distribution to CLT and then to random sensing matrices is highly elegant.
  • Experimental Thoroughness: ⭐⭐⭐⭐ Covers multiple modalities (images, occupancy fields, NeRFs) with various configurations.
  • Writing Quality: ⭐⭐⭐⭐ The theoretical derivation is clear, and the motivation is well-established.
  • Value: ⭐⭐⭐⭐ Shows strong practicality as a general preprocessing module for INR compression.