
Reparameterized Tensor Ring Functional Decomposition for Multi-Dimensional Data Recovery

Conference: CVPR 2026
arXiv: 2603.01034
Code: YangyangXu2002/RepTRFD
Area: 3D Vision / Low-Level Vision / Tensor Decomposition
Keywords: Tensor Ring Decomposition, Implicit Neural Representation, Reparameterization, Image Inpainting, Point Cloud Recovery, Frequency Analysis

TL;DR

This paper proposes RepTRFD, which reparameterizes Tensor Ring factors into the form of "learnable latent tensor × fixed basis" to address the spectral bias problem inherent in INR-parameterized TR factors, achieving state-of-the-art performance across image inpainting, denoising, super-resolution, and point cloud recovery tasks.

Background & Motivation

Low-rank tensor decompositions are widely applicable: CP, Tucker, tensor-train (TT), and tensor-ring (TR) decompositions provide compact representations for multi-dimensional data such as images, videos, remote sensing imagery, and medical images. Among these, TR decomposition is particularly efficient for high-order tensor modeling due to its ring-structured topology.

Discrete TR is limited to fixed grids: Conventional TR decomposition is inherently discrete, defined only on fixed meshgrids, and cannot handle continuous signals or resolution-agnostic modeling scenarios (e.g., sparse point clouds).

INR enables functional tensor decomposition: Prior work has extended Tucker/CP/TT decompositions to the continuous domain (e.g., LRTFR, DRO-TFF), but the continuous generalization of TR decomposition remains unexplored.

Directly applying INR to TR factors is ineffective: When INR is directly used to parameterize TR factors, reconstructions are dominated by low-frequency components, with severe loss of high-frequency detail.

Frequency-domain analysis reveals the root cause: The authors theoretically demonstrate that the spectral characteristics of TR factors are directly propagated to the reconstructed tensor — if a factor lacks high-frequency components, the reconstruction will correspondingly lack high frequencies along that dimension.

Spectral bias of INR is the bottleneck: Standard INRs (e.g., SIREN) tend to learn low-frequency components and struggle to capture the high-frequency content required in TR factors, necessitating a new strategy to overcome this limitation.
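The spectral-propagation claim can be checked numerically on a toy case. The sketch below is not the paper's theorem or code: it assumes an order-2 discrete TR with illustrative sizes, low-pass filters one factor along its mode, and relies only on the fact that each reconstructed fiber is a linear combination of that factor's fibers. The names `low_pass` and `cutoff` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2, r = 64, 48, 3          # grid sizes and TR rank (illustrative values)
cutoff = 4                     # highest DFT bin kept in the mode-1 factor

def low_pass(x, cutoff, axis):
    """Zero every DFT bin above `cutoff` along `axis`."""
    spec = np.fft.rfft(x, axis=axis)
    idx = [slice(None)] * x.ndim
    idx[axis] = slice(cutoff + 1, None)
    spec[tuple(idx)] = 0.0
    return np.fft.irfft(spec, n=x.shape[axis], axis=axis)

# Order-2 TR: X[i, j] = trace(G1[:, i, :] @ G2[:, j, :])
G1 = low_pass(rng.standard_normal((r, n1, r)), cutoff, axis=1)  # band-limited factor
G2 = rng.standard_normal((r, n2, r))                            # full-spectrum factor
X = np.einsum("aib,bja->ij", G1, G2)

# Largest spectral magnitude above the cutoff along each mode of X
high1 = np.abs(np.fft.rfft(X, axis=0))[cutoff + 1:].max()
high2 = np.abs(np.fft.rfft(X, axis=1))[:, cutoff + 1:].max()
print(f"mode 1 (band-limited factor): {high1:.2e}")
print(f"mode 2 (full-spectrum factor): {high2:.2e}")
```

Along mode 1 the high-frequency content of the reconstruction vanishes up to floating-point error, while mode 2 keeps it. This is the bottleneck that a low-frequency-biased INR factor would impose.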

Method

Overall Architecture

RepTRFD consists of three core components:

Shared frequency embedding layer: For the coordinate \(v_k\) of each mode \(k\), a single sinusoidal layer produces the embedding \(\mathbf{z}_k = \sin(\omega_0(\mathbf{w}v_k + \mathbf{b}))\). The parameters \(\mathbf{w}\) and \(\mathbf{b}\) are shared across all modes to enhance cross-mode consistency.
Branch MLP networks: Each mode has a dedicated MLP \(f_{\theta_k}\) that maps the embedding \(\mathbf{z}_k\) to a latent tensor slice \(\mathcal{C}^{(k)}_{:v_k:} \in \mathbb{R}^{r_k \times R_{k+1}}\), where \(R_{k+1} = \beta r_{k+1}\) and \(\beta \geq 1\) is an expansion factor.
Reparameterization and TR contraction: The TR factor is obtained as \(\mathcal{G}^{(k)} = \mathcal{C}^{(k)} \times_3 \mathbf{B}^{(k)}\), where \(\mathbf{B}^{(k)} \in \mathbb{R}^{r_{k+1} \times R_{k+1}}\) is a fixed basis that maps the expanded dimension \(R_{k+1}\) back to the TR rank \(r_{k+1}\). The target tensor entry is then reconstructed by the TR contraction \(\Phi(\cdot)\), i.e., the trace of the product of the factor slices.
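The three components can be put together in a minimal forward-pass sketch. This is an illustration under stated assumptions, not the released RepTRFD code: the two-layer sine MLP in `make_branch`, the layer widths, the ranks, and the values of `beta` and `omega_0` are all hypothetical. Only the structure follows the description above (shared sine embedding, per-mode branch, fixed basis, trace contraction).

```python
import numpy as np

rng = np.random.default_rng(0)

N = 3                      # tensor order
ranks = [4, 4, 4]          # TR ranks r_1..r_N; the ring closes with r_{N+1} = r_1
beta = 2                   # expansion factor, R_{k+1} = beta * r_{k+1}
d_embed, d_hidden = 16, 32
omega_0 = 30.0             # sine frequency (assumed value)

# Shared sinusoidal embedding: the same w and b are used for every mode
w = rng.uniform(-1, 1, d_embed)
b = rng.uniform(-1, 1, d_embed)

def embed(v):
    return np.sin(omega_0 * (w * v + b))

def make_branch(r_k, R_next):
    """Hypothetical two-layer sine MLP mapping an embedding to an r_k x R_next slice."""
    W1 = rng.standard_normal((d_hidden, d_embed)) / np.sqrt(d_embed)
    W2 = rng.standard_normal((r_k * R_next, d_hidden)) / np.sqrt(d_hidden)
    return lambda z: (W2 @ np.sin(W1 @ z)).reshape(r_k, R_next)

branches, bases = [], []
for k in range(N):
    r_k, r_next = ranks[k], ranks[(k + 1) % N]
    R_next = beta * r_next
    branches.append(make_branch(r_k, R_next))
    a = np.sqrt(6.0 / (r_next + R_next))                # Xavier-style bound
    bases.append(rng.uniform(-a, a, (r_next, R_next)))  # fixed basis, never trained

def factor_slice(k, v_k):
    """Slice form of the mode-3 product: G(v_k) = C(v_k) @ B^T."""
    C = branches[k](embed(v_k))        # shape (r_k, R_{k+1}), learnable path
    return C @ bases[k].T              # shape (r_k, r_{k+1})

def reconstruct(coords):
    """Tensor entry at continuous coordinates: trace of the product of factor slices."""
    M = np.eye(ranks[0])
    for k, v_k in enumerate(coords):
        M = M @ factor_slice(k, v_k)
    return np.trace(M)

value = reconstruct([0.1, -0.4, 0.7])
print(value)
```

Because the coordinates are continuous, `reconstruct` can be queried off-grid, which is what makes the functional form usable for point clouds and super-resolution.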

Key Designs: Reparameterization Strategy

Core Idea: Each TR factor is decomposed into a structured combination of a learnable latent tensor \(\mathcal{C}^{(k)}\) (generated by INR) and a fixed basis \(\mathbf{B}^{(k)}\).
Theoretical Motivation (Theorem 2): It is proved that there exists a specific basis \(\mathbf{B}\) such that the gradient response ratio for high-frequency components after reparameterization is no less than that in the original parameter space, making optimization more sensitive to high-frequency details and effectively improving training dynamics.
Initialization Scheme (Theorem 3): A Xavier-style initialization is adopted, \(\mathbf{B}^{(k)}_{ij} \sim \mathcal{U}(-\sqrt{6/(r_{k+1}+R_{k+1})}, \sqrt{6/(r_{k+1}+R_{k+1})})\), ensuring consistent variance in both forward and backward passes.
Lipschitz Continuity (Theorem 4): The entire RepTRFD mapping is proved to be globally Lipschitz continuous, ensuring that the model is not overly sensitive to input perturbations.
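The initialization bound from Theorem 3 is easy to evaluate for a given rank and expansion factor. The sketch below assumes a hypothetical setting of \(r_{k+1} = 20\) and \(\beta = 10\), which is not stated in this summary. It is chosen only because it yields a bound near 0.165, the scale that the sensitivity ablation reports as best.

```python
import math

def basis_init_bound(r_next: int, beta: int) -> float:
    """Uniform bound a = sqrt(6 / (r_{k+1} + R_{k+1})) with R_{k+1} = beta * r_{k+1}."""
    R_next = beta * r_next
    return math.sqrt(6.0 / (r_next + R_next))

# Hypothetical setting: r_{k+1} = 20 and beta = 10, so r + R = 220
a = basis_init_bound(20, 10)
variance = a ** 2 / 3          # variance of U(-a, a), equal to 2 / (r + R)
print(round(a, 3))             # 0.165
```

A larger \(\beta\) shrinks the bound, so the entries of the basis get smaller as the latent dimension grows, which keeps the variance of the resulting factor roughly constant.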

Loss & Training

Training minimizes a data fidelity term \(\mathsf{L}_{\text{data}}\) on the observations \(\mathcal{O}\) plus an optional regularization term \(\mathsf{L}_{\text{reg}}\): \(\min_\phi \mathsf{L}_{\text{data}}(g_\phi; \mathcal{O}) + \mathsf{L}_{\text{reg}}(g_\phi)\). The data term and regularizer are chosen per task (inpainting / denoising / super-resolution / point cloud recovery).
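As one possible instance of this objective for inpainting, the sketch below uses a masked squared error as the data term and anisotropic total variation as the optional regularizer. Both choices, the weight `lam`, and the function names are assumptions made for illustration. The summary does not specify which terms the paper uses for each task.

```python
import numpy as np

def inpainting_objective(pred, observed, mask, reg_fn=None, lam=0.0):
    """Masked squared-error data term plus an optional weighted regularizer."""
    diff = (pred - observed) * mask                  # only observed entries contribute
    data_term = np.sum(diff ** 2) / max(mask.sum(), 1)
    reg_term = lam * reg_fn(pred) if reg_fn is not None else 0.0
    return data_term + reg_term

def total_variation(x):
    """Anisotropic TV along the first two axes, one possible regularizer."""
    return np.abs(np.diff(x, axis=0)).sum() + np.abs(np.diff(x, axis=1)).sum()

rng = np.random.default_rng(0)
target = rng.random((8, 8, 3))
mask = (rng.random(target.shape) < 0.2).astype(float)   # roughly 20% observed
pred = rng.random(target.shape)                          # stand-in for g_phi output

loss_plain = inpainting_objective(pred, target, mask)
loss_reg = inpainting_objective(pred, target, mask, total_variation, lam=1e-3)
print(loss_plain, loss_reg)
```

In a real training loop `pred` would come from the model and the objective would be minimized with an automatic-differentiation framework. NumPy is used here only to show the structure of the two terms.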

Key Experimental Results

Main Results

Image/Video Inpainting: RepTRFD consistently outperforms TRLRF, FCTN, HLRTF, LRTFR, DRO-TFF, and NeurTV on color images, multispectral images (MSI), hyperspectral images (HSI), and video data.

Dataset	Method	PSNR (dB), SR=0.1	PSNR (dB), SR=0.2	PSNR (dB), SR=0.3
Color Image (256×256×3)	DRO-TFF	23.22	27.52	30.04
Color Image (256×256×3)	NeurTV	24.16	27.81	30.28
Color Image (256×256×3)	RepTRFD	25.70	29.37	32.01
MSI (256×256×31)	DRO-TFF	38.45	42.28	45.00
MSI (256×256×31)	RepTRFD	39.34	44.66	47.74
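For reference, the inpainting metric and sampling setup can be sketched as follows. This assumes SR denotes the sampling rate (the fraction of observed entries, as is usual in tensor completion), that entries are sampled independently at random, and that data are scaled to a peak value of 1. The paper's exact evaluation protocol is not given in this summary.

```python
import numpy as np

def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB for data scaled to [0, peak]."""
    mse = np.mean((np.asarray(reference) - np.asarray(estimate)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

def sampling_mask(shape, sr, seed=0):
    """Keep each entry independently with probability `sr` (assumed sampling model)."""
    return np.random.default_rng(seed).random(shape) < sr

mask = sampling_mask((256, 256, 3), sr=0.1)
print(mask.mean())                               # observed fraction, close to 0.1
print(psnr(np.zeros(4), np.full(4, 0.1)))        # MSE of 0.01, about 20 dB
```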

Point Cloud Recovery (SR=0.2, NRMSE↓):

Method	Doll	Duck	Frog	Mario
WIRE	0.106	0.060	0.053	0.086
FINER	0.110	0.059	0.054	0.088
RepTRFD	0.093	0.053	0.050	0.080

Ablation Study

Effect of reparameterization: At SR=0.2, removing reparameterization degrades PSNR from 30.45→27.41 (−3.04 dB) on color images and from 48.67→29.41 (−19.26 dB) on MSI, while the additional computation overhead is only approximately 1 second.
Effect of expansion factor β: Increasing β from 1 to 10 consistently improves PSNR and accelerates convergence, with diminishing returns at larger values.
Sensitivity of basis initialization: On HSI Botswana, varying the initialization scale \(a\) from 0.01 to 1 shows that the theoretically derived value \(a \approx 0.165\) achieves the best PSNR (45.27), with significant degradation when \(a\) is either too small or too large.
Shared frequency embedding: Compared to independent embeddings, the shared embedding yields more stable training and stronger resistance to overfitting.
Model complexity comparison: Under matched parameter counts and FLOPs, RepTRFD consistently outperforms LRTFR.
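To put the reparameterization ablation in perspective, a PSNR gap can be converted into an error ratio: at a fixed peak value, a gain of \(\Delta\) dB corresponds to the mean squared error shrinking by a factor of \(10^{\Delta/10}\). The short calculation below applies this standard identity to the SR=0.2 ablation numbers. The helper name `mse_ratio_from_db` is hypothetical.

```python
def mse_ratio_from_db(delta_db: float) -> float:
    """Factor by which MSE shrinks for a PSNR gain of `delta_db` decibels."""
    return 10.0 ** (delta_db / 10.0)

color_gain = 30.45 - 27.41     # dB, color images at SR=0.2
msi_gain = 48.67 - 29.41       # dB, MSI at SR=0.2

print(f"color: {color_gain:.2f} dB, MSE reduced by {mse_ratio_from_db(color_gain):.1f}x")
print(f"MSI:   {msi_gain:.2f} dB, MSE reduced by {mse_ratio_from_db(msi_gain):.1f}x")
```

The color-image gain corresponds to roughly halving the MSE, whereas the MSI gain corresponds to a reduction of roughly 84 times.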

Key Findings

The gains from reparameterization are particularly pronounced on multi-band and higher-order data (MSI, HSI, video), reaching about 19 dB on MSI inpainting in the ablation, compared with about 3 dB on color images.
Computational overhead is minimal (~1s), primarily because the fixed basis \(\mathbf{B}\) does not participate in gradient updates.
On super-resolution tasks, RepTRFD runs 10–30× faster than pure INR methods (SIREN/WIRE/FINER) while achieving approximately 1 dB higher PSNR.

Highlights & Insights

First extension of TR decomposition to the continuous domain, filling the gap in tensor functional representation for the TR format.
Novel frequency-domain analysis perspective: The paper theoretically establishes the mechanism by which the spectrum of TR factors propagates to that of the reconstructed tensor, providing a theoretical foundation for understanding frequency bottlenecks in tensor functional representations.
Simple yet effective reparameterization strategy: Introducing a single fixed basis matrix — without modifying the network architecture or training procedure — significantly improves the model's ability to learn high-frequency components.
Comprehensive theoretical guarantees: Covering gradient dynamics (Theorem 2), initialization (Theorem 3), and Lipschitz continuity (Theorem 4), the theoretical analysis is both rigorous and practically informative.
Unified framework across four tasks: Inpainting, denoising, super-resolution, and point cloud recovery all share the same architecture, demonstrating strong generality.

Limitations & Future Work

Manual hyperparameter tuning is still required: TR ranks \(r_k\), expansion factor \(\beta\), and frequency \(\omega_0\) must be tuned per task; an adaptive rank selection mechanism is absent.
Validation limited to 3rd- and 4th-order tensors: Although theoretically extendable to arbitrary orders, experiments only cover 3rd-order (image/MSI/HSI) and 4th-order (video) data; higher-order scenarios (e.g., light fields, spatio-spectral data) are not evaluated.
No comparison with end-to-end deep learning methods: All baselines are traditional optimization or INR-based methods; supervised approaches such as U-Net or Transformer architectures are not considered.
Point cloud recovery tested only on small-scale models: The SHOT dataset is limited in scale; scalability on large-scale scenes (e.g., ShapeNet, real-world LiDAR point clouds) is not verified.
Fixed basis is not learned: \(\mathbf{B}^{(k)}\) is frozen after initialization, which may cap expressive capacity; learnable or adaptive bases could be explored in future work.

Related Work

INR methods: SIREN (sinusoidal activations), WIRE (wavelets), FINER (variable frequencies). The proposed method is complementary, approaching the problem from a tensor decomposition perspective and offering faster speed with higher accuracy.
Tensor functional representations: LRTFR (continuous Tucker) and DRO-TFF (deep rank-one decomposition) — this paper is the first to introduce the TR format into this framework, and addresses spectral bias through reparameterization.
Reparameterization techniques: Weight Normalization, RepVGG structural reparameterization, and Shi et al.'s INR weight decomposition — this paper elevates reparameterization from the network weight level to the tensor factor level.
Tensor completion: TRLRF, FCTN, HLRTF — these discrete methods cannot handle non-grid data, whereas the proposed method achieves superior performance on both data types.

Rating

Novelty: ⭐⭐⭐⭐ (TR functionalization, frequency-domain analysis, and factor reparameterization form three progressive and complementary contributions)
Experimental Thoroughness: ⭐⭐⭐⭐ (four tasks, multiple data types, and thorough ablations; the absence of comparisons with deep learning methods is a minor limitation)
Writing Quality: ⭐⭐⭐⭐⭐ (rigorous theoretical derivations, clear figures and tables, and a coherent narrative from problem identification through analysis and solution to validation)
Value: ⭐⭐⭐⭐ (brings the TR format and a factor-level reparameterization paradigm to tensor functional representations, with considerable room for future extensions)