Reparameterized Tensor Ring Functional Decomposition for Multi-Dimensional Data Recovery¶
Conference: CVPR 2026 · arXiv: 2603.01034 · Code: YangyangXu2002/RepTRFD

Area: 3D Vision / Low-Level Vision / Tensor Decomposition

Keywords: Tensor Ring Decomposition, Implicit Neural Representation, Reparameterization, Image Inpainting, Point Cloud Recovery, Frequency Analysis
TL;DR¶
This paper proposes RepTRFD, which reparameterizes each Tensor Ring factor as a "learnable latent tensor × fixed basis" to address the spectral bias that arises when TR factors are parameterized by INRs. The method achieves state-of-the-art performance across image inpainting, denoising, super-resolution, and point cloud recovery.
Background & Motivation¶
Low-rank tensor decompositions are widely applicable: CP, Tucker, TT, and TR decompositions provide compact representations for multi-dimensional data such as images, videos, remote sensing imagery, and medical imaging. Among these, TR decomposition is particularly efficient for high-order tensor modeling due to its ring-structured topology.
Discrete TR is limited to fixed grids: Conventional TR decomposition is inherently discrete, defined only on fixed meshgrids, and cannot handle continuous signals or resolution-agnostic modeling scenarios (e.g., sparse point clouds).
INR enables functional tensor decomposition: Prior work has extended Tucker/CP/TT decompositions to the continuous domain (e.g., LRTFR, DRO-TFF), but the continuous generalization of TR decomposition remains unexplored.
Directly applying INR to TR factors is ineffective: When INR is directly used to parameterize TR factors, reconstructions are dominated by low-frequency components, with severe loss of high-frequency detail.
Frequency-domain analysis reveals the root cause: The authors theoretically demonstrate that the spectral characteristics of TR factors are directly propagated to the reconstructed tensor — if a factor lacks high-frequency components, the reconstruction will correspondingly lack high frequencies along that dimension.
Spectral bias of INR is the bottleneck: Standard INRs (e.g., SIREN) tend to learn low-frequency components and struggle to capture the high-frequency content required in TR factors, necessitating a new strategy to overcome this limitation.
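The propagation claim above can be checked numerically. The sketch below (sizes, rank, and cutoff are illustrative assumptions, not the paper's settings) band-limits one TR factor along its fiber axis and verifies that the reconstructed tensor is band-limited along the corresponding dimension:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: a 32^3 tensor with TR rank 3; keep only the 4 lowest
# frequencies of factor G1 along its fiber axis.
n, r, kmax = 32, 3, 4
G1 = rng.normal(size=(r, n, r))
spec = np.fft.rfft(G1, axis=1)
spec[:, kmax:, :] = 0                        # band-limit G1
G1 = np.fft.irfft(spec, n=n, axis=1)
G2 = rng.normal(size=(r, n, r))
G3 = rng.normal(size=(r, n, r))

# TR reconstruction: X[i,j,k] = Tr(G1[:,i,:] @ G2[:,j,:] @ G3[:,k,:])
X = np.einsum('aib,bjc,cka->ijk', G1, G2, G3)

# The reconstruction inherits G1's band limit along dimension 0: all energy
# above the cutoff is numerical noise.
Xf = np.abs(np.fft.rfft(X, axis=0))
print(Xf[kmax:].max() / Xf[:kmax].max())
```

Because each tensor entry is linear in the slices of `G1`, the spectrum along that dimension can contain only frequencies present in `G1`, which is exactly why a low-frequency-biased INR factor caps the reconstruction's detail.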
Method¶
Overall Architecture¶
RepTRFD consists of three core components:
- Shared frequency embedding layer: For each dimensional coordinate \(v_k\), a single sinusoidal layer produces an embedding \(\mathbf{z}_k = \sin(\omega_0(\mathbf{w}v_k + \mathbf{b}))\), with parameters shared across all modes to enhance cross-mode consistency.
- Branch MLP networks: Each mode has a dedicated MLP \(f_{\theta_k}\) that maps the embedding to a latent tensor slice \(\mathcal{C}^{(k)}_{:,v_k,:} \in \mathbb{R}^{r_k \times R_{k+1}}\), where \(R_{k+1} = \beta r_{k+1}\) (\(\beta \geq 1\) is an expansion factor).
- Reparameterization and TR contraction: The TR factor is obtained via \(\mathcal{G}^{(k)} = \mathcal{C}^{(k)} \times_3 \mathbf{B}^{(k)}\) (where \(\mathbf{B}^{(k)}\) is a fixed basis), and the target tensor entry is reconstructed through the trace operation of TR contraction \(\Phi(\cdot)\).
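The three components can be sketched end to end in NumPy. All sizes below (ranks, widths, \(\omega_0\), the tanh branch networks) are illustrative assumptions, not the paper's configuration; the point is the data flow from shared embedding to latent slice to reparameterized factor slice to trace:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: 3rd-order tensor, TR ranks r_k = 4, expansion
# factor beta = 2, embedding width d = 32.
N = 3
r = [4, 4, 4]
beta, d, omega0 = 2, 32, 10.0
R = [beta * rk for rk in r]                  # expanded widths R_k

# 1) Shared frequency embedding z = sin(omega0 * (w*v + b)),
#    parameters shared across all modes.
w = rng.normal(size=d)
b = rng.normal(size=d)
embed = lambda v: np.sin(omega0 * (w * v + b))

# 2) One toy branch MLP per mode: embedding -> latent slice of
#    shape (r_k, R_{k+1}).
def make_branch(k):
    k1 = (k + 1) % N                         # ring wrap-around
    W1 = rng.normal(size=(64, d)) / np.sqrt(d)
    W2 = rng.normal(size=(r[k] * R[k1], 64)) / np.sqrt(64)
    return lambda z: (W2 @ np.tanh(W1 @ z)).reshape(r[k], R[k1])

branches = [make_branch(k) for k in range(N)]

# 3) Fixed bases B^(k) (Xavier-style uniform, frozen after init),
#    mapping the expanded width R_{k+1} back down to rank r_{k+1}.
def make_basis(k):
    k1 = (k + 1) % N
    a = np.sqrt(6.0 / (r[k1] + R[k1]))
    return rng.uniform(-a, a, size=(R[k1], r[k1]))

bases = [make_basis(k) for k in range(N)]

def rep_tr_eval(coords):
    """Evaluate the functional TR at continuous coordinates (v_1..v_N)."""
    M = np.eye(r[0])
    for k, v in enumerate(coords):
        C_slice = branches[k](embed(v))      # latent slice  (r_k, R_{k+1})
        G_slice = C_slice @ bases[k]         # reparameterized factor slice
        M = M @ G_slice                      # chain the ring contraction
    return np.trace(M)                       # scalar tensor entry

val = rep_tr_eval([0.1, 0.5, 0.9])
print(val)
```

Because the evaluation takes arbitrary real coordinates rather than grid indices, the same model serves grid data (images) and off-grid data (point clouds) alike.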
Key Designs: Reparameterization Strategy¶
- Core Idea: Each TR factor is decomposed into a structured combination of a learnable latent tensor \(\mathcal{C}^{(k)}\) (generated by INR) and a fixed basis \(\mathbf{B}^{(k)}\).
- Theoretical Motivation (Theorem 2): It is proved that there exists a specific basis \(\mathbf{B}\) such that the gradient response ratio for high-frequency components after reparameterization is no less than that in the original parameter space, making optimization more sensitive to high-frequency details and effectively improving training dynamics.
- Initialization Scheme (Theorem 3): A Xavier-style initialization is adopted, \(\mathbf{B}^{(k)}_{ij} \sim \mathcal{U}(-\sqrt{6/(r_{k+1}+R_{k+1})}, \sqrt{6/(r_{k+1}+R_{k+1})})\), ensuring consistent variance in both forward and backward passes.
- Lipschitz Continuity (Theorem 4): The entire RepTRFD mapping is proved to be globally Lipschitz continuous, ensuring that the model is not overly sensitive to input perturbations.
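The Theorem-3 scale can be sanity-checked numerically: \(\mathrm{Var}(\mathcal{U}(-a,a)) = a^2/3 = 2/(r_{k+1}+R_{k+1})\), the Xavier condition. The sketch below (sizes are illustrative assumptions) verifies this and shows that a unit-variance latent passed through a frozen basis keeps variance of order \(2R/(r+R)\), so activations neither explode nor vanish as \(\beta\) grows:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: rank r_{k+1} = 8, expanded width R_{k+1} = 80.
r_next, R_next = 8, 80
a = np.sqrt(6.0 / (r_next + R_next))         # Theorem-3 scale

# Var(U(-a, a)) = a^2 / 3 = 2 / (r + R): the Xavier condition.
sample = rng.uniform(-a, a, size=100_000)
print(sample.var(), 2.0 / (r_next + R_next))

# Forward pass through a frozen basis: unit-variance latent rows come out
# with variance ~ R * Var(B_ij) = 2R / (r + R).
B = rng.uniform(-a, a, size=(R_next, r_next))
C = rng.normal(size=(1000, R_next))
out = C @ B
print(out.var())
```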
Loss & Training¶
A general framework of a data fidelity term \(\mathsf{L}_{\text{data}}\) plus an optional regularization term \(\mathsf{L}_{\text{reg}}\) is adopted: \(\min_\phi \mathsf{L}_{\text{data}}(g_\phi; \mathcal{O}) + \mathsf{L}_{\text{reg}}(g_\phi)\). Different data terms and regularizers are selected according to the specific task (inpainting / denoising / super-resolution / point cloud recovery).
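For the inpainting case, the objective reduces to a masked fidelity term plus an optional smoothness regularizer. A minimal sketch follows; `recon`, `observed`, and `mask` are stand-in names, and the anisotropic-TV penalty is one illustrative choice of \(\mathsf{L}_{\text{reg}}\), not necessarily the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy inpainting setup: `observed` plays the role of O, `mask` marks the
# ~30% of entries that are sampled.
observed = rng.random((8, 8))
mask = rng.random((8, 8)) < 0.3

def objective(recon, lam=1e-3):
    # L_data: squared error restricted to observed entries only
    l_data = np.sum(mask * (recon - observed) ** 2)
    # L_reg: an illustrative anisotropic-TV smoothness penalty
    l_reg = (np.abs(np.diff(recon, axis=0)).sum()
             + np.abs(np.diff(recon, axis=1)).sum())
    return l_data + lam * l_reg

# A reconstruction matching the observations pays only the regularizer.
print(objective(observed), objective(np.zeros((8, 8))))
```

Swapping the data term (e.g. full-grid fidelity for denoising, downsampled fidelity for super-resolution) adapts the same framework to the other tasks.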
Key Experimental Results¶
Main Results¶
Image/Video Inpainting: RepTRFD consistently outperforms TRLRF, FCTN, HLRTF, LRTFR, DRO-TFF, and NeurTV on color images, multispectral images (MSI), hyperspectral images (HSI), and video data.
| Dataset | Method | SR=0.1 PSNR | SR=0.2 PSNR | SR=0.3 PSNR |
|---|---|---|---|---|
| Color Image (256²×3) | DRO-TFF | 23.22 | 27.52 | 30.04 |
| | NeurTV | 24.16 | 27.81 | 30.28 |
| | RepTRFD | 25.70 | 29.37 | 32.01 |
| MSI (256²×31) | DRO-TFF | 38.45 | 42.28 | 45.00 |
| | RepTRFD | 39.34 | 44.66 | 47.74 |
Point Cloud Recovery (SR=0.2, NRMSE↓):
| Method | Doll | Duck | Frog | Mario |
|---|---|---|---|---|
| WIRE | 0.106 | 0.060 | 0.053 | 0.086 |
| FINER | 0.110 | 0.059 | 0.054 | 0.088 |
| RepTRFD | 0.093 | 0.053 | 0.050 | 0.080 |
Ablation Study¶
- Effect of reparameterization: At SR=0.2, removing reparameterization degrades PSNR from 30.45→27.41 (−3.04 dB) on color images and from 48.67→29.41 (−19.26 dB) on MSI, while the additional computation overhead is only approximately 1 second.
- Effect of expansion factor β: Increasing β from 1 to 10 consistently improves PSNR and accelerates convergence, with diminishing returns at larger values.
- Sensitivity of basis initialization: On HSI Botswana, varying the initialization scale \(a\) from 0.01 to 1 shows that the theoretically derived value \(a \approx 0.165\) achieves the best PSNR (45.27), with significant degradation when \(a\) is either too small or too large.
- Shared frequency embedding: Compared to independent embeddings, the shared embedding yields more stable training and stronger resistance to overfitting.
- Model complexity comparison: Under matched parameter counts and FLOPs, RepTRFD consistently outperforms LRTFR.
Key Findings¶
- The gains from reparameterization are particularly pronounced on higher-order data (HSI, video), with improvements reaching 19 dB on MSI inpainting.
- Computational overhead is minimal (~1s), primarily because the fixed basis \(\mathbf{B}\) does not participate in gradient updates.
- On super-resolution tasks, RepTRFD runs 10–30× faster than pure INR methods (SIREN/WIRE/FINER) while achieving approximately 1 dB higher PSNR.
Highlights & Insights¶
- First extension of TR decomposition to the continuous domain, filling the gap in tensor functional representation for the TR format.
- Novel frequency-domain analysis perspective: The paper theoretically establishes the mechanism by which the spectrum of TR factors propagates to that of the reconstructed tensor, providing a theoretical foundation for understanding frequency bottlenecks in tensor functional representations.
- Simple yet effective reparameterization strategy: Introducing a single fixed basis matrix — without modifying the network architecture or training procedure — significantly improves the model's ability to learn high-frequency components.
- Comprehensive theoretical guarantees: Covering gradient dynamics (Theorem 2), initialization (Theorem 3), and Lipschitz continuity (Theorem 4), the theoretical analysis is both rigorous and practically informative.
- Unified framework across four tasks: Inpainting, denoising, super-resolution, and point cloud recovery all share the same architecture, demonstrating strong generality.
Limitations & Future Work¶
- Manual hyperparameter tuning is still required: TR ranks \(r_k\), expansion factor \(\beta\), and frequency \(\omega_0\) must be tuned per task; an adaptive rank selection mechanism is absent.
- Validation limited to 3rd- and 4th-order tensors: Although theoretically extendable to arbitrary orders, experiments only cover 3rd-order (image/MSI/HSI) and 4th-order (video) data; higher-order scenarios (e.g., light fields, spatio-spectral data) are not evaluated.
- No comparison with end-to-end deep learning methods: All baselines are traditional optimization or INR-based methods; supervised approaches such as U-Net or Transformer architectures are not considered.
- Point cloud recovery tested only on small-scale models: The SHOT dataset is limited in scale; scalability on large-scale scenes (e.g., ShapeNet, real-world LiDAR point clouds) is not verified.
- Fixed basis is not learned: \(\mathbf{B}^{(k)}\) is frozen after initialization, which may cap expressive capacity; learnable or adaptive bases could be explored in future work.
Related Work & Insights¶
- INR methods: SIREN (sinusoidal activations), WIRE (wavelets), FINER (variable frequencies) — the proposed method is complementary from a tensor decomposition perspective, offering faster speed with higher accuracy.
- Tensor functional representations: LRTFR (continuous Tucker) and DRO-TFF (deep rank-one decomposition) — this paper is the first to introduce the TR format into this framework, and addresses spectral bias through reparameterization.
- Reparameterization techniques: Weight Normalization, RepVGG structural reparameterization, and Shi et al.'s INR weight decomposition — this paper elevates reparameterization from the network weight level to the tensor factor level.
- Tensor completion: TRLRF, FCTN, HLRTF — these discrete methods cannot handle non-grid data, whereas the proposed method achieves superior performance on both data types.
Rating¶
- Novelty: ⭐⭐⭐⭐ — TR functionalization + frequency-domain analysis + factor reparameterization constitute three progressive and complementary contributions
- Experimental Thoroughness: ⭐⭐⭐⭐ — Four tasks, multiple data types, and thorough ablations; the absence of comparisons with deep learning methods is a minor limitation
- Writing Quality: ⭐⭐⭐⭐⭐ — Rigorous theoretical derivations, clear figures and tables, and a coherent narrative from problem identification → analysis → solution → validation
- Value: ⭐⭐⭐⭐ — Introduces a new TR format and reparameterization paradigm for tensor functional representations, with considerable room for future extensions