NOIR: Neural Operator Mapping for Implicit Representations
Conference: CVPR2025
arXiv: 2603.13118
Code: GitHub
Area: Medical Image
Keywords: Implicit Neural Representations, Neural Operator, Resolution-agnostic, Medical Image Segmentation, Image Translation
TL;DR
NOIR reformulates medical image computation tasks as operator learning problems between continuous function spaces. It embeds discrete medical signals into a continuous function space via Implicit Neural Representations (INR), and then learns mappings between functions using Neural Operators (NO), achieving resolution-agnostic segmentation, shape completion, image translation, and synthesis.
Background & Motivation
- Current deep learning methods for medical image computation are almost entirely based on discrete grids (pixels/voxels), making them highly sensitive to voxel spacing, interpolation methods, and resampling artifacts.
- Medical images naturally originate from continuous physical domains and frequently undergo resampling and transformations in clinical workflows; discrete representations introduce inconsistencies.
- Existing Neural Operator methods (such as FNO) are limited to regular grids and suffer from aliasing errors.
- A resolution-agnostic, task-agnostic, and general framework is needed to unify various downstream medical imaging tasks.
Method
Overall Architecture
NOIR consists of two core modules:
1. Continuous Function Representation Module: Embeds discrete signals into a continuous function space using INR.
2. Neural Operator Mapping Module: Learns mappings between the latent modulation spaces of the implicit representations.
Continuous Function Representation
- For each discrete signal \(f_i^d\), a continuous function \(f_i\) is learned and parameterized by an INR \(\phi_i(x; \theta, \gamma_i)\).
- \(\theta\) represents the dataset-level shared parameters, and \(\gamma_i\) represents the signal-specific parameters.
- Instead of being optimized directly, \(\gamma_i\) is mapped from a latent vector \(z_i\) via a hypernetwork \(M_\psi\): \(\gamma_i = M_\psi(z_i)\).
- Shared parameters capture general structures (background, anatomy, acquisition characteristics), while \(\gamma_i\) encodes individual variations (anatomical variation, pathology).
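The split between shared and signal-specific parameters can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: the 1D coordinate domain, the Fourier-feature encoding, the single hidden layer, the linear hypernetwork, and the choice of a shift modulation for \(\gamma_i\) are all assumptions, as are the names `fourier_features`, `hypernetwork`, and `inr`.

```python
import numpy as np

rng = np.random.default_rng(0)

N_FREQ = 4        # number of Fourier frequencies (hypothetical choice)
LATENT_DIM = 8    # size of the per-signal latent z_i (hypothetical choice)
HIDDEN = 32       # hidden width of the shared INR layer (hypothetical choice)

# Shared parameters theta: one hidden layer and a linear read-out.
theta = {
    "W1": rng.normal(0, 1.0, (2 * N_FREQ, HIDDEN)),
    "b1": np.zeros(HIDDEN),
    "W2": rng.normal(0, 0.1, (HIDDEN, 1)),
}
# Hypernetwork M_psi, assumed here to be a single linear map from z_i to gamma_i.
psi = {"W": rng.normal(0, 0.1, (LATENT_DIM, HIDDEN)), "b": np.zeros(HIDDEN)}

def fourier_features(x):
    """Encode coordinates x in [0, 1), shape (n,), as sin/cos features of shape (n, 2 * N_FREQ)."""
    k = np.arange(1, N_FREQ + 1)
    angles = 2 * np.pi * x[:, None] * k[None, :]
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)

def hypernetwork(z, psi):
    """gamma_i = M_psi(z_i): a signal-specific shift for the hidden layer (assumed modulation type)."""
    return z @ psi["W"] + psi["b"]

def inr(x, theta, gamma):
    """phi(x; theta, gamma): evaluate the continuous function at arbitrary coordinates x."""
    h = np.sin(fourier_features(x) @ theta["W1"] + theta["b1"] + gamma)
    return (h @ theta["W2"]).ravel()

z_i = rng.normal(size=LATENT_DIM)
gamma_i = hypernetwork(z_i, psi)

# The same function can be sampled on grids of any resolution.
coarse = np.linspace(0, 1, 16, endpoint=False)
fine = np.linspace(0, 1, 64, endpoint=False)
f_coarse = inr(coarse, theta, gamma_i)
f_fine = inr(fine, theta, gamma_i)
```

Because the representation is a function of coordinates rather than a grid, `f_fine[::4]` and `f_coarse` agree at the coordinates the two grids share; the resolution only determines where the function is queried.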
Meta-learning Training Scheme
- Outer Loop: Optimizes the shared parameters \((\theta, \psi)\).
- Inner Loop: Optimizes the latent representation \(z_i\) for each signal (initialized to 0, using K-step gradient descent).
- During testing, \((\theta, \psi)\) are frozen, and only the \(z^*\) for new signals is optimized.
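The test-time step (frozen shared parameters, only the latent optimized from a zero initialization) can be sketched as below. To keep the gradient analytic, this toy swaps in a decoder that is linear in \(z\): a fixed sin/cos basis stands in for \(\theta\), and the hypernetwork output is used as read-out weights. That decoder, the step-size rule, the value of `K_STEPS`, and all names are assumptions for illustration; the outer loop over \((\theta, \psi)\) is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)
N_FREQ, LATENT_DIM, K_STEPS = 4, 6, 50   # hypothetical sizes and step count

def features(x):
    """Fixed sin/cos basis standing in for the shared INR body (theta)."""
    k = np.arange(1, N_FREQ + 1)
    angles = 2 * np.pi * x[:, None] * k[None, :]
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)

# Frozen hypernetwork psi: maps z to read-out weights gamma = z @ psi_W + psi_b.
psi_W = rng.normal(size=(LATENT_DIM, 2 * N_FREQ))
psi_b = rng.normal(size=2 * N_FREQ)

def decode(x, z):
    """phi(x; theta, M_psi(z)) for this toy model."""
    return features(x) @ (z @ psi_W + psi_b)

def loss_and_grad(z, x, f_d):
    """Mean squared reconstruction error on the samples, and its gradient w.r.t. z."""
    residual = decode(x, z) - f_d
    grad = (2.0 / len(x)) * psi_W @ (features(x).T @ residual)
    return np.mean(residual ** 2), grad

def fit_latent(x, f_d, k_steps=K_STEPS):
    """Inner loop: start from z = 0 and take k_steps gradient steps; psi stays frozen."""
    A = features(x) @ psi_W.T                        # Jacobian of decode w.r.t. z
    lipschitz = 2.0 * np.linalg.norm(A, 2) ** 2 / len(x)
    lr = 1.0 / lipschitz                             # safe step size for a quadratic loss
    z = np.zeros(LATENT_DIM)
    history = []
    for _ in range(k_steps):
        loss, grad = loss_and_grad(z, x, f_d)
        history.append(loss)
        z = z - lr * grad
    history.append(loss_and_grad(z, x, f_d)[0])      # loss after the final step
    return z, history

# A "new signal": samples of an unseen function on a 32-point grid.
x = np.linspace(0, 1, 32, endpoint=False)
f_d = decode(x, rng.normal(size=LATENT_DIM))
z_star, history = fit_latent(x, f_d)
```

`history` records the reconstruction loss before each step and after the last one, so it can be inspected to confirm that the latent fit improves while `psi_W` and `psi_b` are never updated.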
Neural Operator Mapping
- Obtain latent representations \(z_{in}^i\) and \(z_{out}^i\) for the input and output signal pairs, respectively.
- Learn the mapping \(\hat{z}_{out} = \mathcal{T}(z_{in})\) using an MLP with residual connections.
- The loss function is MSE: \(\mathcal{L}_{NO} = \frac{1}{N}\sum_{i=1}^N \|\mathcal{T}(z_{in}^i) - z_{out}^i\|_2^2\)
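A forward pass of such a residual MLP and the loss \(\mathcal{L}_{NO}\) might look like the sketch below. The number of blocks, the hidden width, the ReLU activation, the zero initialization of each block's output layer, and the assumption that input and output latents share one dimension are illustrative choices, not details taken from the paper. The latents are random stand-ins, and the training loop is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
LATENT_DIM, HIDDEN, N_BLOCKS = 8, 32, 2   # hypothetical sizes

def init_operator():
    """Parameters of T: a stack of residual blocks z <- z + relu(z W1 + b1) W2 + b2."""
    blocks = []
    for _ in range(N_BLOCKS):
        blocks.append({
            "W1": rng.normal(0, np.sqrt(2.0 / LATENT_DIM), (LATENT_DIM, HIDDEN)),
            "b1": np.zeros(HIDDEN),
            "W2": np.zeros((HIDDEN, LATENT_DIM)),   # zero init: T starts as the identity map
            "b2": np.zeros(LATENT_DIM),
        })
    return blocks

def operator(z_in, blocks):
    """Apply T to a batch of input latents with shape (N, LATENT_DIM)."""
    z = z_in
    for p in blocks:
        hidden = np.maximum(z @ p["W1"] + p["b1"], 0.0)
        z = z + hidden @ p["W2"] + p["b2"]
    return z

def no_loss(z_in, z_out, blocks):
    """L_NO = (1/N) * sum_i ||T(z_in^i) - z_out^i||_2^2."""
    diff = operator(z_in, blocks) - z_out
    return np.mean(np.sum(diff ** 2, axis=1))

# Random stand-ins for the latents produced by the input and output INRs.
z_in = rng.normal(size=(16, LATENT_DIM))
z_out = rng.normal(size=(16, LATENT_DIM))
blocks = init_operator()
loss = no_loss(z_in, z_out, blocks)
```

With the zero-initialized output layers, the operator initially returns `z_in` unchanged, so training starts from the identity and only has to learn the correction between the two latent spaces.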
ε-ReNO Theoretical Guarantees
- NOIR experimentally satisfies the \(\epsilon\)-ReNO property proposed by Bartolucci et al.
- The cross-resolution difference of the input latent representation is \(\epsilon_{z_{in}} = 4.0 \times 10^{-6}\), and for the output is \(\epsilon_{z_{out}} = 2.2 \times 10^{-5}\).
- The segmentation-level error is \(\epsilon_{seg} = 1.8 \times 10^{-2}\), demonstrating minimal aliasing errors.
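One way to measure a cross-resolution difference of this kind is to encode the same underlying signal from two sampling grids and compare the resulting latents. The sketch below does this for a toy decoder that is linear in \(z\). This summary does not give the exact definition of \(\epsilon\), so the mean squared difference used here is an assumption, and a closed-form least-squares fit stands in for the K-step inner loop. All names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N_FREQ, LATENT_DIM = 4, 6   # hypothetical sizes

def features(x):
    """Fixed sin/cos basis standing in for the shared INR body (theta)."""
    k = np.arange(1, N_FREQ + 1)
    angles = 2 * np.pi * x[:, None] * k[None, :]
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)

# Frozen toy hypernetwork: gamma = z @ psi_W + psi_b are the read-out weights.
psi_W = rng.normal(size=(LATENT_DIM, 2 * N_FREQ))
psi_b = rng.normal(size=2 * N_FREQ)

def decode(x, z):
    """Evaluate the toy INR at coordinates x for latent z."""
    return features(x) @ (z @ psi_W + psi_b)

def encode(x, f_d):
    """Fit z to samples (x, f_d) by least squares (stand-in for gradient-based fitting)."""
    A = features(x) @ psi_W.T
    target = f_d - features(x) @ psi_b
    z, *_ = np.linalg.lstsq(A, target, rcond=None)
    return z

# One continuous signal, observed on a fine and a coarse grid.
z_true = rng.normal(size=LATENT_DIM)
x_hi = np.linspace(0, 1, 128, endpoint=False)
x_lo = np.linspace(0, 1, 16, endpoint=False)
z_hi = encode(x_hi, decode(x_hi, z_true))
z_lo = encode(x_lo, decode(x_lo, z_true))

# Latent-level discrepancy (assumed metric: mean squared difference).
eps_z = np.mean((z_hi - z_lo) ** 2)

# Function-level discrepancy: decode both latents on a common evaluation grid.
x_eval = np.linspace(0, 1, 256, endpoint=False)
eps_f = np.mean((decode(x_eval, z_hi) - decode(x_eval, z_lo)) ** 2)
```

In this toy the signal is band-limited and exactly representable by the decoder, so both discrepancies sit near machine precision. On real images the fit is approximate, which is why the values reported above are small but non-zero.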
Key Experimental Results
Segmentation Tasks
| Dataset | Method | DSC (100% resolution) | DSC (10% resolution) |
|---|---|---|---|
| Shenzhen | U-Net | 0.95 | 0.75 |
| Shenzhen | NOIR | 0.94 | 0.93 |
| OASIS-4 | U-Net | 0.95 | 0.40 |
| OASIS-4 | NOIR | 0.85 | 0.85 |
- NOIR significantly outperforms all baselines at low resolution (10%): DSC of 0.93 vs 0.75 for U-Net on Shenzhen.
- On OASIS-4 at 10% resolution, NOIR achieves a DSC of 0.85 vs 0.40 for U-Net, a gap of 0.45; NOIR's score is unchanged from full resolution, while U-Net drops from 0.95.
Shape Completion (SkullBreak 3D)
- NOIR achieves a DSC of 0.86 at 10% resolution, outperforming 3D U-Net's 0.71.
- At full resolution, NOIR trails 3D U-Net (0.87 vs 0.96), but its score is nearly identical at both resolutions (0.87 at 100%, 0.86 at 10%).
Image Synthesis and Translation
| Task | Method | PSNR (dB, 100% resolution) |
|---|---|---|
| US Synthesis | NOIR | 31.83 |
| US Synthesis | AttU-Net | 28.55 |
| fastMRI | NOIR | 22.87 |
| fastMRI | DDPM | 25.45 |
- In the ultrasound synthesis task, NOIR leads all baselines with a PSNR of 31.83 dB, 3.28 dB above AttU-Net (28.55 dB).
- In the fastMRI translation task, NOIR is weaker than DDPM (22.87 dB vs 25.45 dB), but its PSNR remains stable across resolutions.
Resolution Robustness
- Across five different resolutions from 32² to 200², the DSC remains stable between 0.93 and 0.94.
- While all baselines suffer from significant performance degradation at low resolutions, NOIR remains almost unaffected.
Highlights
- Paradigm Innovation: First to systematically unify medical image computation tasks as operator learning between continuous function spaces, covering four major tasks: segmentation, completion, translation, and synthesis.
- Resolution-Agnostic: NOIR maintains highly consistent performance across different resolutions, overcoming traditional methods' dependency on grid resolution.
- \(\epsilon\)-ReNO Validation: Experiments indicate that NOIR satisfies the \(\epsilon\)-ReNO property empirically, with small cross-resolution (aliasing) errors.
- Modular Design: The input INR, output INR, and NO are trained independently, enabling the INR to be reused across different tasks.
- Simple Architecture: The NO module performs function space mapping using only an MLP with residual connections.
Limitations
- Performance Gap at Full Resolution: In full-resolution settings for segmentation and shape completion tasks, NOIR performs 1–11% lower than the best grid-based methods.
- Dependency on Supervised Learning: Requires paired labeled data, making it inapplicable to unsupervised scenarios (e.g., registration).
- Coupled Latent Dimension and NO Complexity: High-dimensional latent spaces require larger datasets and more complex NO architectures.
- Insufficient Fine-grained Detail Recovery: Recovery of fine geometric features, such as the orbital region in shape completion, is relatively poor.
- Complex Training Workflow: Requires multi-stage training for the input INR, output INR, and NO, making the overall workflow cumbersome.
Rating
- Novelty: ⭐⭐⭐⭐⭐ — Systematically introduces INR + Neural Operator to medical image computation, representing a paradigm-level innovation.
- Experimental Thoroughness: ⭐⭐⭐⭐ — Covering 4 tasks across 5 datasets with complete ablation and theoretical validation, although the dataset scales are relatively small.
- Writing Quality: ⭐⭐⭐⭐ — Clear theoretical modeling and rigorous mathematical derivations.
- Value: ⭐⭐⭐⭐ — Resolution robustness offers clinical value, but the performance gap at full resolution limits immediate practical deployment.