Regularizing INR with Diffusion Prior for Self-Supervised 3D Reconstruction of Neutron CT Data

Conference: CVPR 2025
arXiv: 2603.10947
Code: To be released
Area: 3D Vision / CT Reconstruction
Keywords: Implicit Neural Representation, Diffusion Prior, Neutron CT, Sparse-view Reconstruction, Self-supervised 3D Reconstruction

TL;DR

This paper proposes DINR (Diffusive INR), which combines an implicit neural representation (INR, here a SIREN) with a pretrained diffusion model prior. At each DDIM timestep, a proximal loss regularizes the INR reconstruction toward the diffusion denoising output. On sparse-view neutron CT (down to 4-5 views), DINR achieves higher PSNR than FBP and pure INR and is competitive with DD3IP and classical MBIR (qGGMRF), surpassing both in PSNR at the sparsest view counts.

Background & Motivation

Background: Neutron CT is an important imaging modality for characterizing sample volumes through their hydrogen distribution (e.g., fuel cells, lithium-ion batteries, concrete structures). However, low neutron flux leads to long exposure times, so sparse-view reconstruction is needed to shorten acquisition.

Limitations of Prior Work: FBP produces severe artifacts under sub-Nyquist sampling; MBIR with handcrafted priors (e.g., TV/qGGMRF) requires extensive parameter tuning and has limited expressive power; pure INR (SIREN) lacks strong image priors, resulting in unstable high-frequency reconstruction.

Key Challenge: Diffusion models can model complex image priors, but directly applying them to posterior sampling in inverse problems (e.g., DD3IP/SCD) does not fully exploit data consistency. Conversely, INR can flexibly integrate forward models but lacks learned priors.

Goal: How to combine the strong generative power of diffusion priors with the data consistency advantages of INR to achieve high-quality sparse-view neutron CT reconstruction?

Key Insight: Based on the modular design of the DD3IP framework, INR is used to replace the original data-driven inverse problem solver (DIS) within each DDIM timestep, incorporating the diffusion denoising estimate via a proximal loss.

Core Idea: Embed the INR as a differentiable inverse solver within the DD3IP diffusion framework, so that the diffusion prior guides the INR online through proximal regularization.

Method

Overall Architecture

DINR operates within the DD3IP framework: it initializes the INR parameters \(\phi_T\) (using pure data consistency); in the reverse diffusion loop from \(t=T\) to \(t=1\), it first adapts the diffusion model weights \(\theta_{t-1}\) (SCD) at each step to obtain the denoised estimate \(\hat{x}_t\). Then, it optimizes the INR parameters \(\phi_{t-1}\) using a proximal loss, and finally advances to the next step via DDIM sampling.
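
The last stage of each iteration, the DDIM advance, can be sketched as below. This is a minimal NumPy sketch of the standard deterministic DDIM update (eta = 0), not the authors' code. The function name `ddim_step`, the cumulative schedule values `alpha_bar_t` and `alpha_bar_prev`, and the choice of which clean-image estimate is passed as `x0_hat` are all assumptions made for illustration.

```python
import numpy as np

def ddim_step(x_t, x0_hat, alpha_bar_t, alpha_bar_prev):
    """One deterministic DDIM update (eta = 0).

    x_t            : current noisy sample
    x0_hat         : current estimate of the clean image (assumed given)
    alpha_bar_t    : cumulative signal level at step t, in (0, 1)
    alpha_bar_prev : cumulative signal level at step t-1, in (0, 1)
    """
    # Noise implied by x_t and the clean-image estimate.
    eps_hat = (x_t - np.sqrt(alpha_bar_t) * x0_hat) / np.sqrt(1.0 - alpha_bar_t)
    # Re-noise the clean estimate to the lower noise level of step t-1.
    return np.sqrt(alpha_bar_prev) * x0_hat + np.sqrt(1.0 - alpha_bar_prev) * eps_hat

# Toy check: if x0_hat is the exact clean image, the step reproduces
# the forward-noised sample at the previous noise level.
rng = np.random.default_rng(0)
x0 = rng.random((8, 8))
eps = rng.standard_normal((8, 8))
a_t, a_prev = 0.5, 0.8  # illustrative schedule values
x_t = np.sqrt(a_t) * x0 + np.sqrt(1.0 - a_t) * eps
x_prev = ddim_step(x_t, x0, a_t, a_prev)
expected = np.sqrt(a_prev) * x0 + np.sqrt(1.0 - a_prev) * eps
```

In this toy setup `x_prev` matches `expected` up to floating-point error, which is a convenient sanity check when wiring a sampler into a reconstruction loop.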

Key Designs

  1. Proximal INR Loss Function:

    • Function: Adds a proximal regularization term, which pulls the INR output toward the diffusion denoising output, to the standard data-consistency term (projection-domain MSE).
    • Formula: \(\mathcal{L}_\phi(S, y, \hat{x}_t, \rho) = \text{MSE}(A F_\phi(S, A^*y), y) + \rho \cdot \text{MSE}(\hat{x}_t, F_\phi(S, A^*y))\)
    • Design Motivation: \(\rho\) controls the strength of the diffusion prior. At initialization, \(\rho=0\) (pure data fitting); in subsequent timesteps, the diffusion estimate becomes progressively cleaner and provides stronger prior guidance.
  2. INR Architecture (SIREN + FBP Input):

    • Function: Maps 3D coordinates to attenuation coefficients using SIREN (sine-activated MLP).
    • Mechanism: Accepts the FBP reconstruction \(A^*y\) as an additional input channel to provide an initial estimate and accelerate convergence.
    • Design Motivation: Dual inputs of coordinates and FBP allow the INR to obtain both precise coordinate localization and coarse structural information.
  3. Noise Injection Scaling \(\omega\):

    • Function: Controls the relative scale of the FBP reconstruction \(A^*y\) and noise \(\epsilon\) during reverse diffusion initialization.
    • Formula: \(x_T \leftarrow \sqrt{\alpha_T}\, A^*y + \omega \sqrt{1-\alpha_T}\, \epsilon\)
    • Design Motivation: \(\omega\) acts as a tunable parameter to balance the low-frequency initial estimate with random exploration.
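
The three designs above can be sketched together in a few lines of NumPy. This is a toy illustration under stated assumptions, not the paper's implementation: the projector `A` is a small random matrix standing in for the CT forward operator, `A.T @ y` is an unfiltered back-projection used as a crude stand-in for the FBP input \(A^*y\), the FBP value is concatenated to the coordinates as a fourth input feature, and the layer sizes and `w0 = 30` are illustrative choices following the usual SIREN setup. All function names are hypothetical.

```python
import numpy as np

def init_siren(sizes, w0, rng):
    """Weights for a sine-activated MLP, following the SIREN init scheme."""
    params = []
    for i, (n_in, n_out) in enumerate(zip(sizes[:-1], sizes[1:])):
        bound = 1.0 / n_in if i == 0 else np.sqrt(6.0 / n_in) / w0
        W = rng.uniform(-bound, bound, size=(n_in, n_out))
        params.append((W, np.zeros(n_out)))
    return params

def siren_forward(coords, fbp_vals, params, w0=30.0):
    """Map (x, y, z, FBP value) per point to one attenuation value per point."""
    h = np.concatenate([coords, fbp_vals[:, None]], axis=1)  # shape (N, 4)
    for W, b in params[:-1]:
        h = np.sin(w0 * (h @ W + b))  # sine activation
    W, b = params[-1]
    return (h @ W + b)[:, 0]  # linear output layer

def mse(a, b):
    return float(np.mean((a - b) ** 2))

def proximal_loss(A, x_inr, y, x_hat, rho):
    """Projection-domain MSE plus rho-weighted MSE to the diffusion estimate."""
    return mse(A @ x_inr, y) + rho * mse(x_hat, x_inr)

def init_x_T(fbp, alpha_T, omega, rng):
    """x_T = sqrt(alpha_T) * FBP + omega * sqrt(1 - alpha_T) * eps."""
    eps = rng.standard_normal(fbp.shape)
    return np.sqrt(alpha_T) * fbp + omega * np.sqrt(1.0 - alpha_T) * eps

rng = np.random.default_rng(0)
n_vox, n_meas = 64, 24
A = rng.standard_normal((n_meas, n_vox)) / np.sqrt(n_vox)  # stand-in projector
x_true = rng.random(n_vox)
y = A @ x_true
fbp = A.T @ y  # crude stand-in for A^* y
coords = rng.uniform(-1.0, 1.0, size=(n_vox, 3))

params = init_siren([4, 32, 32, 1], w0=30.0, rng=rng)
x_inr = siren_forward(coords, fbp, params)
x_hat = x_true.copy()  # pretend the diffusion model returned a perfect estimate
loss_init = proximal_loss(A, x_inr, y, x_hat, rho=0.0)  # pure data fitting
loss_prox = proximal_loss(A, x_inr, y, x_hat, rho=0.1)  # with diffusion prior
x_T = init_x_T(fbp, alpha_T=0.01, omega=1.0, rng=rng)
```

Setting `rho=0.0` reduces the loss to the data-consistency term alone, mirroring the initialization described above, and setting `omega=0.0` makes `x_T` a pure rescaling of the FBP stand-in.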

Loss & Training

  • The diffusion model is pretrained on synthetic ellipsoid data (without needing real neutron CT data).
  • SCD adapts the diffusion model weights at each timestep by minimizing \(\text{MSE}(A D_\theta(x_t|y), y)\).
  • The INR is re-optimized at each step using the proximal loss; the forward operator is a distance-driven parallel-beam projector implemented with Tomosipo.
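
To make the per-step re-optimization concrete, the sketch below minimizes the proximal loss by plain gradient descent. It deliberately swaps in a simpler setting than the paper: the unknowns are the voxel values themselves rather than INR parameters, the projector `A` is a small random matrix rather than a Tomosipo operator, and `x_hat` is a synthetic stand-in for the diffusion estimate. All names and constants are assumptions for illustration.

```python
import numpy as np

def proximal_loss(A, x, y, x_hat, rho):
    return float(np.mean((A @ x - y) ** 2) + rho * np.mean((x_hat - x) ** 2))

def proximal_grad(A, x, y, x_hat, rho):
    """Gradient of the proximal loss with respect to the voxel values x."""
    m, n = A.shape
    return (2.0 / m) * (A.T @ (A @ x - y)) + (2.0 * rho / n) * (x - x_hat)

rng = np.random.default_rng(0)
m, n = 8, 32  # far fewer measurements than unknowns (sparse-view analogue)
A = rng.standard_normal((m, n)) / np.sqrt(n)
x_true = rng.random(n)
y = A @ x_true
x_hat = x_true + 0.05 * rng.standard_normal(n)  # stand-in diffusion estimate
rho = 1.0

# Step size 1/L, where L is the Lipschitz constant of the gradient.
L = (2.0 / m) * np.linalg.norm(A, 2) ** 2 + 2.0 * rho / n
step = 1.0 / L

x = np.zeros(n)
losses = []
for _ in range(300):
    losses.append(proximal_loss(A, x, y, x_hat, rho))
    x = x - step * proximal_grad(A, x, y, x_hat, rho)

# For a linear projector the minimizer is also available in closed form,
# which gives a reference to compare the iterates against.
H = (2.0 / m) * (A.T @ A) + (2.0 * rho / n) * np.eye(n)
b = (2.0 / m) * (A.T @ y) + (2.0 * rho / n) * x_hat
x_star = np.linalg.solve(H, b)
```

With only 8 measurements for 32 unknowns the data term alone leaves most directions undetermined; the proximal term is what fixes them, which is the role the diffusion estimate plays in the sparse-view regime.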

Key Experimental Results

Synthetic Data (256×256, 2 slices)

Entries are PSNR (dB) / SSIM.

| Number of Views | FBP | INR (SIREN) | DD3IP | DINR |
|---|---|---|---|---|
| 4 views | 19.31/0.08 | 14.76/0.18 | 26.17/0.25 | 26.27/0.24 |
| 8 views | 21.67/0.18 | 28.15/0.35 | 28.37/0.34 | 28.56/0.38 |
| 16 views | 25.27/0.30 | 30.34/0.54 | 31.21/0.61 | 31.30/0.63 |
| 32 views | 29.62/0.43 | 32.85/0.66 | 32.91/0.74 | 33.43/0.76 |

Real Neutron CT Data

Entries are PSNR (dB) / SSIM.

| Number of Views | FBP | MBIR (qGGMRF) | INR | DD3IP | DINR |
|---|---|---|---|---|---|
| 5 views | 19.9/0.10 | 21.02/0.04 | 20.18/0.03 | 20.89/0.06 | 21.27/0.05 |
| 9 views | 22.9/0.33 | 26.0/0.38 | 24.08/0.27 | 25.41/0.34 | 25.22/0.35 |
| 17 views | 25.91/0.55 | 28.1/0.58 | 27.3/0.54 | 28.04/0.62 | 27.56/0.62 |
| 33 views | 30.11/0.73 | 31.0/0.77 | 29.7/0.71 | 31.19/0.79 | 31.37/0.77 |

Ablation Study / ROI Analysis

| ROI Size | Observation |
|---|---|
| 8×8 to 32×32 (microstructure region) | DINR significantly outperforms other methods |
| 48×48 to 64×96 (including background) | MBIR is close to or outperforms DINR |

  • DINR achieves the best reconstruction of microstructural details (pores/boundaries), but its advantage diminishes in large homogeneous background areas.
  • This is consistent with the inherent advantage of MBIR's qGGMRF prior in smooth regions.
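
A small sketch shows why full-image and ROI metrics can disagree. It computes PSNR over a whole slice and over a cropped region on synthetic data in which all reconstruction error sits inside a small textured patch. The image content, patch location, noise level, and function names are illustrative assumptions and do not reproduce the paper's evaluation protocol.

```python
import numpy as np

def psnr(ref, rec, data_range=1.0):
    """Peak signal-to-noise ratio in dB."""
    err = np.mean((ref - rec) ** 2)
    return float(10.0 * np.log10(data_range ** 2 / err))

def roi_psnr(ref, rec, top, left, height, width, data_range=1.0):
    """PSNR restricted to a rectangular region of interest."""
    sl = (slice(top, top + height), slice(left, left + width))
    return psnr(ref[sl], rec[sl], data_range)

rng = np.random.default_rng(0)
ref = np.zeros((256, 256))
ref[112:144, 112:144] = rng.random((32, 32))  # textured "microstructure" patch

rec = ref.copy()
rec[112:144, 112:144] += 0.1 * rng.standard_normal((32, 32))  # error only in patch

full = psnr(ref, rec)
roi = roi_psnr(ref, rec, 112, 112, 32, 32)
gap_db = full - roi  # how much the flat background inflates full-image PSNR
```

Because the 32×32 patch covers 1/64 of the slice, the full-image MSE is 64 times smaller than the ROI MSE, so `gap_db` comes out at about 18 dB. The error in the region of interest is identical in both cases; only the averaging window differs.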

Key Findings

  • The diffusion model pretrained only on synthetic ellipsoids can effectively guide the reconstruction of real concrete microstructures, demonstrating OOD (out-of-distribution) adaptability.
  • The advantage of DINR is most pronounced in the ultra-sparse range (4-5 views), where data constraints are extremely weak and a strong prior is most critical.
  • The proximally regularized INR is more flexible than the conjugate gradient DIS of DD3IP, since the forward physical model can be integrated directly into the INR loss.
  • Better quantitative metrics are needed: PSNR and SSIM are not discriminative enough for evaluating microstructure reconstruction quality.

Highlights & Insights

  • Modular Diffusion-INR Fusion: The modular design of the DIS within the DD3IP framework allows INR to replace other solvers in a plug-and-play manner, which makes the framework easy to extend.
  • Synthetic Pretraining + OOD Inference: The diffusion model pretrained solely on synthetic data can guide real data reconstruction, reducing the reliance on in-domain training data.
  • Insight from ROI Analysis: Conventional full-image PSNR can obscure the true advantages of a method in crucial regions (microstructures), highlighting the need for task-oriented evaluation.

Limitations & Future Work

  • High computational overhead: both the INR parameters and the diffusion model weights must be optimized at each DDIM timestep.
  • On real data, DINR does not outperform MBIR at every view count: MBIR achieves higher PSNR at 9 and 17 views.
  • \(\rho\) and \(\omega\) require careful tuning; the authors acknowledge that a more comprehensive search could yield better results.
  • Only parallel-beam geometry was validated; it has not been extended to cone-beam or helical CT.
  • Lack of comparison with other learning-based CT reconstruction methods (e.g., end-to-end U-Net).

Related Work & Insights

  • vs DD3IP: DINR replaces the CG iterations in DD3IP with INR as the DIS, achieving better performance under ultra-sparse views.
  • vs MBIR+qGGMRF: MBIR remains competitive at moderate view counts but requires exhaustive grid search for regularization parameters; DINR is superior under ultra-sparse views and in microstructural regions.
  • vs Pure SIREN/INR: Lacking strong priors leads to high-frequency instabilities; the diffusion regularization in DINR effectively addresses this issue.
  • The work is directly relevant to researchers working on industrial or scientific CT where only a small number of views can be acquired.

Rating

  • Novelty: ⭐⭐⭐⭐ Embedding INR as DIS within DD3IP is a novel fusion approach.
  • Experimental Thoroughness: ⭐⭐⭐ The data scale is small (2 slices), and there is a lack of ablation studies and more baselines.
  • Writing Quality: ⭐⭐⭐ The method is described clearly, but the experimental analysis could be deeper.
  • Value: ⭐⭐⭐⭐ Holds practical value for the field of scientific CT reconstruction.