Thermal3D-GS: Physics-induced 3D Gaussians for Thermal Infrared Novel-view Synthesis
Conference: ECCV 2024
arXiv: 2409.08042
Code: Project Page
Area: 3D Vision
Keywords: Thermal Infrared Imaging, Novel-view Synthesis, 3D Gaussian Splatting, Physical Modeling, Atmospheric Transmission
TL;DR
This paper proposes Thermal3D-GS, which uses neural networks to model atmospheric transmission and thermal conduction and adds a temperature consistency constraint, achieving high-quality novel-view synthesis of thermal infrared images. It also establishes TI-NSD, the first large-scale dataset for thermal infrared novel-view synthesis.
Background & Motivation
Thermal infrared imaging offers all-weather imaging capability and strong penetration, which gives it significant advantages at night and in harsh weather. However, directly applying visible-light novel-view synthesis methods to thermal infrared images causes two specific problems: (1) Floaters: atmospheric transmission attenuates the radiation of the same object differently across views, so 3D-GS learns spurious Gaussians to compensate; (2) Blurry edges: heat conduction between objects alters the temperature gradients at boundaries, and averaging across multiple frames loses edge information.
Method
Overall Architecture
Thermal3D-GS builds on the 3D-GS framework and adds two physics-driven modules: (1) the Atmospheric Transmission Field (ATF), which models atmospheric attenuation; and (2) the Thermal Conduction Module (TCM), which models the effect of thermal conduction on edges. A temperature discontinuity loss is introduced as an additional constraint.
Key Designs
Atmospheric Transmission Field (ATF): Models radiation attenuation based on the Bouguer-Lambert-Beer law \(I = I_0 e^{-\mu(\lambda)d}\). An MLP (depth 8, hidden dimension 256) takes the position-encoded Gaussian location \(\gamma(x)\) and capture time \(\gamma(t)\) as inputs and predicts the absorption coefficient \(\mu_{abs}\), the scattering coefficient \(\mu_{sca}\), and the propagation distance \(d\), i.e. the mapping \((\gamma(x), \gamma(t)) \mapsto (\mu_{abs}, \mu_{sca}, d)\).
This decouples attenuation effects from the geometry, allowing 3D-GS to independently learn attenuation-free geometric structures.
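The attenuation step can be illustrated with a minimal sketch. It assumes a NeRF-style sinusoidal encoding for \(\gamma(\cdot)\) (the frequency count and scaling are illustrative choices) and assumes the total extinction coefficient is the sum \(\mu_{abs} + \mu_{sca}\), which is the standard physical convention but is not confirmed here as the paper's exact formulation. The MLP itself is omitted; the coefficient values below are hypothetical stand-ins for its outputs.

```python
import math

def positional_encoding(values, num_freqs=4):
    """Map each value v to [sin(2^k v), cos(2^k v)] for k < num_freqs.

    NeRF-style encoding; num_freqs and the frequency scaling are assumptions.
    """
    out = []
    for v in values:
        for k in range(num_freqs):
            out.append(math.sin((2.0 ** k) * v))
            out.append(math.cos((2.0 ** k) * v))
    return out

def attenuate(i0, mu_abs, mu_sca, d):
    """Bouguer-Lambert-Beer law: I = I0 * exp(-mu * d).

    Assumes extinction mu = mu_abs + mu_sca (absorption plus scattering).
    """
    return i0 * math.exp(-(mu_abs + mu_sca) * d)

# Encoded inputs that a network like the ATF MLP would consume:
# a 3D Gaussian centre and a scalar capture time (values are made up).
features = positional_encoding([0.1, -0.3, 0.7]) + positional_encoding([0.25])

# Hypothetical stand-ins for the MLP outputs (mu_abs, mu_sca, d).
mu_abs, mu_sca, d = 0.02, 0.01, 10.0
observed = attenuate(1.0, mu_abs, mu_sca, d)
```

Because the exponent is negative, a longer path or a larger coefficient always lowers the observed radiance, which is the view-dependent effect the ATF is meant to absorb so that the Gaussians do not have to.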
Thermal Conduction Module (TCM): Built on the 2D heat conduction equation for the temperature field, \(\frac{\partial u}{\partial t} = \alpha \Delta u\), where \(\Delta\) is the 2D Laplacian operator and \(\alpha = k/(c\rho)\) is the thermal diffusivity. Since \(\alpha\) is non-uniform across pixels, a 3-layer convolutional network fuses the input image with its second-order gradient features to estimate a pixel-level \(\alpha\), and a residual addition compensates for the heat loss caused by thermal conduction.
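To make the role of the Laplacian and the residual addition concrete, here is a small discretized sketch of the heat equation. It is not the paper's network: the per-pixel \(\alpha\) map is a fixed, hypothetical constant rather than the output of a convolutional network, the time step is folded into \(\alpha\), and the `sign=-1.0` branch is only a first-order reversal of one diffusion step, included as an assumption about what "repairing" could look like.

```python
def laplacian(u):
    """5-point discrete 2D Laplacian with replicated (zero-flux) borders."""
    h, w = len(u), len(u[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            up = u[max(i - 1, 0)][j]
            down = u[min(i + 1, h - 1)][j]
            left = u[i][max(j - 1, 0)]
            right = u[i][min(j + 1, w - 1)]
            out[i][j] = up + down + left + right - 4.0 * u[i][j]
    return out

def residual_update(u, alpha, sign=1.0):
    """Return u + sign * alpha * laplacian(u), element-wise.

    sign=+1.0 is one explicit Euler step of du/dt = alpha * Laplacian(u),
    which blurs edges; sign=-1.0 is a first-order reversal that sharpens them.
    """
    lap = laplacian(u)
    return [
        [u[i][j] + sign * alpha[i][j] * lap[i][j] for j in range(len(u[0]))]
        for i in range(len(u))
    ]

# A cold-to-hot step edge and a hypothetical constant diffusivity map.
edge = [[0.0, 0.0, 1.0, 1.0]]
alpha = [[0.2] * 4]

blurred = residual_update(edge, alpha, sign=1.0)     # conduction softens the edge
sharpened = residual_update(edge, alpha, sign=-1.0)  # residual steepens it
```

The forward step narrows the jump across the boundary and the reversed step widens it, which is why a learned, spatially varying \(\alpha\) combined with second-order gradient features is a natural fit for restoring edges.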
Temperature Discontinuity Loss: Real temperature fields are typically smooth and continuous, so corner-like responses indicate learning anomalies. The Harris corner response is used as a weight that emphasizes the L1 loss in these abnormal regions, with an iteration-dependent decay whose threshold is \(iter_t = 5000\).
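The idea of weighting an L1 loss by a Harris response can be sketched as follows. This is an illustrative assumption, not the paper's formula: the weight form `1 + normalized response`, the 3x3 box window, and \(k = 0.04\) are all choices made for this example, and the iteration-dependent decay governed by \(iter_t\) is not modeled.

```python
def harris_response(img, k=0.04):
    """Harris response R = det(M) - k * trace(M)^2 over a 3x3 box window."""
    h, w = len(img), len(img[0])

    def clamp(v, hi):
        return min(max(v, 0), hi)

    def px(i, j):
        return img[clamp(i, h - 1)][clamp(j, w - 1)]

    # Central-difference image gradients with clamped borders.
    ix = [[(px(i, j + 1) - px(i, j - 1)) / 2.0 for j in range(w)] for i in range(h)]
    iy = [[(px(i + 1, j) - px(i - 1, j)) / 2.0 for j in range(w)] for i in range(h)]

    resp = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            sxx = syy = sxy = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    a, b = clamp(i + di, h - 1), clamp(j + dj, w - 1)
                    sxx += ix[a][b] ** 2
                    syy += iy[a][b] ** 2
                    sxy += ix[a][b] * iy[a][b]
            resp[i][j] = sxx * syy - sxy ** 2 - k * (sxx + syy) ** 2
    return resp

def weighted_l1(pred, gt):
    """Mean L1 error, up-weighted where the prediction looks corner-like.

    Hypothetical weight: 1 + max(R, 0) / max(R); equals plain L1 on smooth images.
    """
    resp = harris_response(pred)
    peak = max(max(max(r, 0.0) for r in row) for row in resp)
    total, count = 0.0, 0
    for i, row in enumerate(pred):
        for j, p in enumerate(row):
            extra = max(resp[i][j], 0.0) / peak if peak > 0 else 0.0
            total += (1.0 + extra) * abs(p - gt[i][j])
            count += 1
    return total / count

# A 6x6 prediction with a hot square in one quadrant, i.e. a sharp corner.
corner = [[1.0 if (i >= 3 and j >= 3) else 0.0 for j in range(6)] for i in range(6)]
zeros = [[0.0] * 6 for _ in range(6)]
loss = weighted_l1(corner, zeros)
```

On a perfectly flat image every response is zero and the loss reduces to ordinary L1; in the corner example the pixels near the corner receive extra weight, so the loss exceeds the unweighted mean error.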
Loss & Training
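Training adds the temperature discontinuity loss, weighted by \(\lambda_{dis}\), to the 3D-GS reconstruction objective, which has its own balancing weight \(\lambda\); both are set to 0.2, i.e. \(\lambda_{dis} = \lambda = 0.2\).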
Key Experimental Results
Quantitative Comparison on TI-NSD Dataset
Average results across 20 scenes:
| Method | Indoor PSNR↑ | Outdoor PSNR↑ | UAV PSNR↑ | Mean PSNR↑ | Mean SSIM↑ | Mean LPIPS↓ |
|---|---|---|---|---|---|---|
| Plenoxels | 22.13 | 22.15 | 25.56 | 23.28 | 0.805 | 0.390 |
| INGP-Base | 26.99 | 26.00 | 20.86 | 24.62 | 0.811 | 0.332 |
| INGP-Big | 27.46 | 26.45 | 20.82 | 24.91 | 0.812 | 0.323 |
| 3D-GS (30k) | 32.98 | 28.89 | 34.51 | 32.01 | 0.936 | 0.206 |
| Ours (30k) | 36.01 | 32.60 | 36.74 | 35.04 | 0.955 | 0.187 |
Ablation Study
| Method | Indoor PSNR↑ | Outdoor PSNR↑ | UAV PSNR↑ | Mean PSNR↑ |
|---|---|---|---|---|
| 3D-GS | 32.98 | 28.89 | 34.51 | 32.01 |
| 3D-GS + ATF | 35.12 | 31.53 | 36.65 | - |
| 3D-GS + ATF + TCM + Loss | 36.01 | 32.60 | 36.74 | 35.04 |
Key Findings
- Compared with the 3D-GS baseline, mean PSNR improves by 3.03 dB (35.04 vs. 32.01), mean SSIM rises from 0.936 to 0.955, and mean LPIPS falls from 0.206 to 0.187.
- The ATF module contributes the most: adding it alone raises PSNR by about 2.1 to 2.6 dB per scene category (for example, from 32.98 to 35.12 indoors), effectively eliminating the floater issue.
- TCM clearly improves boundary regions between high- and low-temperature objects, recovering sharp edges blurred by thermal conduction.
- In UAV scenes, where fast flight causes motion blur, the method still reaches 36.74 dB PSNR.
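The gains quoted above follow directly from the two tables; the short script below recomputes them. The numbers are copied from the tables, while the variable names are illustrative. The last line uses the standard PSNR definition, \(10 \log_{10}(\mathrm{MAX}^2/\mathrm{MSE})\), to translate a dB gain into an MSE ratio; this holds exactly for a single image pair and only approximately for PSNR values averaged over scenes.

```python
# Per-category PSNR (dB) copied from the ablation table.
baseline = {"indoor": 32.98, "outdoor": 28.89, "uav": 34.51}
with_atf = {"indoor": 35.12, "outdoor": 31.53, "uav": 36.65}
full = {"indoor": 36.01, "outdoor": 32.60, "uav": 36.74}

def gains(before, after):
    """Per-category PSNR difference in dB, rounded to two decimals."""
    return {key: round(after[key] - before[key], 2) for key in before}

atf_gain = gains(baseline, with_atf)   # contribution of ATF alone
rest_gain = gains(with_atf, full)      # additional gain from TCM + loss
full_gain = gains(baseline, full)      # total gain per category

# Mean PSNR gain reported over all 20 scenes.
mean_gain = round(35.04 - 32.01, 2)

# A gain of g dB corresponds to an MSE reduction factor of 10^(g/10).
mse_ratio = 10 ** (mean_gain / 10)
```

The ATF gain is 2.14 dB indoors and for UAV scenes and 2.64 dB outdoors, while TCM and the loss together add between 0.09 dB (UAV) and 1.07 dB (outdoor). A 3.03 dB mean gain corresponds to roughly halving the mean squared error.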
Highlights & Insights
- The first method dedicated to thermal infrared novel-view synthesis, incorporating physical imaging processes (atmospheric transmission, thermal conduction) into the 3D-GS framework.
- The TI-NSD dataset (20 scenes, 6664 frames, covering indoor/outdoor/UAV) fills the dataset gap in this field.
- Physics-guided network design: rather than solving the physical equations directly (an underdetermined problem), the method uses them to guide the structure of the networks.
Limitations & Future Work
- The method assumes that each Gaussian has a uniform attenuation coefficient, which may not hold in scenes where attenuation changes drastically.
- TCM only operates in the 2D image space, without modeling thermal conduction in 3D space.
- The temperature discontinuity loss relies on the accuracy of Harris corner detection.
Related Work & Insights
This paper pioneers the combination of thermal infrared physical characteristics with 3D reconstruction. Its core idea, physics-driven degradation modeling that decouples imaging effects from geometry, can be extended to novel-view synthesis in other specialized imaging modalities (e.g., SAR, medical imaging).
Rating
- Novelty: ⭐⭐⭐⭐⭐
- Practicality: ⭐⭐⭐⭐
- Experimental Thoroughness: ⭐⭐⭐⭐
- Writing Quality: ⭐⭐⭐⭐