Neural Inverse Rendering from Propagating Light¶

Conference: CVPR 2025
arXiv: 2506.05347
Code: https://anaghmalik.com/InvProp
Area: Autonomous Driving
Keywords: Time-Resolved Inverse Rendering, LiDAR, Indirect Illumination, Radiance Caching, Physics-Based Rendering

TL;DR¶

The first method for physics-based inverse rendering from multi-view time-resolved LiDAR measurements (time-of-flight photon detection). It replaces recursive path tracing with a time-resolved radiance cache to model direct and indirect light transport, reducing normal MAE on synthetic scenes from 22.80° (FWP++) to 8.45°, while supporting novel view synthesis and relighting.

Background & Motivation¶

Background: Traditional inverse rendering methods recover geometry and materials from RGB images, but fail to handle strong indirect illumination (such as multiple bounces indoors) effectively. Time-resolved LiDAR (SPAD detectors) captures photon flight time, providing additional temporal constraints.
Limitations of Prior Work: (1) T-NeRF only models direct light, leading to severe distortion in scenes with strong indirect illumination; (2) FWP++ handles indirect light but lacks a physical model, restricting geometry reconstruction accuracy; (3) Recursive path tracing is computationally prohibitive and unsuitable for embedding in neural network optimization loops.
Key Challenge: Accurately modeling indirect light requires solving the full rendering equation, which is recursive, and differentiating through that recursion inside an optimization loop is impractical or computationally explosive.
Goal: Replace recursive path tracing with neural radiance caching to achieve differentiable physics-based inverse rendering.
Key Insight: Photon travel time in time-resolved data constrains light path lengths: it reveals not only how much light reaches the detector but also how long a path that light took, which helps separate direct from indirect light.
Core Idea: Direct/indirect light decomposition + neural radiance caching (hash encoding) + split-sum approximation to handle indirect light BRDF integration.
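The key insight can be made concrete with a small worked example. The sketch below assumes, purely for illustration, a laser and detector at the same position (the summary does not state the capture geometry), and all function names are hypothetical.

```python
C = 299_792_458.0  # speed of light in m/s

def path_length_m(arrival_time_s):
    """Total optical path length travelled by a photon detected at this time."""
    return C * arrival_time_s

def direct_arrival_time_s(surface_distance_m):
    """Round-trip time of a single-bounce return (assumes co-located source and sensor)."""
    return 2.0 * surface_distance_m / C

def extra_path_m(arrival_time_s, surface_distance_m):
    """Path length beyond the direct round trip; a positive value implies extra bounces."""
    return path_length_m(arrival_time_s) - 2.0 * surface_distance_m

t_direct = direct_arrival_time_s(2.0)  # surface 2 m away: about 13.3 ns
t_late = t_direct + 5e-9               # a photon seen 5 ns later at the same pixel
print(f"direct return at {t_direct * 1e9:.2f} ns")
print(f"late photon travelled {extra_path_m(t_late, 2.0):.2f} m further")
```

A photon that arrives later than the direct return for the same pixel must have taken a longer, multi-bounce path, which is the extra constraint an RGB image cannot provide.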

Method¶

Overall Architecture¶

Multi-view time-resolved LiDAR measurements \(\to\) Neural Geometry Network (density \(\sigma\) + normal \(n\)) \(\to\) Appearance Feature Hash Encoding \(\to\) Direct Light Cache (analytic Fresnel + light source visibility) + Indirect Light Cache (split-sum approximation) \(\to\) Disney-GGX BRDF \(\to\) Volume Rendering and Transient Signal Comparison Optimization.
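The last stage of this pipeline, volume rendering into a transient signal, can be sketched as follows. This is a minimal illustration and not the authors' renderer: it assumes the per-sample optical path lengths are already known, ignores distance falloff and any sensor response, and uses hypothetical names throughout.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s

def render_transient(sigmas, radiances, path_lengths, deltas, n_bins, bin_width_s):
    """Composite per-sample radiance along one ray into a time histogram.

    sigmas, radiances, path_lengths, deltas: 1-D arrays, one entry per ray sample.
    path_lengths: total optical path (source -> sample -> sensor) in metres.
    """
    alphas = 1.0 - np.exp(-sigmas * deltas)                    # opacity of each sample
    trans = np.cumprod(np.concatenate([[1.0], 1.0 - alphas[:-1]]))  # transmittance
    weights = trans * alphas                                   # volume rendering weights
    bins = np.floor(path_lengths / C / bin_width_s).astype(int)  # arrival-time bin
    hist = np.zeros(n_bins)
    valid = (bins >= 0) & (bins < n_bins)
    # np.add.at accumulates correctly even when several samples share a bin.
    np.add.at(hist, bins[valid], (weights * radiances)[valid])
    return hist, weights

# One empty sample followed by one opaque sample whose path length is 3 m.
hist, weights = render_transient(
    sigmas=np.array([0.0, 1e6]),
    radiances=np.array([5.0, 2.0]),
    path_lengths=np.array([1.0, 3.0]),
    deltas=np.array([0.01, 0.01]),
    n_bins=32,
    bin_width_s=1e-9,
)
print(np.argmax(hist), hist.sum())
```

The rendered histogram is what would be compared against the measured transient during optimization.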

Key Designs¶

Time-Resolved Radiance Caching
- Function: Replaces recursive path tracing to efficiently model indirect light.
- Mechanism: Splits the cached radiance into two terms: direct \(L_o^{cache,dir}\) and indirect \(L_o^{cache,indir}\). Direct light is computed with the analytic BRDF and the known light source positions. Indirect light uses the split-sum approximation: \(L_o^{indir} = f_\Omega^{indir}(f^{app}, n, \omega_o) \cdot L_{i,\Omega}^{indir}(f^{app}, x_\ell, n, \omega_o)\), where the first factor stands for the integrated BRDF, the second for the integrated incident indirect radiance, and both are approximated by MLPs.
- Design Motivation: Radiance caching avoids recursive solving: the cache stores radiance as a continuous function of space, direction, and time, so each evaluation is a single cache query rather than a recursive trace.
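The split-sum factorization itself can be checked numerically. The sketch below is not the paper's model: instead of MLPs it estimates both factors by Monte Carlo for a hypothetical Phong-style lobe (a stand-in for the real BRDF) and compares the product against the full integral. All names are illustrative.

```python
import numpy as np

def sample_uniform_hemisphere(n, rng):
    """Uniform directions on the hemisphere around +z (pdf = 1 / (2*pi))."""
    z = rng.random(n)
    phi = 2.0 * np.pi * rng.random(n)
    r = np.sqrt(1.0 - z ** 2)
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=-1)

def toy_lobe_brdf(dirs, lobe_axis, exponent=20.0):
    """Hypothetical Phong-style lobe standing in for the real BRDF."""
    cos_lobe = np.clip(dirs @ lobe_axis, 0.0, None)
    return (exponent + 2.0) / (2.0 * np.pi) * cos_lobe ** exponent

def full_and_split(radiance_fn, n=200_000, seed=0):
    rng = np.random.default_rng(seed)
    dirs = sample_uniform_hemisphere(n, rng)
    f = toy_lobe_brdf(dirs, np.array([0.6, 0.0, 0.8]))
    cos_theta = dirs[:, 2]
    L = radiance_fn(dirs)
    full = 2.0 * np.pi * np.mean(f * cos_theta * L)   # integral of f * L * cos
    brdf_factor = 2.0 * np.pi * np.mean(f * cos_theta)  # integral of f * cos
    radiance_factor = np.sum(f * L) / np.sum(f)         # lobe-weighted mean radiance
    return full, brdf_factor * radiance_factor

full_c, split_c = full_and_split(lambda d: np.full(len(d), 2.0))  # constant radiance
full_v, split_v = full_and_split(lambda d: 1.0 + d[:, 0])         # smoothly varying
print(full_c, split_c)
print(full_v, split_v)
```

The two estimates agree when the incident radiance is constant and differ slightly when it varies across the lobe, which is the error the approximation accepts in exchange for avoiding the joint integral.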
Direct/Indirect Light Decomposition
- Function: Accurately models direct light using physical BRDF models.
- Mechanism: Direct light is calculated using the full Disney-GGX BRDF: \(L_o^{dir} = f^{dir}(f^{app}, n, \omega_\ell, \omega_o) L_i^{dir}(x', \omega_\ell, \tau)(n \cdot \omega_\ell)\), while indirect light is approximated via split-sum because its integral over incoming directions has no closed form.
- Design Motivation: Direct light has an analytical form (known light source positions), making exact modeling superior to approximation; indirect light integration is complex and can only be approximated.
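As a rough illustration of the direct term \(f^{dir} L_i^{dir} (n \cdot \omega_\ell)\), the sketch below evaluates it for a single point light. It makes several assumptions not taken from the source: a simplified metallic-roughness GGX model in place of the full Disney-GGX BRDF, a scalar albedo, inverse-square falloff, and a visibility value supplied by the caller. The function names are hypothetical.

```python
import numpy as np

C = 299_792_458.0  # speed of light in m/s

def ggx_brdf(n, l, v, albedo, roughness, metallic):
    """Simplified GGX BRDF: Lambert diffuse plus microfacet specular."""
    h = (l + v) / np.linalg.norm(l + v)                    # half vector
    n_l, n_v = max(n @ l, 1e-6), max(n @ v, 1e-6)
    n_h, v_h = max(n @ h, 0.0), max(v @ h, 0.0)
    alpha = roughness ** 2
    d = alpha ** 2 / (np.pi * (n_h ** 2 * (alpha ** 2 - 1.0) + 1.0) ** 2)  # GGX distribution
    f0 = (1.0 - metallic) * 0.04 + metallic * albedo       # base reflectance
    f = f0 + (1.0 - f0) * (1.0 - v_h) ** 5                 # Schlick Fresnel
    k = alpha / 2.0
    g = (n_l / (n_l * (1.0 - k) + k)) * (n_v / (n_v * (1.0 - k) + k))  # Smith-Schlick term
    specular = d * f * g / (4.0 * n_l * n_v)
    diffuse = (1.0 - metallic) * albedo / np.pi
    return diffuse + specular

def direct_radiance(x, n, cam_pos, light_pos, intensity, visibility,
                    albedo, roughness, metallic):
    """Outgoing direct radiance at x and its light -> x -> camera travel time."""
    to_light, to_cam = light_pos - x, cam_pos - x
    d_light, d_cam = np.linalg.norm(to_light), np.linalg.norm(to_cam)
    l, v = to_light / d_light, to_cam / d_cam
    incident = visibility * intensity / d_light ** 2       # inverse-square falloff (assumed)
    cos_term = max(n @ l, 0.0)
    radiance = ggx_brdf(n, l, v, albedo, roughness, metallic) * incident * cos_term
    return radiance, (d_light + d_cam) / C

L_dir, tau = direct_radiance(
    x=np.zeros(3), n=np.array([0.0, 0.0, 1.0]),
    cam_pos=np.array([0.0, 0.0, 2.0]), light_pos=np.array([0.0, 0.0, 2.0]),
    intensity=4.0, visibility=1.0, albedo=0.5, roughness=1.0, metallic=0.0,
)
print(L_dir, tau)
```

Because every quantity here is analytic in the light position, moving the light only changes the inputs to this function, which is what makes relighting possible for the direct term.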
Multi-Resolution Hash Encoding
- Function: Efficiently represents spatially-varying appearance features.
- Mechanism: \(f^{app} = \mathcal{H}^{app}(x)\), leveraging multi-level hash encoding to capture material changes across different scales.
- Design Motivation: Hash encoding has been shown in Instant-NGP to be highly efficient for spatial features.
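A multi-resolution hash encoding in the style of Instant-NGP can be sketched in a few lines of NumPy. This is a simplified illustration, not the paper's implementation: it hashes every level (Instant-NGP indexes coarse levels directly), uses fixed rather than learned tables, and the default resolution and growth values are arbitrary placeholders.

```python
import numpy as np

# Primes used by the Instant-NGP spatial hash.
PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)

def hash_corners(ijk, table_size):
    """Spatial hash of integer grid coordinates, shape (..., 3) -> (...)."""
    c = ijk.astype(np.uint64) * PRIMES
    return ((c[..., 0] ^ c[..., 1] ^ c[..., 2]) % np.uint64(table_size)).astype(np.int64)

def hash_encode(x, tables, base_res=16, growth=1.5):
    """Encode points x in [0, 1]^3 (shape (N, 3)) using one feature table per level."""
    feats = []
    for level, table in enumerate(tables):
        res = int(base_res * growth ** level)       # grid resolution at this level
        pos = x * res
        i0 = np.floor(pos).astype(np.int64)          # lower corner of the enclosing cell
        frac = pos - i0                              # position inside the cell
        out = np.zeros((x.shape[0], table.shape[1]))
        for corner in np.ndindex(2, 2, 2):           # the 8 cell corners
            offset = np.array(corner)
            w = np.prod(np.where(offset == 1, frac, 1.0 - frac), axis=-1)  # trilinear weight
            out += w[:, None] * table[hash_corners(i0 + offset, len(table))]
        feats.append(out)
    return np.concatenate(feats, axis=-1)            # concatenate features of all levels

rng = np.random.default_rng(0)
tables = [rng.normal(scale=1e-2, size=(2 ** 12, 2)) for _ in range(4)]
f_app = hash_encode(rng.random((5, 3)), tables)
print(f_app.shape)
```

Coarse levels capture slowly varying material properties and fine levels capture sharp changes, so the concatenated feature plays the role of \(f^{app}\) in the equations above.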

Loss & Training¶

\(\mathcal{L} = \mathcal{L}_{data} + \lambda_{cache} \mathcal{L}_{cache} + \lambda_{dir} \mathcal{L}_{dir} + \lambda_{indir} \mathcal{L}_{indir} + \text{regularizers}\), where the regularizers are normal smoothness, depth distortion, and mask terms.
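The structure of this objective can be sketched as a weighted sum. Everything specific below is an assumption made for illustration: the summary does not define the individual terms, so the data term is shown as a mean squared error between rendered and measured transients, the remaining terms are passed in as precomputed scalars, and the default weights are placeholders.

```python
import numpy as np

def total_loss(rendered, measured, cache_term, dir_term, indir_term, reg_terms,
               lambda_cache=1.0, lambda_dir=1.0, lambda_indir=1.0):
    """Weighted sum mirroring the objective; all weights are placeholder values."""
    data_term = np.mean((rendered - measured) ** 2)  # assumed form of the data loss
    return (data_term
            + lambda_cache * cache_term
            + lambda_dir * dir_term
            + lambda_indir * indir_term
            + sum(reg_terms))                        # e.g. normal, distortion, mask

measured = np.array([0.0, 1.0, 3.0, 1.0])
rendered = np.array([0.0, 1.0, 2.0, 1.0])
loss = total_loss(rendered, measured, cache_term=0.2, dir_term=0.1,
                  indir_term=0.1, reg_terms=[0.05, 0.05],
                  lambda_cache=0.5)
print(loss)
```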

Key Experimental Results¶

Main Results¶

Method	Synthetic PSNR↑	Synthetic Normal MAE↓	Synthetic Depth L1↓
T-NeRF	22.44	28.00°	0.59
FWP++	29.00	22.80°	0.47
Ours	30.99	8.45°	0.21

Ablation Study¶

Setting	Key Observations
Direct light only	Severe distortion in scenes with strong indirect light
w/o split-sum	Degraded indirect light modeling
w/o time-resolved	Loss of light path length constraints

Key Findings¶

Normal error is about 2.7x lower than FWP++ (22.80° \(\to\) 8.45° MAE, a reduction of roughly 63%), a key advantage of the physical model.
On real captured data, PSNR is slightly lower than FWP++ (27.39 vs 28.45), possibly due to calibration errors.
Supports relighting (re-rendering after changing light source positions), which FWP++ cannot do.

Highlights & Insights¶

First combination of time-resolved LiDAR with physics-based inverse rendering: Photon travel time provides constraints unattainable via conventional RGB.
Radiance caching replaces path tracing: The recursive light transport computation becomes a differentiable neural network query, which keeps it cheap enough to run inside the optimization loop.
Supports relighting: Obtains physical material parameters (albedo, roughness, metallic) and enables re-rendering under arbitrary illumination conditions.

Limitations & Future Work¶

Requires specialized hardware (picosecond lasers + SPAD detectors), making it unsuitable for consumer-grade devices.
Calibration errors in real data directly affect reconstruction quality.
The Disney-GGX BRDF cannot model all materials (e.g., transparency, subsurface scattering).
Relighting requires fine-tuning of the direct/indirect loss terms, a step that is not fully automated.

vs T-NeRF: T-NeRF models only direct light and fails completely in the Cornell Box scene, where indirect illumination is strong.
vs FWP++: FWP++ uses a non-physical model, which leads to poor geometric accuracy (normal MAE 22.80°), and it does not support relighting.

Rating¶

Novelty: ⭐⭐⭐⭐⭐ First combination of time-resolved LiDAR and physics-based inverse rendering.
Experimental Thoroughness: ⭐⭐⭐⭐ Synthetic + real data + multiple metrics, but limited number of scenes.
Writing Quality: ⭐⭐⭐⭐ Rigorous physical derivations.
Value: ⭐⭐⭐⭐ Opens up new directions for inverse rendering in indirect illumination scenes.