7DGS: Unified Spatial-Temporal-Angular Gaussian Splatting¶
Conference: ICCV 2025 arXiv: 2503.07946 Code: https://gaozhongpai.github.io/7dgs/ Area: 3D Vision / Dynamic Scene Rendering / 3D Gaussian Splatting Keywords: 7D Gaussian, Dynamic Scene, View-Dependent, Conditional Slicing, Real-Time Rendering
TL;DR¶
This paper proposes 7DGS, which models scene elements as 7-dimensional Gaussian distributions (3D spatial + 1D temporal + 3D view direction). A conditional slicing mechanism converts 7D Gaussians into time- and view-conditioned 3D Gaussians, unifying dynamic scene rendering with view-dependent appearance. On the proposed 7DGS-PBR dataset, 7DGS achieves up to 7.36 dB PSNR gain over 4DGS while using only 15.3% of the Gaussian primitives, with real-time rendering at 401 FPS.
Background & Motivation¶
High-quality real-time rendering of dynamic scenes requires simultaneous modeling along three dimensions: (1) spatial geometry, (2) temporal dynamics, and (3) view-dependent appearance. Existing methods address only subsets of these: 4DGS handles dynamics (space + time) but ignores view-dependent effects; 6DGS handles view dependence (space + direction) but is limited to static scenes. In the real world, all three dimensions are mutually coupled — for instance, specular highlights on a moving object vary simultaneously with object position and viewing direction. No prior method addresses all three in a unified framework.
Core Problem¶
How to simultaneously model spatial, temporal, and view-directional dependencies within a unified framework, enabling real-time rendering of dynamic scenes with view-dependent appearance?
Method¶
Overall Architecture¶
Each scene element is represented as a 7D Gaussian \(\mathcal{N}(\mu, \Sigma)\), where \(\mu = [\mu_p, \mu_t, \mu_d]\) (3D position + 1D time + 3D direction) and \(\Sigma\) is a \(7 \times 7\) covariance matrix. At render time, given the current time \(t\) and viewing direction \(d\), conditional slicing produces a 3D Gaussian, which is then passed to the standard 3DGS rasterization pipeline.
Key Designs¶
- 7D Gaussian Representation: The \(7 \times 7\) covariance matrix is parameterized via a Cholesky decomposition \(\Sigma = LL^T\) to ensure positive definiteness. Cross-covariance blocks \(\Sigma_{pt}\), \(\Sigma_{pd}\), and \(\Sigma_{td}\) encode spatial-temporal-directional coupling, a critical design that enables a single Gaussian to simultaneously capture motion-induced positional changes and view-dependent appearance variations.
- Conditional Slicing Mechanism: For a given \((t, d)\), the standard conditioning formula for multivariate Gaussians slices the 7D Gaussian into a conditioned 3D Gaussian:
$$\mu_{cond} = \mu_p + \Sigma_{p,(t,d)} \, \Sigma_{(t,d)}^{-1} \begin{pmatrix} t - \mu_t \\ d - \mu_d \end{pmatrix}$$
$$\Sigma_{cond} = \Sigma_p - \Sigma_{p,(t,d)} \, \Sigma_{(t,d)}^{-1} \, \Sigma_{p,(t,d)}^T$$
The conditional opacity is modulated by temporal and directional attenuation factors: \(\alpha_{cond} = \alpha \cdot f_{temp} \cdot f_{dir}\).
- Adaptive Gaussian Refinement (AGR): Conditional slicing adjusts position and opacity but keeps the covariance shape static. AGR employs a lightweight MLP (2 layers × 64 units) to predict residual corrections for key parameters (position, time, direction, covariance), dynamically adjusting Gaussian shape based on a temporal encoding \(\gamma(t)\) to capture non-rigid deformations.
- Compatibility with the 3DGS Pipeline: The sliced conditional 3D Gaussians are fed directly into the standard 3DGS projection and rasterization pipeline, leveraging its adaptive density control and efficient rendering, with only a minimal opacity threshold (\(\tau_{min}=0.01\)) added.
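The slicing step above is ordinary multivariate Gaussian conditioning applied to the spatial block given the (time, direction) block. A minimal NumPy sketch, with function and variable names that are ours rather than from the released code:

```python
import numpy as np

def slice_7d_gaussian(mu, L, t, d):
    """Condition a 7D Gaussian on time t (scalar) and view direction d
    (3-vector), returning the mean and covariance of the resulting 3D
    spatial Gaussian. `mu` is the 7-vector [mu_p, mu_t, mu_d]; `L` is the
    learned 7x7 lower-triangular Cholesky factor, so Sigma = L @ L.T is
    positive (semi-)definite by construction."""
    Sigma = L @ L.T
    mu_p, mu_td = mu[:3], mu[3:]          # spatial vs. (time, direction) parts
    S_pp = Sigma[:3, :3]                  # Sigma_p (3x3)
    S_p_td = Sigma[:3, 3:]                # cross-covariance Sigma_{p,(t,d)} (3x4)
    S_td = Sigma[3:, 3:]                  # Sigma_{(t,d)} (4x4)
    delta = np.concatenate(([t], d)) - mu_td
    gain = S_p_td @ np.linalg.inv(S_td)   # 3x4 conditioning ("regression") matrix
    mu_cond = mu_p + gain @ delta         # conditioned 3D mean
    Sigma_cond = S_pp - gain @ S_p_td.T   # Schur complement: conditioned 3D covariance
    return mu_cond, Sigma_cond
```

Note that when the cross-covariance blocks are zero the slice degenerates to the plain spatial Gaussian, and any nonzero coupling can only shrink the conditional spatial covariance; this is what lets a single primitive move and reshape with \(t\) and \(d\).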
Loss & Training¶
- Same loss function as 3DGS (L1 + D-SSIM)
- Temporal densification: splitting triggered when spatial-temporal covariance \(\Sigma_{pt}\) magnitude \(> 0.05\) and temporal scale \(> 0.25\)
- Single V100 (16 GB), Adam optimizer
- AGR training begins after 3,000 steps; \(\lambda_t\) and \(\lambda_d\) become trainable after 15,000 steps
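The paper describes AGR only at a high level (a 2-layer × 64-unit MLP over a temporal encoding \(\gamma(t)\), predicting residuals for position, time, direction, and covariance). The sketch below is a guess at that shape; the encoding width, output split, and all names are assumptions, not the authors' implementation:

```python
import numpy as np

def gamma(t, n_freqs=6):
    """Sinusoidal temporal encoding gamma(t); the frequency count is assumed."""
    freqs = (2.0 ** np.arange(n_freqs)) * np.pi
    return np.concatenate([np.sin(freqs * t), np.cos(freqs * t)])

def agr_residuals(t, params):
    """Tiny 2-layer x 64-unit ReLU MLP predicting residual corrections for a
    7D Gaussian's parameters. The output layout (3 position + 1 time +
    3 direction + 7 covariance-scale values) is hypothetical."""
    h = np.maximum(params["W1"] @ gamma(t) + params["b1"], 0.0)  # hidden layer 1
    h = np.maximum(params["W2"] @ h + params["b2"], 0.0)         # hidden layer 2
    out = params["W3"] @ h + params["b3"]                        # linear residual head
    return {"d_pos": out[:3], "d_t": out[3:4],
            "d_dir": out[4:7], "d_cov": out[7:]}
```

Because the head is a residual predictor, initializing its final layer to zero makes AGR a no-op at the start of training, which is consistent with enabling it only after 3,000 warm-up steps.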
Key Experimental Results¶
7DGS-PBR Dataset (Dynamic + View-Dependent)¶
| Method | PSNR↑ | SSIM↑ | LPIPS↓ | #Points | FPS |
|---|---|---|---|---|---|
| 4DGS | 27.79 | 0.934 | 0.079 | 641K | 193 |
| 7DGS | 32.50 | 0.958 | 0.051 | 98K | 175 |
| 7DGS (w/o AGR) | 31.77 | 0.955 | 0.055 | 88K | 376 |
On average, 7DGS gains +4.71 dB PSNR over 4DGS while using only 15.3% of the Gaussian primitives (98K vs. 641K); the heart1 scene gains +8.18 dB.
D-NeRF Dataset (Synthetic Dynamic)¶
| Method | PSNR↑ |
|---|---|
| 4DGaussians | 33.30 |
| 4DGS | 33.21 |
| 7DGS | 34.34 |
Technicolor Dataset (Real Multi-View)¶
| Method | PSNR↑ | SSIM↑ |
|---|---|---|
| Ex4DGS | 33.49 | 0.917 |
| STG | 33.23 | 0.912 |
| 7DGS | 33.58 | 0.912 |
Ablation Study: w/o AGR vs. w/ AGR¶
- 7DGS-PBR: 31.77 vs. 32.50 (+0.73 dB), with FPS dropping from 376 to 175
- D-NeRF: 33.26 vs. 34.34 (+1.08 dB)
- AGR contributes substantially to quality at the cost of rendering speed
Highlights & Insights¶
- Elegant unification: The 7D Gaussian naturally encodes spatial-temporal-directional cross-covariances, yielding a mathematically complete and intuitively grounded representation
- Elegance of conditional slicing: Reduces the high-dimensional problem to a standard 3D problem, remaining fully compatible with existing pipelines
- Remarkable parameter efficiency: Surpassing 4DGS by 4.7 dB with only 15% of the Gaussian count — a testament to the compressive power of high-dimensional representations
- View-dependent effect breakthrough: On scenes such as hearts, clouds, and flames in 7DGS-PBR, view-dependent effects that 4DGS entirely fails to handle are effectively captured by 7DGS
- Flexible fallback mode: Removing AGR yields close to 400 FPS, suitable for speed-critical applications
Limitations & Future Work¶
- The AGR MLP introduces additional computational overhead (FPS drops from 376 to 175)
- Advantages on real-world data such as Technicolor are relatively modest
- The 7D covariance has 28 independent parameters (Cholesky lower triangle), making training less stable than 3DGS
- Explicit motion modeling strategies such as keyframe interpolation have not been incorporated
Related Work & Insights¶
- 4DGS: 4D only (space + time); cannot handle view-dependent effects. 7DGS outperforms it across all datasets
- 6DGS: 6D (space + direction) but static only. 7DGS adds the temporal dimension to unify dynamics and view dependence
- Ex4DGS: Models motion explicitly via keyframe interpolation. Results on Technicolor are comparable, suggesting complementary strengths
- SSS / 3D-HGS: Improve kernel functions. 7DGS pursues dimensional extension — an orthogonal direction that could be combined with these approaches
The conceptual progression 3DGS (3D) → 4DGS (4D) → 6DGS (6D) → 7DGS (7D) exemplifies an elegant "unification through dimensionality" paradigm. Conditional slicing is grounded in elementary multivariate Gaussian conditioning, yet proves remarkably powerful. This framework paves the way for even higher-dimensional representations — e.g., 8D incorporating wavelength for spectral rendering.
Rating¶
- Novelty: ⭐⭐⭐⭐⭐ — The 7D unified representation combined with conditional slicing is elegant and natural, filling the gap between 4DGS and 6DGS
- Experimental Thoroughness: ⭐⭐⭐⭐ — Three datasets, SOTA comparisons, and ablations; additional real-world validation would strengthen the work
- Writing Quality: ⭐⭐⭐⭐⭐ — Mathematical derivations are rigorous and complete; the logical progression from 3DGS to 6DGS to 7DGS is clearly articulated
- Value: ⭐⭐⭐⭐⭐ — Unifies dynamic and view-dependent rendering in a single framework; the 15% point count with +5 dB gain demonstrates high practical utility