
Faster and Better 3D Splatting via Group Training

Conference: ICCV 2025 | arXiv: 2412.07608 | Code: Project Page | Area: 3D Vision
Keywords: 3D Gaussian Splatting, Training Acceleration, Group Training, Opacity-based Sampling, Novel View Synthesis

TL;DR

This paper proposes a Group Training strategy that accelerates 3DGS training by periodically partitioning Gaussian primitives into an "under-training group" and a "cached group," combined with Opacity-based Priority Sampling (OPS). Across four standard benchmarks, the method achieves approximately 30% training speedup while simultaneously improving rendering quality and reducing model size, and can be applied as a plug-and-play module to 3DGS and Mip-Splatting frameworks.

Background & Motivation

Core Problem

3D Gaussian Splatting (3DGS) demonstrates remarkable real-time, high-quality rendering for novel view synthesis, but training efficiency is bottlenecked by the rapidly growing number of Gaussian primitives (typically reaching millions), which substantially increases the per-iteration training cost.

Limitations of Prior Work

The most straightforward acceleration strategy is to periodically prune low-opacity Gaussians to reduce the primitive count, but this faces a dilemma:

  • A conservative threshold \(\epsilon_\alpha\) → negligible speedup
  • An aggressive threshold → severe degradation in rendering quality
  • The threshold is extremely sensitive (see Figure 2, left), making a good trade-off difficult to find

Key Intuition

Cache rather than prune — temporarily exclude some Gaussians from training while retaining them, and periodically resample to rotate cached Gaussians back into training. This reduces per-iteration computation without discarding "important" primitives.

Method

Overall Architecture

Group Training periodically partitions all Gaussian primitives into two groups during training:

\[G_{\text{Under-training}} = \{\, g_i \mid g_i \in G,\ i \in I \,\}, \quad I \subseteq \{1, 2, \ldots, |G|\}\]

\[G_{\text{Cached}} = G \setminus G_{\text{Under-training}}\]
  • Under-training group: participates in rendering and gradient optimization
  • Cached group: excluded from all computation temporarily
  • Groups are merged and resampled every 500 iterations
  • The under-training ratio (UTR) defaults to 0.6; a minimal regrouping sketch follows below
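
For concreteness, here is a minimal PyTorch sketch of one regrouping step. This is my own reconstruction of the idea rather than the authors' implementation; it assumes the per-Gaussian opacities are available as a 1-D tensor, and the `use_ops` switch covers the two sampling strategies described below.

```python
import torch

def regroup(opacities: torch.Tensor, utr: float = 0.6, use_ops: bool = True):
    """Split Gaussian indices into an under-training group and a cached group.

    opacities: 1-D tensor of per-Gaussian opacities in (0, 1].
    utr:       under-training ratio (paper default 0.6).
    use_ops:   True for Opacity-based Priority Sampling, False for Random Sampling.
    """
    n = opacities.numel()
    k = int(utr * n)
    if use_ops:
        # OPS: sampling probability proportional to opacity, p_i = alpha_i / sum_j alpha_j
        probs = opacities / opacities.sum()
    else:
        # RS: uniform sampling probability
        probs = torch.full_like(opacities, 1.0 / n)
    under_idx = torch.multinomial(probs, k, replacement=False)
    cached_mask = torch.ones(n, dtype=torch.bool, device=opacities.device)
    cached_mask[under_idx] = False
    cached_idx = cached_mask.nonzero(as_tuple=False).squeeze(1)
    return under_idx, cached_idx
```

Only the under-training indices would then be passed to the rasterizer and optimizer until the next resampling step; the cached group is left untouched in the meantime.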

Training Schedule

  • Densification phase (0–15K iter): Group Training operates normally; a global densification (merging all groups) is performed at 14.5K
  • Optimization phase (15K–30K iter): Group Training continues; a global optimization step is performed at 29K
  • Group Training begins at iteration 500, giving the initial Gaussians a short warm-up before grouping starts (see the schedule sketch below)
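
A minimal sketch of the schedule logic under my reading of the settings above; the helper names are illustrative, and exactly how the global densification and global optimization steps are interleaved with regular grouping is an assumption.

```python
GROUP_START = 500           # grouping only begins after this warm-up
RESAMPLE_EVERY = 500        # merge groups and redraw the under-training set
GLOBAL_DENSIFY_AT = 14_500  # one densification pass over all Gaussians
GLOBAL_OPT_FROM = 29_000    # final optimization over all Gaussians before 30K

def train_all_gaussians(iteration: int) -> bool:
    """True when every Gaussian should participate (no grouping this iteration)."""
    return (iteration < GROUP_START
            or iteration == GLOBAL_DENSIFY_AT
            or iteration >= GLOBAL_OPT_FROM)

def resample_groups(iteration: int) -> bool:
    """True when the cached and under-training groups should be merged and resampled."""
    return iteration >= GROUP_START and iteration % RESAMPLE_EVERY == 0
```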

Key Designs: Sampling Strategies

Random Sampling (RS)

The simplest approach — uniform random sampling. Experiments show RS already accelerates training, but may lead to excessive redundant Gaussians.

Opacity-based Priority Sampling (OPS)

OPS is the core contribution of this paper. The sampling probability is defined as:

\[p_i = \frac{\alpha_i}{\sum_{j=1}^{N} \alpha_j}\]

That is, Gaussians with higher opacity are preferentially selected into the under-training group. This is supported by two mathematical propositions:

Proposition 1: Higher Opacity Promotes Effective Densification

The gradient of the loss with respect to Gaussian position is:

\[\frac{\partial L}{\partial x_m} = o_m \sum_{\text{pixel}} \frac{\partial L}{\partial \hat{C}} \frac{\partial \hat{C}}{\partial \alpha_m} \frac{\partial G_m^{2D}}{\partial \Delta x} \frac{\partial \Delta x}{\partial x_m}\]
  • The gradient is proportional to opacity \(o_m\)
  • \(\frac{\partial \hat{C}}{\partial \alpha_m}\) increases as the expected \(o_m\) increases
  • Therefore, high-opacity Gaussians are more likely to exceed the densification threshold \(\tau_{\text{grad}}\) and are the primary contributors to densification (see the toy check below)
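
A toy 1-D check of the factorization above; this is my own illustration, not the paper's renderer, but it makes the explicit opacity factor in the positional gradient visible.

```python
import torch

def positional_grad(opacity: float, x0: float = 0.3, sigma: float = 0.5) -> float:
    """Gradient of a single Gaussian's per-pixel contribution w.r.t. its position."""
    x = torch.tensor(x0, requires_grad=True)
    # Simplified contribution: C_hat = o * G(x), with G a 1-D Gaussian kernel.
    c_hat = opacity * torch.exp(-(x ** 2) / (2 * sigma ** 2))
    c_hat.backward()
    return x.grad.item()

for o in (0.2, 0.4, 0.8):
    # The printed gradient scales linearly with opacity, so high-opacity Gaussians
    # accumulate larger positional gradients and reach tau_grad sooner.
    print(f"o = {o:.1f}  dC_hat/dx = {positional_grad(o):+.4f}")
```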

Proposition 2: Higher Opacity Accelerates Rendering

The termination condition of α-blending depends on transmittance \(T_i = \prod_{j=1}^{i-1}(1-\alpha_j)\) reaching a saturation threshold:

\[\mathbb{E}[T_N] = (1 - \mathbb{E}[o_i] \cdot \mathbb{E}[G_i^{2D}])^N\]
  • Higher opacity → faster α saturation → fewer blending steps \(N\) → faster rendering
  • Experimentally verified: as the mean opacity \(\mu_o\) increases, rendering time decreases by approximately 40% (see the back-of-the-envelope sketch below)
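
A back-of-the-envelope sketch of the transmittance argument; it is my own illustration and assumes every Gaussian along a ray contributes the same effective alpha \(\mathbb{E}[o_i]\cdot\mathbb{E}[G_i^{2D}]\), with the commonly used 3DGS termination threshold of about 1e-4.

```python
import math

def blending_steps(effective_alpha: float, threshold: float = 1e-4) -> int:
    """Smallest N with (1 - effective_alpha)^N below the termination threshold."""
    return math.ceil(math.log(threshold) / math.log(1.0 - effective_alpha))

for a in (0.05, 0.10, 0.30, 0.60):
    # Higher effective alpha saturates transmittance in far fewer blending steps.
    print(f"effective alpha = {a:.2f} -> ~{blending_steps(a):3d} blending steps")
```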

Design Summary

Caching low-opacity Gaussians (retaining high-opacity ones in the under-training group) yields three simultaneous benefits: promoting effective densification, reducing redundant primitives, and accelerating rendering.

Key Experimental Results

Main Results: 3DGS Reconstruction Efficiency and Quality (Tables 1 & 2)

| Method | Config | Mip-NeRF360 PSNR↑ | Time (min)↓ | Model Size (MB)↓ |
|---|---|---|---|---|
| 3DGS* | baseline | 27.445 | 26.7 | 792 |
| + GT w/ RS | 0–30K | 27.537 | 22.6 | 902 |
| + GT w/ OPS | 0–15K | 27.582 | 22.5 | 678 |
| + GT w/ OPS | 0–30K | 27.564 | 19.6 | 679 |

| Method | T&T PSNR↑ | Time (min)↓ | Deep Blending PSNR↑ | Time (min)↓ |
|---|---|---|---|---|
| 3DGS* | 23.697 | 15.0 | 29.586 | 23.9 |
| + GT w/ OPS (0–30K) | 23.853 | 11.0 | 29.746 | 16.9 |

Core result: OPS achieves ~27–30% training speedup across all scenes, with improved quality and 10–40% reduction in model size.

Validation on Mip-Splatting (Table 3)

| Method | T&T PSNR↑ | Time (min)↓ | Deep Blending PSNR↑ | Time (min)↓ |
|---|---|---|---|---|
| Mip-Splatting* | 23.749 | 23.0 | 29.358 | 35.1 |
| + GT w/ OPS (0–30K) | 24.156 | 18.2 | 29.788 | 24.0 |

Speedup: 21% faster on T&T, 32% faster on Deep Blending, with consistent quality improvements.

Ablation Study (Table 5, Tanks & Temples)

| Periodic Resampling | Global Densification | Global Optimization | PSNR↑ | Model Size (MB)↓ | Time (min)↓ |
|---|---|---|---|---|---|
| – | – | – | 23.697 | 434 | 15.0 |
| ✓ | – | – | 23.866 | 292 | 11.8 |
| ✓ | ✓ | – | 23.769 | 231 | 11.0 |
| ✓ | – | ✓ | 23.835 | 485 | 11.8 |
| ✓ | ✓ | ✓ | 23.853 | 384 | 11.0 |

Key Findings:

  • Periodic resampling contributes the largest speedup (15.0 → 11.8 min)
  • Global densification further reduces model size (292 → 231 MB)
  • Global optimization improves quality when combined with global densification (23.769 → 23.853)

Counter-Intuitive Findings

  1. RS increases model size yet still accelerates training: Group Training w/ RS produces more primitives, yet training is still faster, since each iteration renders and optimizes only the under-training subset; per-iteration computation, rather than total primitive count, determines training time.
  2. OPS yields both higher quality and smaller models: The simultaneous improvement in quality and reduction in model size suggests that performance gains stem from the optimization process rather than model capacity.

Highlights & Insights

  1. Simplicity with rigor: The core idea is extremely straightforward (grouping + rotation), yet the paper provides a thorough mathematical analysis (complete proofs of two propositions) explaining the underlying reasons for its effectiveness.
  2. Plug-and-play: The method integrates seamlessly into 3DGS and Mip-Splatting as a training strategy without any architectural modifications.
  3. Dual role of opacity: This work is the first to reveal that opacity simultaneously governs the effectiveness of densification and rendering speed — an underappreciated yet critical property in 3DGS.
  4. Hyperparameter robustness: In contrast to the extreme sensitivity of pruning thresholds, the Group Training cache ratio maintains stable performance over a wide range of values (Figure 2 right).

Limitations & Future Work

  1. Only validated at 30K iterations: Performance under longer training schedules or very large-scale scenes remains unexplored.
  2. GPU memory overhead: Merging all groups may increase peak memory consumption, partially mitigated by fixing spherical harmonic coefficients.
  3. Independence assumption in OPS: The proofs assume Gaussian attributes are mutually independent, which may not hold in practice.
  4. Only two frameworks tested: Generalizability to newer variants such as 2DGS and InstantSplat has not been verified.
  • Fundamental distinction from pruning: Pruning permanently removes low-opacity Gaussians, whereas Group Training only temporarily caches them with periodic rotation — preserving the opportunity for important primitives to contribute.
  • Relationship to model compression: OPS naturally produces more compact models (fewer redundant Gaussians) and is complementary to post-training compression methods.
  • Implications for 3DGS training: The opacity distribution serves as a key entry point for understanding and optimizing 3DGS behavior; future work could explore dynamically adjusting UTR or employing adaptive sampling strategies.

Rating ⭐⭐⭐⭐

  • Novelty ⭐⭐⭐⭐: A simple idea backed by rigorous theoretical analysis, yielding a surprising quality–efficiency win
  • Practicality ⭐⭐⭐⭐⭐: Plug-and-play, robust to hyperparameters, engineering-friendly
  • Experimental Thoroughness ⭐⭐⭐⭐: 4 datasets × 2 frameworks, with complete ablations and sampling-strategy comparisons
  • Writing Quality ⭐⭐⭐⭐: Proposition proofs are clearly presented and the experiments are systematically organized