3D Student Splatting and Scooping (SSS)¶

Conference: CVPR 2025
arXiv: 2503.10148
Code: https://github.com/realcrane/3D-student-splating-and-scooping
Area: 3D Vision / Novel View Synthesis / 3D Gaussian Splatting
Keywords: Student-t Distribution, Negative Components, Scooping, SGHMC, Parameter Efficiency

TL;DR¶

This work proposes SSS (Student Splatting and Scooping), which advances the 3DGS paradigm with three changes: (1) replacing Gaussian distributions with Student-t distributions as mixture components (with learnable tail thickness that varies continuously from Cauchy to Gaussian); (2) introducing negative density components (scooping by subtracting color) to extend the formulation to non-monotonic mixture models; (3) employing SGHMC sampling instead of SGD to decouple parameter optimization. SSS achieves state-of-the-art results in 6 out of 9 metrics across Mip-NeRF360, Tanks and Temples (T&T), and Deep Blending, and is highly parameter-efficient: it matches or exceeds 3DGS using as little as 18% of the component count.

Background & Motivation¶

3DGS is fundamentally an unnormalized Gaussian mixture model whose success relies on three key pillars: Gaussian distributions as components, splatting-style positive density aggregation, and SGD optimization. However, all three have room for improvement: (1) Gaussian distributions act as low-pass filters and are inefficient at modeling discontinuous targets (such as boundaries and sharp edges), requiring a vast number of components to fit sharp transitions; (2) relying solely on positive density limits expressiveness, making it impossible to "scoop out" undesired areas; (3) introducing more flexible distributions exacerbates parameter coupling, making SGD prone to local minima. Recent works (such as GES, 3DHGS, and 3DGS-MCMC) have explored these directions individually but have not unified them.

Core Problem¶

How to fundamentally improve the three foundational elements of 3DGS—distribution selection, density space, and optimization method—to make it more expressive and parameter-efficient?

Method¶

Overall Architecture¶

SSS is a comprehensive replacement for 3DGS that improves three aspects simultaneously:
- Component: Gaussian $\to$ Student-t distribution (with learnable degrees of freedom $\nu$)
- Density Space: Positive-only density $\to$ Positive + negative density (splatting + scooping)
- Optimization: SGD $\to$ SGHMC sampling (friction + noise scheduling)

Key Designs¶

Student-t Distribution as the Base Component: $T(x|\nu) = [1 + \frac{1}{\nu}(x-\mu)^T\Sigma^{-1}(x-\mu)]^{-\frac{\nu+3}{2}}$. As $\nu\to 1$, it approaches a Cauchy distribution (heavy-tailed, allowing a single component to cover a large area), and as $\nu\to\infty$, it approaches a Gaussian. Making $\nu$ learnable enables SSS to learn mixtures from an infinite family of distributions. Key property: the Student-t distribution has closed-form solutions under affine transformations and marginalization, allowing it to be used directly for projection and integration in splatting.
Negative Density Components (Scooping): $o \in [-1, 1]$, where negative opacity corresponds to color subtraction. Opacity is constrained using tanh. Key insight: A torus fitting experiment demonstrates that 2 components (1 positive + 1 negative) can fit a topology that would otherwise require 5 positive components—where the positive component covers the torus and the negative component "scoops out" the central hole.
SGHMC Sampling Optimization: Introducing the Student-t parameter $\nu$ severely intensifies parameter coupling (altering $\nu$ changes the distribution family, which affects the optimal values of $\mu$ and $\Sigma$). SGHMC decouples parameters through a friction term, using adaptive friction and noise scheduling: low-opacity components receive more friction and noise for exploration, while high-opacity components suppress friction for localized search. A burn-in stage conducts broad exploration without friction, followed by a friction-regulated exploitation stage for refinement.
Component Recycling: Low-opacity components are recycled to the positions of high-opacity components, determining the new covariance matrix by minimizing the rendering difference before and after recycling. A closed-form recycling formula under the Student-t distribution is derived (involving Beta functions).
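
The opacity-dependent friction and noise described above can be illustrated with a generic SGHMC update. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `sghmc_step`, the sigmoid gate, and the values of `lr`, `friction`, `k`, and `t` are all hypothetical, and the update uses the textbook SGHMC form in which the injected noise variance is matched to the friction term.

```python
import numpy as np

def sghmc_step(theta, v, grad, opacity, rng,
               lr=1e-2, friction=0.1, k=100.0, t=0.005):
    """One generic SGHMC update with a hypothetical opacity gate.

    theta, v, grad: arrays of shape (n_components, n_params)
    opacity:        array of shape (n_components,), values in [-1, 1]
    """
    # Gate is close to 1 for low |opacity| and close to 0 for high |opacity|
    # (assumed form; k and t are illustrative values).
    gate = 1.0 / (1.0 + np.exp(k * (np.abs(opacity) - t)))
    alpha = friction * gate[:, None]           # per-component friction
    noise_std = np.sqrt(2.0 * alpha * lr)      # noise scale tied to friction
    noise = noise_std * rng.standard_normal(theta.shape)
    # Momentum update: gradient force, friction drag, injected noise.
    v_new = v - lr * grad - alpha * v + noise
    theta_new = theta + v_new
    return theta_new, v_new

# Example: one low-opacity and one high-opacity component.
rng = np.random.default_rng(0)
theta = np.zeros((2, 3))
v = np.zeros((2, 3))
grad = np.ones((2, 3))
opacity = np.array([0.0, 1.0])
theta, v = sghmc_step(theta, v, grad, opacity, rng)
```

With this gate, the high-opacity component receives essentially no friction or noise and follows a plain momentum step, while the low-opacity component is perturbed and explores. The momentum array `v` is the extra per-parameter state that SGHMC has to store.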

Loss & Training¶

$$L = (1-\epsilon_{\text{D-SSIM}})L_1 + \epsilon_{\text{D-SSIM}}L_{\text{D-SSIM}} + \epsilon_o\sum_i|o_i|_1 + \epsilon_\Sigma\sum_i\sum_j|\lambda_{i,j}|_1$$
- RTX 4090, 45 min training (Mip-NeRF360), rendering at 71 FPS.
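
The loss above can be sketched as a weighted sum of precomputed terms. The function name `sss_loss` and the default weights are placeholders rather than values reported in the source, and the $L_1$ and D-SSIM image terms are assumed to be computed elsewhere and passed in as scalars.

```python
import numpy as np

def sss_loss(l1, d_ssim, opacity, lam,
             eps_dssim=0.2, eps_o=0.01, eps_sigma=0.01):
    """Weighted sum matching the four terms of the loss formula.

    l1, d_ssim: scalar image losses (computed elsewhere)
    opacity:    shape (n,), signed opacities o_i in [-1, 1]
    lam:        shape (n, 3), the per-component lambda_{i,j} values
    The eps_* defaults are placeholder weights, not the paper's settings.
    """
    photometric = (1.0 - eps_dssim) * l1 + eps_dssim * d_ssim
    # Absolute values matter here: opacities can be negative in SSS.
    opacity_reg = eps_o * np.abs(opacity).sum()
    scale_reg = eps_sigma * np.abs(lam).sum()
    return photometric + opacity_reg + scale_reg

loss = sss_loss(0.1, 0.3,
                np.array([0.5, -0.5]),
                np.array([[1.0, 2.0, 3.0], [1.0, 1.0, 1.0]]))
```

Because the opacity regularizer is an $L_1$ norm on signed values, it pushes both positive and negative components toward zero opacity.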

Key Experimental Results¶

Main Results¶

Dataset	Metric	3DGS	3DHGS	MCMC	SSS
Mip-NeRF360	PSNR	28.69	29.56	29.89	29.90
Mip-NeRF360	LPIPS	0.182	0.178	0.190	0.145
T&T	PSNR	23.14	24.49	24.29	24.87
T&T	LPIPS	0.183	0.169	0.190	0.138
Deep Blending	PSNR	29.41	29.76	29.67	30.07

SSS achieves the best performance in 6 out of 9 metrics and the second-best in 2. The improvement in LPIPS is particularly significant: on T&T, 0.138 is an 18.3% reduction relative to the second-best result (3DHGS, 0.169) and a 24.6% reduction relative to 3DGS (0.183).
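
Worked out from the T&T LPIPS row of the table, the relative reductions are:

$$\frac{0.169 - 0.138}{0.169} \approx 18.3\% \ \text{(vs. 3DHGS)}, \qquad \frac{0.183 - 0.138}{0.183} \approx 24.6\% \ \text{(vs. 3DGS)}$$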

Parameter Efficiency (Few Components)¶

On T&T, SSS uses ~300k components (whereas 3DGS uses 1.1–2.6M):
- SSS (468k) = 24.4 PSNR $\approx$ 3DHGS (full) = 24.49 PSNR
- This represents a component reduction of up to 82%.
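
As a quick check of the headline figure, assuming the 82% compares the 468k-component SSS run against the upper end of the 3DGS range (2.6M):

$$1 - \frac{468\text{k}}{2600\text{k}} = 1 - 0.18 = 0.82$$

This agrees with the 18% component count quoted in the TL;DR.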

Ablation Study¶

Method	T&T PSNR	SSIM	LPIPS
3DGS	23.14	0.841	0.183
SGD + Positive t-dist	23.80	0.838	0.191
SGHMC + Gaussian	24.52	0.869	0.150
SGHMC + Positive t-dist	24.53	0.864	0.155
Full SSS	24.87	0.873	0.138

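The ablation separates the contributions. SGHMC gives the largest gain (with positive-only Student-t components, PSNR rises from 23.80 under SGD to 24.53). Under SGHMC, swapping Gaussians for positive-only Student-t components changes the metrics only marginally (24.52 vs. 24.53 PSNR, with slightly worse SSIM and LPIPS). Adding negative components then provides a further boost (24.53 to 24.87 PSNR, LPIPS 0.155 to 0.138).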

Highlights & Insights¶

Unification of three innovations: Simultaneous improvements in distribution, density space, and optimization, which mutually reinforce each other—the Student-t distribution is highly flexible but suffers from severe parameter coupling $\to$ SGHMC decouples it $\to$ negative components further enhance performance $\to$ SGHMC handles optimization of negative components.
Mathematical rigor: Closed-form derivations for the projection, integration, and recycling of the Student-t distribution are provided, complete with rigorous mathematical proofs.
Impressive parameter efficiency: Component count is reduced by up to 82% for matching quality, and in certain scenes, it requires less than 2% of the original 3DGS components.
Intuitive explanation of heavy tails: The Cauchy end of the Student-t distribution can cover large homogeneous areas (such as the sky) with a single component, while the Gaussian end precisely fits fine details—switching adaptively across the family.
Torus experiment: Visually illustrates how negative components "scoop out" holes, showing that 1 positive + 1 negative component provides the topological expressiveness of 5 positive-only components.
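
The heavy-tail and scooping points above can be checked numerically with the kernel formula from the Key Designs section. This is a simplified sketch, not the paper's renderer: it sums signed kernel values as a density field instead of alpha-compositing colors, and the helper names (`student_t_kernel`, `signed_opacity`, `ring_density`), covariances, and raw opacity values are illustrative assumptions.

```python
import numpy as np

def student_t_kernel(x, mu, cov, nu):
    """Unnormalized 3D Student-t kernel [1 + d^2 / nu]^(-(nu + 3) / 2)."""
    diff = np.atleast_2d(x) - mu
    # Squared Mahalanobis distance for each row of x.
    d2 = np.einsum("ni,ij,nj->n", diff, np.linalg.inv(cov), diff)
    return (1.0 + d2 / nu) ** (-(nu + 3.0) / 2.0)

def signed_opacity(raw):
    """tanh keeps opacity in [-1, 1]; negative values scoop."""
    return np.tanh(raw)

def ring_density(points, nu=50.0):
    """One broad positive component minus one narrow negative component."""
    mu = np.zeros(3)
    o_pos, o_neg = signed_opacity(2.0), signed_opacity(-2.0)
    broad = student_t_kernel(points, mu, 4.0 * np.eye(3), nu)
    narrow = student_t_kernel(points, mu, 0.25 * np.eye(3), nu)
    return o_pos * broad + o_neg * narrow

# Density at the center and at radius 2 along the x-axis.
pts = np.array([[0.0, 0.0, 0.0], [2.0, 0.0, 0.0]])
density = ring_density(pts)

# Tail comparison at Mahalanobis distance 5: nu = 1 versus a Gaussian.
far = np.array([[5.0, 0.0, 0.0]])
tail_cauchy_like = student_t_kernel(far, np.zeros(3), np.eye(3), 1.0)[0]
tail_gaussian = np.exp(-0.5 * 25.0)
```

In this toy setup the two components cancel at the center and leave a positive shell around it, which is the hole-with-ring profile that positive-only kernels need several components to approximate. The tail comparison shows the $\nu = 1$ kernel retaining far more mass at large distances than a Gaussian with the same covariance.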

Limitations & Future Work¶

Since the Student-t distribution is still smooth and symmetric, representing sharp, irregular shapes remains challenging.
SGHMC introduces multiple hyperparameters (friction coefficient, noise scheduling, and negative component ratio) that demand tuning.
Training takes roughly twice as long as 3DGS (45 min vs. 21 min on Mip-NeRF360).
Floating artifacts (a common issue in 3DGS) persist.
Negative components require careful initialization—random initialization can lead to instability. The authors employ a threshold-based strategy that flips low-opacity positive components to negative ones.
Currently restricted to static scenes. Optimizing temporal consistency for the Student-t distribution parameter $\nu$ in dynamic environments remains an open problem.
While memory overhead is largely on par with 3DGS (with only 1 additional scalar $\nu$ per component), SGHMC requires additional storage for momentum terms.

Related Work & Insights¶

GES: Also changes the distribution (Generalized Exponential) but relies on approximations instead of altering the rasterizer. SSS has precise closed-form solutions and vastly outperforms GES.
3DHGS: Employs semi-Gaussians primarily to improve boundaries. SSS offers a more general solution by learning continuous tail thickness.
3DGS-MCMC: First to introduce SGLD sampling. SSS upgrades this formulation to SGHMC combined with Student-t distributions and negative components, delivering comprehensive improvements.
Inspirations & Connections:
The concept of "learning across a distribution family" via the Student-t distribution can be extended to other kernel-based representations (e.g., SPH, KDE).
The design paradigm of negative density/subtraction can be generalized to other neural rendering tasks.
While 3DHGS focuses on boundary discontinuities, SSS focuses on distribution family flexibility; the two are complementary improvement directions.
The component recycling mechanism (based on minimizing rendering differences) can be applied to other adaptive density control scenarios.

Rating¶

Novelty: ⭐⭐⭐⭐⭐ Trifold innovations unified in a single framework, featuring mathematically deep derivations and an unprecedented combination of negative densities, Student-t distributions, and SGHMC.
Experimental Thoroughness: ⭐⭐⭐⭐⭐ Standard evaluation across 3 datasets, 11 scenes, and 3 metrics, coupled with comparison curves over 165 different configuration counts, ablations, visualizations, and sampling effect analyses.
Writing Quality: ⭐⭐⭐⭐ Rigorous and complete mathematical proofs, though the manuscript is relatively dense and lengthy.
Value: ⭐⭐⭐⭐⭐ A fundamental advancement of the 3DGS paradigm. Code is open-sourced, serving as a powerful, drop-in replacement for the standard 3DGS base components.