SVG-Head: Hybrid Surface-Volumetric Gaussians for High-Fidelity Head Reconstruction and Real-Time Editing
Conference: ICCV 2025
arXiv: 2508.09597
Code: Project Page
Area: 3D Vision
Keywords: head reconstruction, 3D Gaussian, texture editing, FLAME mesh, surface-volumetric hybrid
TL;DR
SVG-Head is a hybrid representation that combines surface Gaussians, whose appearance comes from an explicit texture map, with volumetric Gaussians that model the non-Lambertian regions the surface cannot capture. The authors present it as the first method to support real-time appearance editing of high-fidelity Gaussian head avatars.
Background & Motivation
Creating high-fidelity and editable head avatars faces fundamental challenges:
- NeRF implicit representations: inherently difficult to edit.
- 3DGS color binding: each Gaussian stores color independently, lacking global appearance disentanglement, which precludes real-time texture editing.
- Limitations of prior editing methods: GaussianAvatar-Editor requires text-to-image model assistance; MeGA requires minutes of optimization per edit.
The key insight of SVG-Head is to disentangle global appearance into a learnable texture map via surface Gaussians, enabling real-time editing.
Method
Surface Gaussians (surf-GS)
- Constrained to lie on FLAME mesh faces, localized via barycentric coordinates.
- Mesh-aware Gaussian UV mapping: projects 3D positions into FLAME UV space.
  - Projects ray–Gaussian intersection points onto the associated triangular face.
  - UV coordinates are obtained by barycentric interpolation, simplified to a single affine transformation: \(\phi(I(\mathbf{r}_p, \mathcal{G}_i)) = \phi(\boldsymbol{\mu}_i) + T(\boldsymbol{\mu}_i)(I(\mathbf{r}_p, \mathcal{G}_i) - \boldsymbol{\mu}_i)\), where \(I(\mathbf{r}_p, \mathcal{G}_i)\) is the intersection of pixel ray \(\mathbf{r}_p\) with Gaussian \(\mathcal{G}_i\), \(\phi\) is the 3D-to-UV map, and \(T(\boldsymbol{\mu}_i)\) is its linear part at the Gaussian center.
- Consistent UV coordinates: Gaussian centers are constrained to mesh faces via \(\boldsymbol{\mu}_i = \xi_A \mathbf{v}_A + \xi_B \mathbf{v}_B + \xi_C \mathbf{v}_C\), with rotation aligned to the surface normal, ensuring a unique UV per pixel.
- Dynamic texture: \(\mathcal{T} = \mathcal{T}_{\text{diff}} + \mathcal{T}_{\text{dy}}\), where the dynamic component is generated by a convolutional network conditioned on expression parameters \(\psi\).
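The barycentric placement and the affine UV map above can be illustrated with a small NumPy sketch. It assumes a single triangle with known 3D vertices and per-vertex UVs, and it builds \(T\) from the triangle edges via a pseudo-inverse; this is one way to obtain such a matrix, not necessarily how the paper computes \(T(\boldsymbol{\mu}_i)\). The function names `barycentric_point`, `uv_jacobian`, and `map_to_uv` are hypothetical.

```python
import numpy as np

def barycentric_point(xi, values):
    """Interpolate per-vertex values (3 x D) with barycentric weights xi (3,)."""
    return xi @ values

def uv_jacobian(verts, uvs):
    """2x3 matrix T mapping in-plane 3D offsets to UV offsets (assumed construction)."""
    E = np.stack([verts[1] - verts[0], verts[2] - verts[0]], axis=1)  # 3x2 edge matrix
    U = np.stack([uvs[1] - uvs[0], uvs[2] - uvs[0]], axis=1)          # 2x2 UV edge matrix
    return U @ np.linalg.pinv(E)

def map_to_uv(x, mu, uv_mu, T):
    """phi(x) = phi(mu) + T (x - mu) for a point x on the triangle plane."""
    return uv_mu + T @ (x - mu)

# Toy triangle and UVs (illustrative values only).
verts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
uvs = np.array([[0.1, 0.1], [0.5, 0.1], [0.1, 0.9]])
xi = np.array([0.2, 0.3, 0.5])          # barycentric coordinates of the Gaussian center

mu = barycentric_point(xi, verts)       # Gaussian center on the face
uv_mu = barycentric_point(xi, uvs)      # its UV coordinate
T = uv_jacobian(verts, uvs)
uv_at_B = map_to_uv(verts[1], mu, uv_mu, T)
uv_at_C = map_to_uv(verts[2], mu, uv_mu, T)
```

Because barycentric interpolation is affine on the triangle plane, the affine form reproduces the vertex UVs exactly in this toy case, which is the property the simplification relies on.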
Volumetric Gaussians (vol-GS)
Surface Gaussians are constrained to the FLAME surface and are therefore insufficient for modeling non-Lambertian regions such as lips and hair. Volumetric Gaussians allow:
- Free movement in the vicinity of the mesh.
- Independent color storage (not sampled from the texture map).
- Retention of FLAME binding to support animation.
Differentiable Hybrid Rendering
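Color computation is unified across both Gaussian types: a surface Gaussian takes its color from the texture map at the mapped UV coordinate, while a volumetric Gaussian uses its own stored color, and both kinds are rendered together.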
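A per-pixel sketch of what such a unified color rule might look like is given below. It assumes standard front-to-back alpha blending as used in 3DGS, bilinear texture lookup, and a hypothetical dictionary layout for each per-pixel contribution (`kind`, `depth`, `alpha`, plus `uv` or `color`); none of these details are confirmed by the summary above.

```python
import numpy as np

def sample_texture(texture, uv):
    """Bilinear lookup in an H x W x 3 texture; uv is clipped to [0, 1]^2."""
    h, w, _ = texture.shape
    u, v = np.clip(uv, 0.0, 1.0)
    x, y = u * (w - 1), v * (h - 1)
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * texture[y0, x0] + fx * texture[y0, x1]
    bottom = (1 - fx) * texture[y1, x0] + fx * texture[y1, x1]
    return (1 - fy) * top + fy * bottom

def composite_pixel(contributions, texture):
    """Front-to-back alpha blending over depth-sorted contributions."""
    color = np.zeros(3)
    transmittance = 1.0
    for g in sorted(contributions, key=lambda g: g["depth"]):
        if g["kind"] == "surf":
            c = sample_texture(texture, g["uv"])   # color comes from the texture map
        else:
            c = g["color"]                         # vol-GS stores its own color
        color += transmittance * g["alpha"] * c
        transmittance *= 1.0 - g["alpha"]
    return color, 1.0 - transmittance

# Toy pixel: one surface Gaussian in front of one volumetric Gaussian.
texture = np.zeros((2, 2, 3))
texture[..., 0] = 1.0                               # all-red texture
contributions = [
    {"kind": "vol", "depth": 2.0, "alpha": 1.0, "color": np.array([0.0, 0.0, 1.0])},
    {"kind": "surf", "depth": 1.0, "alpha": 0.5, "uv": np.array([0.5, 0.5])},
]
pixel, acc_alpha = composite_pixel(contributions, texture)

# Editing the texture changes the rendered pixel without touching any Gaussian.
edited_texture = np.zeros((2, 2, 3))
edited_texture[..., 1] = 1.0                        # all-green texture
edited_pixel, _ = composite_pixel(contributions, edited_texture)
```

In this toy case only the surface contribution responds to the texture change, which mirrors the division of labor between surf-GS and vol-GS described above.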
Hierarchical Optimization Strategy
Stage 1: Optimize surface Gaussians and FLAME parameters only.
- Photometric loss \(\mathcal{L}_{\text{rgb}}\) (L1 + D-SSIM)
- Diffuse photometric loss \(\mathcal{L}_{\text{rgb}}^{\text{diff}}\) (texture map regularization)
- Scale loss \(\mathcal{L}_{\text{scale}}\)

Stage 2: Joint optimization of both Gaussian types.
- Most surf-GS parameters are frozen (only \(o^{(s)}\) and \(\mathcal{T}_{\text{dy}}\) are optimized).
- Position regularization \(\mathcal{L}_{\text{pos}}\) and alpha regularization \(\mathcal{L}_a = \|\mathbf{A}_s - 1\|_2^2\) are added.
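To make the alpha regularization concrete, the sketch below evaluates \(\mathcal{L}_a\) next to an L1 photometric term on toy arrays. It assumes \(\mathbf{A}_s\) is the accumulated opacity map rendered from the surface Gaussians, sums the squared error over all pixels, omits the D-SSIM, scale, and position terms, and uses a made-up weight `lambda_a`; the actual weights and any masking are not given in this summary.

```python
import numpy as np

def l1_loss(pred, target):
    """Mean absolute error between rendered and ground-truth images."""
    return float(np.mean(np.abs(pred - target)))

def alpha_regularizer(surf_alpha):
    """L_a = ||A_s - 1||_2^2, pushing the surface opacity map toward 1."""
    return float(np.sum((surf_alpha - 1.0) ** 2))

def stage2_loss(pred, target, surf_alpha, lambda_a=0.1):
    """Partial stage-2 objective; lambda_a is a hypothetical weight."""
    return l1_loss(pred, target) + lambda_a * alpha_regularizer(surf_alpha)

# Toy 1x2 "images" (illustrative values only).
pred = np.array([[0.2, 0.4]])
target = np.array([[0.0, 0.4]])
surf_alpha = np.array([[0.5, 1.0]])

photometric = l1_loss(pred, target)
alpha_term = alpha_regularizer(surf_alpha)
total = stage2_loss(pred, target, surf_alpha)
```

A term of this shape discourages the surface layer from becoming transparent in stage 2, so the volumetric Gaussians supplement the surface rather than replace it.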
Key Experimental Results
NeRSemble Dataset
| Method | Editable | NVS-PSNR↑ | NVS-SSIM↑ | NVS-LPIPS↓ |
|---|---|---|---|---|
| PointAvatar | ✗ | 25.8 | 0.893 | 0.097 |
| Gaussian Head Avatar | ✗ | 29.5 | 0.894 | 0.084 |
| GaussianAvatars | ✗ | 31.6 | 0.938 | 0.065 |
| MeGA | Slow | 32.0 | 0.940 | 0.062 |
| SVG-Head | Real-time | 31.8 | 0.939 | 0.063 |
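SVG-Head comes within 0.2 dB PSNR of MeGA, the other editable method (31.8 vs. 32.0; SSIM 0.939 vs. 0.940; LPIPS 0.063 vs. 0.062), and slightly outperforms the non-editable GaussianAvatars on all three metrics, while being the only method in the table that supports real-time editing.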
Editing Capability Comparison
| Method | Editing Paradigm | Time |
|---|---|---|
| MeGA | Optimization | Minutes to hours |
| GaussianAvatar-Editor | Text-to-image | Non-real-time |
| SVG-Head | Direct texture editing | Real-time |
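Direct texture editing, as listed in the table, amounts to modifying the texture image itself and re-rendering. A minimal sketch of such an edit follows, assuming the texture is a plain H x W x 3 float array and the edit is an alpha-blended decal; `paste_decal` and its parameters are hypothetical and not taken from the paper.

```python
import numpy as np

def paste_decal(texture, decal_rgb, decal_alpha, top, left):
    """Alpha-blend a decal into a copy of the texture at (top, left)."""
    out = texture.copy()
    h, w = decal_alpha.shape
    region = out[top:top + h, left:left + w]
    a = decal_alpha[..., None]                     # broadcast alpha over RGB
    out[top:top + h, left:left + w] = (1.0 - a) * region + a * decal_rgb
    return out

# Toy 8x8 grey texture and an opaque 2x2 red decal.
texture = np.full((8, 8, 3), 0.5)
decal_rgb = np.zeros((2, 2, 3))
decal_rgb[..., 0] = 1.0
decal_alpha = np.ones((2, 2))

edited = paste_decal(texture, decal_rgb, decal_alpha, top=3, left=3)
```

An edit like this is a single array operation with no optimization loop, which is why this paradigm can run in real time, in contrast to the optimization-based and text-to-image paradigms in the table.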
Highlights & Insights
- First Gaussian head avatar with explicit texture maps: enables real-time appearance editing of 3DGS-based head avatars.
- UV consistency guarantee: constraining Gaussians to mesh faces combined with normal-aligned rotation resolves texture blurring.
- Complementary design: surf-GS handles global appearance editing; vol-GS supplements modeling of complex regions.
- Hierarchical optimization avoids coupling: surf-GS is first optimized independently to obtain clean textures, followed by joint optimization.
Limitations & Future Work
- Texture resolution is bounded by the size of the UV map.
- The volumetric Gaussian component does not support texture-based editing.
- Performance depends on the quality of FLAME tracking.
- Dynamic texture generation for extreme expressions may be insufficiently precise.
Related Work & Insights
- GaussianAvatars: FLAME-bound 3DGS.
- MeGA: mesh–Gaussian hybrid with slow editing.
- Texture-GS: textured Gaussians for static scenes.
Rating
- Novelty: ⭐⭐⭐⭐⭐ (first Gaussian head avatar supporting real-time texture editing)
- Technical Depth: ⭐⭐⭐⭐⭐ (UV mapping + hybrid rendering + hierarchical optimization)
- Experimental Thoroughness: ⭐⭐⭐⭐ (reconstruction + editing + ablation)
- Value: ⭐⭐⭐⭐⭐ (real-time editing is a killer feature)