SVG-Head: Hybrid Surface-Volumetric Gaussians for High-Fidelity Head Reconstruction and Real-Time Editing

Conference: ICCV 2025
arXiv: 2508.09597
Code: Project Page
Area: 3D Vision
Keywords: head reconstruction, 3D Gaussian, texture editing, FLAME mesh, surface-volumetric hybrid

TL;DR

SVG-Head is a hybrid representation that combines surface Gaussians (with explicit texture maps) and volumetric Gaussians (which supplement the modeling of non-Lambertian regions). It is the first method to achieve real-time appearance editing of high-fidelity Gaussian head avatars.

Background & Motivation

Creating high-fidelity and editable head avatars faces fundamental challenges:

NeRF implicit representations: inherently difficult to edit.

3DGS color binding: each Gaussian stores color independently, lacking global appearance disentanglement, which precludes real-time texture editing.

Limitations of prior editing methods: GaussianAvatar-Editor requires text-to-image model assistance; MeGA requires minutes of optimization per edit.

The key insight of SVG-Head is to disentangle global appearance into a learnable texture map via surface Gaussians, enabling real-time editing.

Method

Surface Gaussians (surf-GS)

  • Constrained to lie on FLAME mesh faces, localized via barycentric coordinates.
  • Mesh-aware Gaussian UV mapping: projects 3D positions into FLAME UV space.

    • Projects ray–Gaussian intersection points onto the associated triangular face.
    • UV coordinates are obtained by barycentric interpolation, simplified to a single affine transformation: \(\phi(I(\mathbf{r}_p, \mathcal{G}_i)) = \phi(\boldsymbol{\mu}_i) + T(\boldsymbol{\mu}_i)(I(\mathbf{r}_p, \mathcal{G}_i) - \boldsymbol{\mu}_i)\)
  • Consistent UV coordinates: Gaussian centers are constrained to mesh faces via \(\boldsymbol{\mu}_i = \xi_A \mathbf{v}_A + \xi_B \mathbf{v}_B + \xi_C \mathbf{v}_C\), with rotation aligned to the surface normal, ensuring a unique UV per pixel.

  • Dynamic texture: \(\mathcal{T} = \mathcal{T}_{\text{diff}} + \mathcal{T}_{\text{dy}}\), where the dynamic component is generated by a convolutional network conditioned on expression parameters \(\psi\).
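The affine UV mapping above can be illustrated for a single triangle. The following is a minimal NumPy sketch, not the authors' implementation: the function names (`gaussian_center`, `uv_affine`, `uv_at`) and the toy triangle are assumptions made for illustration, and the matrix \(T\) is built here from the pseudo-inverse of the triangle's edge matrix, which is one straightforward way to obtain an affine map that agrees with barycentric interpolation for points in the face plane.

```python
import numpy as np

def gaussian_center(bary, verts):
    """mu = xi_A * v_A + xi_B * v_B + xi_C * v_C for one triangle (verts: 3x3)."""
    return bary @ verts

def uv_affine(verts, uvs):
    """Return a 2x3 matrix T mapping in-plane 3D offsets to UV offsets.

    Assumption: T is derived from the triangle edges; it is constant per face.
    """
    E = np.stack([verts[1] - verts[0], verts[2] - verts[0]], axis=1)  # 3x2 edge matrix
    U = np.stack([uvs[1] - uvs[0], uvs[2] - uvs[0]], axis=1)          # 2x2 UV edge matrix
    return U @ np.linalg.pinv(E)

def uv_at(point, mu, uv_mu, T):
    """phi(point) = phi(mu) + T (point - mu)."""
    return uv_mu + T @ (point - mu)

# Toy triangle (hypothetical values) with per-vertex UVs.
verts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.5]])
uvs = np.array([[0.1, 0.1], [0.6, 0.2], [0.2, 0.7]])

bary_mu = np.array([0.2, 0.3, 0.5])       # barycentric coordinates of the Gaussian center
mu = gaussian_center(bary_mu, verts)
uv_mu = bary_mu @ uvs                     # phi(mu) via barycentric interpolation
T = uv_affine(verts, uvs)

bary_hit = np.array([0.5, 0.25, 0.25])    # a ray hit projected onto the same face
hit = bary_hit @ verts
uv_affine_result = uv_at(hit, mu, uv_mu, T)
uv_bary_result = bary_hit @ uvs           # reference: direct barycentric interpolation
print(np.allclose(uv_affine_result, uv_bary_result))
```

Because the offset between two points on the same face is a combination of the two edge vectors, the affine form and direct barycentric interpolation give the same UV, which is why a single matrix per face suffices in this sketch.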

Volumetric Gaussians (vol-GS)

Surface Gaussians are constrained to the FLAME surface and thus insufficient for modeling non-Lambertian regions such as lips and hair. Volumetric Gaussians allow:

  • Free movement in the vicinity of the mesh.
  • Independent color storage (not sampled from the texture map).
  • Retention of FLAME binding to support animation.

Differentiable Hybrid Rendering

Color computation is unified across both Gaussian types:

\[\mathcal{C}(\mathcal{G}_i, \mathbf{r}_p) = \begin{cases} \mathbf{c}_i^{\text{SH}} & \text{if } v_i = 1 \text{ (vol-GS)} \\ h\big(\phi(I(\mathbf{r}_p, \mathcal{G}_i)),\ \mathcal{T}_{\text{diff}} + \mathcal{T}_{\text{dy}}\big) + \mathbf{c}_i^{\text{SH}_{\text{res}}} & \text{if } v_i = 0 \text{ (surf-GS)} \end{cases}\]
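The two branches of this color rule can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions rather than the paper's renderer: it assumes that \(h\) denotes a texture lookup at a UV coordinate (implemented here as bilinear sampling), that spherical-harmonic colors have already been evaluated to RGB vectors, and the names `bilinear_sample` and `gaussian_color` are hypothetical.

```python
import numpy as np

def bilinear_sample(texture, uv):
    """Sample an (H, W, 3) texture at uv in [0, 1]^2 with bilinear filtering."""
    h, w, _ = texture.shape
    x = float(np.clip(uv[0], 0.0, 1.0)) * (w - 1)
    y = float(np.clip(uv[1], 0.0, 1.0)) * (h - 1)
    x0, y0 = int(np.floor(x)), int(np.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    fx, fy = x - x0, y - y0
    top = (1 - fx) * texture[y0, x0] + fx * texture[y0, x1]
    bottom = (1 - fx) * texture[y1, x0] + fx * texture[y1, x1]
    return (1 - fy) * top + fy * bottom

def gaussian_color(v, c_sh, uv=None, tex_diff=None, tex_dy=None):
    """Hybrid color rule: v = 1 for vol-GS, v = 0 for surf-GS.

    For vol-GS, c_sh is the Gaussian's own color.
    For surf-GS, c_sh is treated as the residual added to the sampled texture.
    """
    if v == 1:
        return c_sh
    return bilinear_sample(tex_diff + tex_dy, uv) + c_sh

# Hypothetical 2x2 textures: a diffuse map plus a small dynamic offset.
tex_diff = np.array([[[0.8, 0.2, 0.2], [0.2, 0.8, 0.2]],
                     [[0.2, 0.2, 0.8], [0.6, 0.6, 0.6]]])
tex_dy = np.full((2, 2, 3), 0.05)

vol_color = gaussian_color(1, np.array([0.3, 0.4, 0.5]))
surf_color = gaussian_color(0, np.array([0.01, 0.01, 0.01]),
                            uv=(0.0, 0.0), tex_diff=tex_diff, tex_dy=tex_dy)
print(vol_color, surf_color)
```

In this sketch only the surf-GS branch reads the texture, so a change to the texture map affects surface Gaussians immediately while volumetric Gaussians keep their stored color.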

Hierarchical Optimization Strategy

Stage 1: Optimize surface Gaussians and FLAME parameters only.

  • Photometric loss \(\mathcal{L}_{\text{rgb}}\) (L1 + D-SSIM)
  • Diffuse photometric loss \(\mathcal{L}_{\text{rgb}}^{\text{diff}}\) (texture map regularization)
  • Scale loss \(\mathcal{L}_{\text{scale}}\)

Stage 2: Joint optimization of both Gaussian types.

  • Most surf-GS parameters are frozen (only \(o^{(s)}\) and \(\mathcal{T}_{\text{dy}}\) are optimized).
  • Position regularization \(\mathcal{L}_{\text{pos}}\) and alpha regularization \(\mathcal{L}_a = \|\mathbf{A_s} - 1\|_2^2\) are added.
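As a small worked example of the alpha regularization term, the sketch below evaluates \(\mathcal{L}_a = \|\mathbf{A_s} - 1\|_2^2\) on a toy array. It assumes that \(\mathbf{A_s}\) is an accumulated-opacity map rendered from the surface Gaussians, which this summary does not define explicitly; the function name `alpha_regularization` and the input values are hypothetical.

```python
import numpy as np

def alpha_regularization(accumulated_alpha):
    """Squared L2 distance between the accumulated alpha map and 1.

    Assumption: accumulated_alpha holds per-pixel opacity accumulated
    over surface Gaussians only.
    """
    return float(np.sum((accumulated_alpha - 1.0) ** 2))

opaque = np.ones((2, 2))                       # fully opaque surface layer
leaky = np.array([[1.0, 0.5], [0.5, 1.0]])     # two half-transparent pixels

loss_opaque = alpha_regularization(opaque)     # 0.0
loss_leaky = alpha_regularization(leaky)       # 0.25 + 0.25 = 0.5
print(loss_opaque, loss_leaky)
```

Under this reading, the term is zero only when the surface layer is fully opaque, and it grows as surface opacity drops.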

Key Experimental Results

NeRSemble Dataset

| Method | Editable | NVS-PSNR↑ | NVS-SSIM↑ | NVS-LPIPS↓ |
|---|---|---|---|---|
| PointAvatar | - | 25.8 | 0.893 | 0.097 |
| Gaussian Head Avatar | - | 29.5 | 0.894 | 0.084 |
| GaussianAvatars | - | 31.6 | 0.938 | 0.065 |
| MeGA | Slow | 32.0 | 0.940 | 0.062 |
| SVG-Head | Real-time | 31.8 | 0.939 | 0.063 |

SVG-Head is on par with MeGA, the strongest baseline (within 0.2 dB PSNR, 0.001 SSIM, and 0.001 LPIPS), and outperforms the remaining methods on all three metrics, while being the only method that supports real-time editing.

Editing Capability Comparison

| Method | Editing Paradigm | Time |
|---|---|---|
| MeGA | Optimization | Minutes to hours |
| GaussianAvatar-Editor | Text-to-image | Non-real-time |
| SVG-Head | Direct texture editing | Real-time |
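To make "direct texture editing" concrete, the sketch below alpha-blends a small decal into a texture map with plain array operations, with no optimization loop. It is an illustration only: it assumes the edit is applied to a diffuse texture stored as an (H, W, 3) array, and `paste_decal` and all values are hypothetical rather than part of the SVG-Head code.

```python
import numpy as np

def paste_decal(texture, decal_rgb, decal_alpha, top, left):
    """Alpha-blend a decal into a copy of an (H, W, 3) texture map."""
    edited = texture.copy()
    h, w = decal_alpha.shape
    region = edited[top:top + h, left:left + w]
    a = decal_alpha[..., None]                      # broadcast alpha over RGB
    edited[top:top + h, left:left + w] = a * decal_rgb + (1.0 - a) * region
    return edited

texture = np.full((8, 8, 3), 0.5)                   # uniform grey texture
decal_rgb = np.zeros((2, 2, 3))
decal_rgb[..., 0] = 1.0                             # pure red decal
decal_alpha = np.ones((2, 2))                       # fully opaque

edited = paste_decal(texture, decal_rgb, decal_alpha, top=2, left=3)
print(edited[2, 3], edited[0, 0])
```

Since surface Gaussians look up their color in the texture at render time, an edit of this kind would take effect on the next rendered frame under the assumptions above.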

Highlights & Insights

  1. First Gaussian head avatar with explicit texture maps: enables real-time appearance editing of 3DGS-based head avatars.
  2. UV consistency guarantee: constraining Gaussians to mesh faces combined with normal-aligned rotation resolves texture blurring.
  3. Complementary design: surf-GS handles global appearance editing; vol-GS supplements modeling of complex regions.
  4. Hierarchical optimization avoids coupling: surf-GS is first optimized independently to obtain clean textures, followed by joint optimization.

Limitations & Future Work

  • Texture resolution is bounded by the size of the UV map.
  • The volumetric Gaussian component does not support texture-based editing.
  • Performance depends on the quality of FLAME tracking.
  • Dynamic texture generation for extreme expressions may be insufficiently precise.

Related Work

  • GaussianAvatars: FLAME-bound 3DGS.
  • MeGA: mesh–Gaussian hybrid with slow editing.
  • Texture-GS: textured Gaussians for static scenes.

Rating

  • Novelty: ⭐⭐⭐⭐⭐ (first Gaussian head avatar supporting real-time texture editing)
  • Technical Depth: ⭐⭐⭐⭐⭐ (UV mapping + hybrid rendering + hierarchical optimization)
  • Experimental Thoroughness: ⭐⭐⭐⭐ (reconstruction + editing + ablation)
  • Value: ⭐⭐⭐⭐⭐ (real-time editing is a killer feature)