Geometry Distributions¶

Conference: ICCV 2025
arXiv: 2411.16076
Code: Not released
Area: 3D Vision
Keywords: Geometry representation, diffusion models, surface point distributions, neural compression, non-watertight meshes

TL;DR¶

This paper proposes Geometry Distributions (GeomDist), which models 3D geometry as a probability distribution over surface points and learns it via a diffusion model. Without assuming genus, connectivity, or boundary conditions, the method samples arbitrarily many surface points from Gaussian noise to represent geometry of arbitrary topology.

Background & Motivation¶

Existing 3D geometry representations each suffer from inherent limitations:

- Meshes: Inconsistent data structures, ill-suited for learning
- Voxels: Memory-intensive, demanding high resolution
- Point clouds: Finite sampling, lacking connectivity information
- SDFs: Unable to represent thin structures and non-watertight geometry

The core insight is twofold: any surface can be approximated by a sufficiently large set of points sampled on it, and a generative model can, in principle, draw an unlimited number of samples from the distribution it has learned. Accordingly, geometry is modeled as a distribution \(\Phi_{\mathcal{M}}\) over surface points such that \(\mathbf{x} \sim \Phi_{\mathcal{M}} \Rightarrow \mathbf{x} \in \mathcal{M}\).

Method¶

Problem Formulation¶

Given a surface \(\mathcal{M} \subset \mathbb{R}^3\), the goal is to learn a mapping \(\mathcal{E}\) from a Gaussian distribution to the distribution of surface points. The mapping is parameterized by a diffusion model (denoiser) \(D_\theta(\mathbf{x}, t)\), and samples are transported along the ODE:

\[\mathrm{d}\mathbf{x} = \frac{\mathbf{x} - D_\theta(\mathbf{x}, t)}{t} \mathrm{d}t\]

Forward Sampling (Gaussian → Surface)¶

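Starting from scaled Gaussian noise \(\mathbf{x}_0 = T\mathbf{n}\) with \(\mathbf{n} \sim \mathcal{N}(\mathbf{0}, \mathbf{I})\), points are updated with Euler steps of the ODE along a decreasing sequence of noise levels \(t_0 = T > t_1 > \dots > t_N\):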

\[\mathbf{x}_{i+1} = \mathbf{x}_i + (t_{i+1} - t_i) \cdot \frac{\mathbf{x}_i - D_\theta(\mathbf{x}_i, t_i)}{t_i}\]

The endpoint \(\mathbf{x}_N\) lies on the target surface, up to the error of the trained network and the ODE solver. Because any number of Gaussian points can be drawn, the surface can be sampled as densely as needed.
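The update rule above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' sampler: `toy_denoiser` is a hypothetical stand-in for a trained \(D_\theta\) that simply projects onto the unit sphere, and the linear schedule ending at \(t_N = 0\), the value `T = 80.0`, and the step count are all assumptions chosen for the example.

```python
import numpy as np

def toy_denoiser(x, t):
    """Hypothetical stand-in for D_theta: project onto the unit sphere (ignores t)."""
    return x / np.linalg.norm(x, axis=1, keepdims=True)

def forward_sample(denoiser, n_points, T=80.0, n_steps=64, seed=0):
    """Euler integration of dx = (x - D(x, t)) / t dt from t = T down to t = 0."""
    rng = np.random.default_rng(seed)
    # Assumed schedule: t_0 = T > t_1 > ... > t_N = 0 (linear spacing for simplicity).
    t = np.linspace(T, 0.0, n_steps + 1)
    x = T * rng.standard_normal((n_points, 3))  # x_0 = T * n
    for i in range(n_steps):
        # t[i] is never zero here; only t[i + 1] reaches 0 on the last step.
        x = x + (t[i + 1] - t[i]) * (x - denoiser(x, t[i])) / t[i]
    return x

points = forward_sample(toy_denoiser, n_points=2**15)
radii = np.linalg.norm(points, axis=1)
```

With this toy denoiser the last Euler step lands on the denoiser output, so every sampled point ends up on the sphere. Changing `n_points` changes only how densely the same surface is covered.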

Inverse Sampling (Surface → Gaussian)¶

Starting from a surface point, the trajectory is traversed in reverse to map back to noise space:

\[\mathbf{x}_{i-1} = \mathbf{x}_i + (t_{i-1} - t_i) \cdot \frac{\mathbf{x}_i - D_\theta(\mathbf{x}_i, t_i)}{t_i}\]

This establishes a bijective correspondence between surface points and noise space.
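A small numerical round trip can make the correspondence concrete. The sketch below is illustrative only: `toy_denoiser` is the closed-form denoiser for Gaussian data with standard deviation `S` (a hypothetical stand-in, not a surface and not a trained network), and the schedule stops at a small positive noise level so that the inverse update never divides by zero. Both choices are assumptions made for the example.

```python
import numpy as np

S = 0.5  # std of the toy Gaussian "data" distribution (hypothetical)

def toy_denoiser(x, t):
    """Exact denoiser for data ~ N(0, S^2 I); stands in for a trained D_theta."""
    return x * S**2 / (S**2 + t**2)

def euler_path(denoiser, x, t):
    """Apply x <- x + (t_next - t_cur) * (x - D(x, t_cur)) / t_cur along schedule t."""
    for t_cur, t_next in zip(t[:-1], t[1:]):
        x = x + (t_next - t_cur) * (x - denoiser(x, t_cur)) / t_cur
    return x

# Assumed schedule: from T = 10 down to a small positive final noise level.
t = np.linspace(10.0, 0.01, 2001)
rng = np.random.default_rng(0)

noise = t[0] * rng.standard_normal((1024, 3))            # x_0 = T * n
data = euler_path(toy_denoiser, noise, t)                # forward: noise -> data
recovered = euler_path(toy_denoiser, data, t[::-1])      # inverse: data -> noise

rel_err = np.linalg.norm(recovered - noise) / np.linalg.norm(noise)
```

With explicit Euler steps the round trip is only approximately exact. The mismatch `rel_err` shrinks as the number of steps grows, which is consistent with the one-to-one mapping holding for the continuous ODE.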

Training¶

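A key design choice is to draw a fresh set of \(2^{25}\) surface points at every epoch rather than reusing a fixed point cloud. Over 1000 epochs, the network therefore observes enough distinct surface points to approximate sampling from the continuous surface. The denoiser is trained with the objective: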

\[\arg\min_\theta \mathbb{E}_{\mathbf{x} \in \mathcal{M}} \mathbb{E}_{\mathbf{n} \sim \mathcal{N}} \mathbb{E}_{\sigma > 0} \|D_\theta(\mathbf{x} + \sigma\mathbf{n}, \sigma) - \mathbf{x}\|\]
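The following sketch shows how the per-epoch re-sampling and the objective above could fit together. It is an illustration under stated assumptions, not the paper's training code: area-weighted triangle sampling is a standard way to draw surface points, the log-normal distribution over \(\sigma\) (with `p_mean` and `p_std`) is borrowed from EDM-style training, and the two denoisers are hand-written placeholders where a trained network would go. The loss follows the formula as written (plain Euclidean norm, no per-\(\sigma\) weighting).

```python
import numpy as np

def sample_surface(vertices, faces, n, rng):
    """Draw n points uniformly by area from a triangle mesh (may be open)."""
    tri = vertices[faces]                                   # (F, 3, 3)
    area = 0.5 * np.linalg.norm(
        np.cross(tri[:, 1] - tri[:, 0], tri[:, 2] - tri[:, 0]), axis=1)
    idx = rng.choice(len(faces), size=n, p=area / area.sum())
    u, v = rng.random((2, n))
    flip = u + v > 1.0                                      # reflect into the triangle
    u[flip], v[flip] = 1.0 - u[flip], 1.0 - v[flip]
    a, b, c = tri[idx, 0], tri[idx, 1], tri[idx, 2]
    return a + u[:, None] * (b - a) + v[:, None] * (c - a)

def denoising_loss(denoiser, x, rng, p_mean=-1.2, p_std=1.2):
    """Monte Carlo estimate of E || D(x + sigma * n, sigma) - x ||."""
    # Assumption: log-normal noise levels, as in EDM-style training.
    sigma = np.exp(p_mean + p_std * rng.standard_normal((len(x), 1)))
    n = rng.standard_normal(x.shape)
    return np.mean(np.linalg.norm(denoiser(x + sigma * n, sigma) - x, axis=1))

# Toy open (non-watertight) surface: the unit square in the plane z = 0.
V = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0], [0, 1, 0]], dtype=float)
F = np.array([[0, 1, 2], [0, 2, 3]])

def project_to_square(y, sigma):
    """Hypothetical hand-written denoiser: closest point on the square."""
    return np.column_stack([np.clip(y[:, :2], 0.0, 1.0), np.zeros(len(y))])

def identity(y, sigma):
    """Baseline that performs no denoising."""
    return y

rng = np.random.default_rng(0)
losses = []
for epoch in range(3):
    x = sample_surface(V, F, 2**12, rng)    # fresh points each epoch (2**25 in the text)
    losses.append((denoising_loss(project_to_square, x, rng),
                   denoising_loss(identity, x, rng)))
```

In a real training loop the denoiser would be a network updated by gradient descent on this loss. Here the projection denoiser scores lower than the identity baseline in every epoch, which is the behavior the objective rewards.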

Network Architecture¶

Inspired by EDM, the network uses magnitude-preserving layers. It consists of 6 blocks of linear layers with width \(C=512\), for a total of 5.53M parameters. Inputs and outputs are normalized to zero mean and unit variance.
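To illustrate what "magnitude-preserving" means for a linear layer, here is a minimal sketch of one common construction: each weight row is rescaled to unit norm on every forward pass, so unit-variance inputs produce unit-variance outputs regardless of how the raw weights are scaled. Whether GeomDist uses exactly this form is an assumption; `mp_linear` is a hypothetical name, not a function from the paper's code.

```python
import numpy as np

def mp_linear(x, weight, eps=1e-8):
    """Linear layer whose weight rows are rescaled to unit norm on every call."""
    w = weight / (np.linalg.norm(weight, axis=1, keepdims=True) + eps)
    return x @ w.T

rng = np.random.default_rng(0)
C = 512                                      # layer width quoted in the text
x = rng.standard_normal((4096, C))           # unit-variance input activations
weight = 3.0 * rng.standard_normal((C, C))   # deliberately mis-scaled raw weights
y = mp_linear(x, weight)

input_std, output_std = x.std(), y.std()
```

Both standard deviations come out close to 1 even though the raw weights are scaled by 3, which keeps activation magnitudes stable across stacked layers.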

Key Experimental Results¶

Comparison with SDF on Non-Watertight Objects¶

Method	Parameters	Non-Watertight	Thin Structures
SDF (Instant-NGP)	14M	✗	Poor
GeomDist	5M	✓	Good

GeomDist represents open and non-watertight geometry that the SDF baseline cannot, while using fewer parameters (about 5M versus 14M).

Comparison with Vector Field Methods¶

Method	Chamfer Distance (×10³)	Uniformity
Vector field	4.886	Non-uniform
GeomDist	3.218	Uniform

In this comparison, GeomDist achieves a lower Chamfer distance (3.218 vs. 4.886) and more uniform samples than the vector field baseline.
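For readers unfamiliar with the metric, the sketch below computes a symmetric Chamfer distance between two point sets by brute force. Definitions vary across papers (squared or unsquared distances, sum or average of the two directions), so the exact variant and scaling behind the numbers in the table are not known from this summary. The version here, using unsquared distances and the sum of both directional means, is an assumption.

```python
import numpy as np

def chamfer_distance(a, b):
    """Symmetric Chamfer distance: mean nearest-neighbour distance a->b plus b->a."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # (len(a), len(b))
    return d.min(axis=1).mean() + d.min(axis=0).mean()

rng = np.random.default_rng(0)
reference = rng.standard_normal((512, 3))
reference /= np.linalg.norm(reference, axis=1, keepdims=True)   # points on the unit sphere
noisy = reference + 0.01 * rng.standard_normal(reference.shape) # slightly perturbed copy

cd_same = chamfer_distance(reference, reference)
cd_noisy = chamfer_distance(reference, noisy)
```

The brute-force pairwise matrix is fine for a few thousand points. For the point counts used in the paper, a spatial index such as a k-d tree would be the usual choice.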

Multi-Resolution Sampling¶

Sampling at resolutions from \(n=2^{15}\) to \(n=2^{19}\) points on the Wukong mesh consistently yields accurate surface approximations, showing that the sampling density can be chosen freely at inference time.

Highlights & Insights¶

- Universal representation: No assumptions on genus, watertightness, or connectivity
- Infinite resolution: Theoretically unlimited point sampling, unconstrained by a fixed sampling density
- Compactness: Complex geometry is encoded in about 5.5M parameters, compared with 14M for the Instant-NGP SDF baseline
- Invertibility: Forward and inverse sampling share the same trajectory, establishing a bijection between surface points and noise space
- Broad applicability: Supports textured mesh representation, neural compression, dynamic modeling, and Gaussian splatting rendering

Limitations & Future Work¶

- Training is time-consuming (several hours), making the method unsuitable for real-time applications
- Each object requires training a separate network
- Surface extraction relies on post-processing (e.g., Ball Pivoting for connectivity)
- Scalability to large-scale scenes has not been demonstrated

Related Work¶

- SDF/UDF: Coordinate-based neural representations
- Point-E, NeuralPoints: Point cloud generation
- EDM: Diffusion model framework

Rating¶

Novelty: ⭐⭐⭐⭐⭐ (A fundamentally new perspective: geometry as distribution)
Technical Depth: ⭐⭐⭐⭐⭐ (ODE framework + network design + training strategy)
Experimental Thoroughness: ⭐⭐⭐⭐ (Multiple object categories + ablations + applications)
Value: ⭐⭐⭐⭐ (General representation with broad potential)