GGBall: Graph Generative Model on Poincaré Ball¶

Conference: ICLR 2026 arXiv: 2506.07198 Code: GitHub Area: Graph Generation / Hyperbolic Geometry Keywords: Hyperbolic Space, Graph Generation, Poincaré Ball Model, Vector Quantization, Flow Matching

TL;DR¶

This paper proposes GGBall, the first graph generation framework operating entirely on the Poincaré ball model. By combining a hyperbolic vector-quantized variational autoencoder (HVQVAE) with a Riemannian flow matching prior, GGBall achieves state-of-the-art performance on both hierarchical and molecular graph generation, reducing the average generation error by 18% on hierarchical graph benchmarks.

Background & Motivation¶

Background: Graph generation is a core task in molecular design, materials discovery, and related domains. Existing methods (e.g., DiGress, GDSS) primarily operate in Euclidean space or discrete graph space.
Limitations of Prior Work: Euclidean latent spaces are inherently ill-suited for capturing hierarchical structures and power-law degree distributions in graph data, leading to distortion of community structures and parent–child relationships.
Key Challenge: There is a geometric mismatch between the combinatorial, hierarchical nature of graphs and the linearly growing volume of Euclidean space.
Key Insight: The exponential volume growth of hyperbolic space naturally accommodates hierarchical structures (Gromov's theorem).
Core Idea: The standard Euclidean latent-space generation pipeline is fully reformulated in hyperbolic space, representing graph topology uniformly via node-level latent variables.

Method¶

Overall Architecture¶

The framework adopts a two-stage generation process: (1) graph structures are encoded into discrete latent tokens on the Poincaré ball via HVQVAE; (2) a Riemannian flow matching model captures the latent prior, and at generation time samples are drawn from the prior, quantized, and decoded. A GNN encodes local structure, while a Transformer propagates global dependencies.

Key Designs¶

Poincaré Graph Neural Network (Poincaré GNN):
- Function: Performs message passing in hyperbolic space to encode edge and node information into node representations.
- Mechanism: Tangent-space aggregation with distance-modulated message functions. \(\log_0^c(\cdot)\) / \(\exp_0^c(\cdot)\) are used to map representations to the tangent space for aggregation and back to the manifold.
- Message modulation: \(\text{M}(\mathbf{m}_{ij}) = \gamma_{ij} \cdot \mathbf{m}_{ij} + \beta_{ij}\), where \(\gamma_{ij}, \beta_{ij}\) are functions of the hyperbolic distance \(d_c(\mathbf{h}_i, \mathbf{h}_j)\).
- Design Motivation: Curvature-aware distance modulation enables the model to directly encode the strength of hierarchical relationships.
Poincaré Diffusion Transformer:
- Function: Models global graph structure by replacing dot-product attention with geodesic distance attention.
- Geodesic attention: \(\alpha_{ij} \propto \exp(-\tau d_c(\mathbf{q}_i, \mathbf{k}_j))\)
- Value aggregation uses the Möbius gyromidpoint to maintain geometric consistency.
- Time modulation: Timestep embeddings are injected to support the flow matching prior.
Hyperbolic Vector-Quantized Variational Autoencoder (HVQVAE):
- Function: Discretizes continuous hyperbolic embeddings into a learnable Poincaré codebook \(\mathcal{C}\).
- Mechanism: Nearest-neighbor quantization based on geodesic distance: \(\mathbf{z}_q = \arg\min_{\mathbf{c}_j} d_c(\mathbf{z}, \mathbf{c}_j)\).
- Codebook entries are initialized via hyperbolic \(k\)-means clustering and updated with a Riemannian optimizer.
- Stability mechanism: Inactive codebook entries are replaced via an expiry threshold; weighted Einstein midpoint updates are applied.

Loss & Training¶

Autoencoder loss: reconstruction loss + degree–edge consistency loss + L2 regularization.
HVQVAE loss: \(\mathcal{L}_{\text{HVQVAE}} = \lambda_1 \mathcal{L}_{\text{AE}} + \lambda_2 \mathbb{E}[d_c^2(\text{sg}(\mathbf{z}_q), \mathbf{z})] + \lambda_3 \mathbb{E}[d_c^2(\mathbf{z}_q, \text{sg}(\mathbf{z}))]\)
Flow matching prior: Riemannian conditional flow matching objective with geodesic interpolation paths.

Key Experimental Results¶

Main Results: Abstract Graph Generation¶

Method	Community-small Avg↓	Ego-small Avg↓	Space
GraphVAE	0.6233	0.1167	Euclidean
GDSS	0.0460	0.0173	Graph
DiGress	0.0380	-	Graph
HGDM	0.0240	0.0137	Hybrid
GGBall	0.0197	0.0117	Hyperbolic

Molecular Graph Generation (QM9)¶

Method	Validity↑	Uniqueness↑	Novelty↑	V.U.N↑
DiGress	99.00	96.34	32.51	31.04
CatFlow	98.47	97.58	65.62	63.02
GGBall	98.33	96.38	93.77	88.45

Key Findings¶

HVQVAE significantly improves chemical validity (95.18→99.14%) and edge precision over HAE.
The hyperbolic encoder achieves 4× lower MMD error in degree distribution preservation compared to Euclidean baselines.
Novelty of 93.77% substantially outperforms DiGress (32.51%), indicating stronger expressiveness of hyperbolic space.

Highlights & Insights¶

GGBall is the first graph generation framework operating entirely on the Poincaré ball, unifying the encode–quantize–prior–decode pipeline end-to-end in hyperbolic space.
Node-level latent variables provide a unified representation of graph topology, treating edge connectivity as an emergent property of latent-space geometry.
GGBall achieves the highest V.U.N score on QM9, a composite metric combining novelty, uniqueness, and validity.

Limitations & Future Work¶

Autoregressive priors underperform flow matching in preliminary experiments; further exploration is warranted.
Validity on molecular graphs is slightly below DiGress (98.33 vs. 99.00); tighter integration of chemical constraints remains an open challenge.
Hyperbolic space operations introduce higher computational complexity compared to their Euclidean counterparts.

vs. HGDM: HGDM performs edge denoising only on Poincaré embeddings while still operating in discrete graph space; GGBall operates entirely within the hyperbolic latent space.
vs. DiGress: DiGress iteratively denoises in graph space and is constrained by Euclidean geometry; GGBall leverages hyperbolic latent variables to disentangle hierarchical structure.

Rating¶

Novelty: ⭐⭐⭐⭐⭐ — First fully hyperbolic graph generation framework, with hyperbolic components throughout: GNN, Transformer, VQ, and flow matching.
Experimental Thoroughness: ⭐⭐⭐⭐ — Covers both abstract and molecular graphs with thorough ablations; large-scale graph benchmarks are lacking.
Writing Quality: ⭐⭐⭐⭐ — Mathematically rigorous with a clear presentation.
Value: ⭐⭐⭐⭐ — Opens a new direction for non-Euclidean geometry in generative modeling.