Global and Local Topology-Aware Graph Generation via Dual Conditioning Diffusion¶
Conference: ICLR 2026
OpenReview: https://openreview.net/forum?id=IZV9k5BGxi
Code: To be confirmed
Area: Graph Generation / Latent Diffusion Models
Keywords: Graph Generation, Latent Diffusion, Global-Local Dependencies, Bidirectional Conditioning, Molecule Generation
TL;DR¶
DualDiff splits graph generation into a node-level (local) diffusion branch and a cluster-level (global) diffusion branch. Through a "bidirectional conditioning" mechanism, global and local information alternately condition each other during denoising. Jointly modeling \(p(Z_l, Z_g)\) in a unified latent space improves generation quality on both general and molecular graphs.
Background & Motivation¶
- Background: Diffusion models have become the mainstream paradigm for graph generation. Latent space diffusion (e.g., GEOLDM, LGD, EDM-SyCo) further moves the process from the raw graph space to a low-dimensional latent space, balancing efficiency and scalability.
- Limitations of Prior Work: Most existing methods focus on "node-level" generation, treating nodes as independent entities during denoising, which makes it hard to capture dependencies within and between substructures. The few works that incorporate global information (e.g., SubgDiff's subgraph prediction, Graphusion's cluster pseudo-labels) use it only as one-way guidance, ignoring inter-substructure dependencies and leaving the joint distribution of global and local features unmodeled.
- Key Challenge: Graph data naturally exhibits multi-scale coupled dependencies. Molecules contain both local details (functional groups) and global topology (the spatial arrangement of those groups), and the two levels interact. One-way or single-scale modeling cannot adequately describe both levels simultaneously.
- Goal: Design a unified generative model capable of dynamically capturing both global and local topological information to model their joint distribution.
- Core Idea: "Joint Distribution Bidirectional Decomposition". Based on the identity \(p(Z_l, Z_g) = p(Z_l|Z_g)p(Z_g) = p(Z_g|Z_l)p(Z_l)\), the joint modeling is decomposed into two complementary processes: "global-to-local" and "local-to-global." A bidirectional conditioning mechanism allows the two diffusion branches to alternately condition on each other, achieving dual topology awareness.
Method¶
Overall Architecture¶
DualDiff is a two-stage latent diffusion framework. First, a pre-trained graph autoencoder maps the graph \(G=(H,A)\) into a unified latent space, yielding local representations \(Z_l \in \mathbb{R}^{N \times d}\); the node latents are then pooled by cluster into global representations \(Z_g \in \mathbb{R}^{K \times d}\). Second, a dual-branch diffusion (node-level + cluster-level) runs in the latent space, with a bidirectional conditioning mechanism exchanging information between the two branches during denoising. Finally, the decoder reconstructs the graph from the joint \(\hat Z_l, \hat Z_g\).
flowchart LR
G[Input Graph/Molecule G] -->|EGNN/GIN Encoder| Z[Latent Representation Z]
Z --> Zl[Local Repr. Z_l Node-level]
Z -->|Clustering+Pooling Global Extraction| Zg[Global Repr. Z_g Cluster-level]
Zl --> DL[Local Denoising Dθl]
Zg --> DG[Global Denoising Dθg]
DG -. Global→Local Condition .-> DL
DL -. Local→Global Condition .-> DG
DL --> Zlh[Denoised Ẑ_l]
DG --> Zgh[Denoised Ẑ_g]
Zlh --> Dec[Decoder Dψ]
Zgh --> Dec
Dec --> Gout[Generated Graph/Molecule]
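The "Clustering+Pooling" step in the pipeline above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the helper names `one_hot_assignment` and `mean_pool` are hypothetical, mean pooling is an assumed choice for the pooling operator, and the cluster labels are hard-coded here instead of coming from K-means or spectral clustering.

```python
import numpy as np

def one_hot_assignment(labels: np.ndarray, num_clusters: int) -> np.ndarray:
    """Build a hard assignment matrix S_g in {0,1}^{N x K} from cluster labels."""
    S = np.zeros((labels.shape[0], num_clusters), dtype=np.float64)
    S[np.arange(labels.shape[0]), labels] = 1.0
    return S

def mean_pool(S: np.ndarray, Z: np.ndarray) -> np.ndarray:
    """Average the node latents of each cluster (mean pooling is an assumption)."""
    counts = S.sum(axis=0).reshape(-1, 1)   # (K, 1): number of nodes per cluster
    counts = np.maximum(counts, 1.0)        # guard against empty clusters
    return (S.T @ Z) / counts               # (K, d) cluster-level latents

# Toy example: N=5 nodes, d=2 latent dimensions, K=2 clusters.
# The labels are hypothetical stand-ins for a clustering result.
Z_l = np.array([[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0], [0.0, 6.0]])
labels = np.array([0, 0, 1, 1, 1])

S_g = one_hot_assignment(labels, num_clusters=2)
Z_g = mean_pool(S_g, Z_l)
print(Z_g)  # cluster means: [2, 0] for cluster 0 and [0, 4] for cluster 1
```

Each row of `Z_g` summarizes one substructure, so the global branch works on \(K\) vectors instead of \(N\).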
Key Designs¶
1. Global Information Extraction: Explicitly encoding "substructure topology" into the global branch via clustering. The encoder provides the node-level latents \(Z_l\), while \(Z_g\) is derived by clustering, which is the source of global topology awareness. The clustering strategy depends on the data: molecular graphs use K-means in atomic coordinate space to obtain geometry-aware cluster labels, and general graphs use spectral clustering on graph Laplacian eigenvectors for community partitioning. Given the assignment matrix \(S_g \in \{0,1\}^{N \times K}\), cluster-level embeddings are obtained via \(Z_g = \mathrm{Pooling}(S_g, Z) \in \mathbb{R}^{K \times d}\). Thus, \(Z_g\) carries long-range information such as "which nodes belong to the same substructure" and "how substructures are distributed."
2. Dual-Branch Diffusion: SDEs for node-level and cluster-level branches. Because the two levels have different topological characteristics, DualDiff defines separate forward SDEs for \(Z_l\) and \(Z_g\) in the latent space: \(\mathrm{d}Z_{l,t}=f_{l,t}\mathrm{d}t+s_{l,t}\mathrm{d}W_{l,t}\) and \(\mathrm{d}Z_{g,t}=f_{g,t}\mathrm{d}t+s_{g,t}\mathrm{d}W_{g,t}\). Within the EDM framework, the drift terms are set to 0 (\(f_{l,t}=f_{g,t}=0\)) and the diffusion terms to \(s_{l,t}=s_{g,t}=\sqrt{2t}\). Two GNN denoising networks \(D_{\theta_l}, D_{\theta_g}\) are trained to recover the clean latent representations from their noised versions \(\tilde Z_l, \tilde Z_g\) at noise level \(\sigma\) by minimizing \(\mathbb{E}[\|D_{\theta_l}(\tilde Z_l,\sigma)-Z_{l,0}\|^2 + \|D_{\theta_g}(\tilde Z_g,\sigma)-Z_{g,0}\|^2]\).
3. Bidirectional Conditioning: Approximating the joint distribution via alternate conditioning. This is the core contribution. Based on the decomposition \(p(Z_l,Z_g)=p(Z_l|Z_g)p(Z_g)=p(Z_g|Z_l)p(Z_l)\), two complementary processes are defined: (i) global-to-local (\(p(Z_l|Z_g)\)) and (ii) local-to-global (\(p(Z_g|Z_l)\)). In practice, self-conditioning supplies the previous step's predictions \(\hat Z_{l,0}, \hat Z_{g,0}\), and the model switches between the two processes with probability \(p\). Process (i) uses the conditions \((C_l,C_g)=((\hat Z_{l,0},\hat Z_{g,0}),0)\), while process (ii) uses \((C_l,C_g)=(0,(\hat Z_{l,0},\hat Z_{g,0}))\). Process (i) applies FiLM-inspired modulation: \(\hat Z_{g,0}\) generates cluster-specific scaling and shifting parameters \(\gamma_i, \beta_i\), which modulate \(Z_l\) according to node-cluster similarity. Process (ii) compresses local details into a global condition using message passing and pooling: \(C=\mathrm{Linear}(\mathrm{Pool}(\mathrm{MP}(\hat Z_{l,0})))\).
4. Alternating Sampling: Stabilizing generation via a "server-client" scheme borrowed from Federated Learning. By analogy with a central server aggregating client updates, global clusters are treated as servers and nodes as clients, so processes (i) and (ii) correspond to local updates and global aggregation, respectively. During sampling, process (ii) is triggered once after every \(m\) steps of process (i) (e.g., when t % (m+1) == 0). This schedule is reported to improve both stability and sample quality.
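The switching logic behind designs 3 and 4 can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's code: the function names `training_conditions` and `sampling_process` are hypothetical, `p` is assumed to be the probability of choosing process (i), `None` stands in for the zeroed-out condition, and strings stand in for latent tensors.

```python
import random

def training_conditions(z_l_hat, z_g_hat, p: float, rng: random.Random):
    """Choose the condition pair (C_l, C_g) for one training step.

    Assumption: with probability p run process (i), global-to-local,
    otherwise run process (ii), local-to-global.
    """
    pair = (z_l_hat, z_g_hat)      # self-conditioned predictions from the previous step
    if rng.random() < p:
        return pair, None          # process (i): (C_l, C_g) = ((Z_l_hat, Z_g_hat), 0)
    return None, pair              # process (ii): (C_l, C_g) = (0, (Z_l_hat, Z_g_hat))

def sampling_process(step: int, m: int) -> str:
    """m:1 schedule: process (ii) fires when step % (m + 1) == 0, else process (i)."""
    return "local_to_global" if step % (m + 1) == 0 else "global_to_local"

# One training-time draw: exactly one of the two conditions is active.
rng = random.Random(0)
c_l, c_g = training_conditions("Z_l_hat", "Z_g_hat", p=0.5, rng=rng)

# Sampling-time schedule for steps 1..8 with m = 3:
# three global-to-local steps, then one local-to-global step, repeated.
schedule = [sampling_process(t, m=3) for t in range(1, 9)]
print(schedule)
```

With `m=3`, steps 4 and 8 run the local-to-global aggregation and all other steps run the global-to-local update, which mirrors the "many client updates, one server aggregation" analogy.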
Key Experimental Results¶
Main Results¶
General Graph Generation (Planar / SBM, lower MMD is better, higher V.U.N. is better):
| Model | Planar Clus.↓ | Planar Spec.↓ | Planar V.U.N.↑ | SBM Deg.↓ | SBM Spec.↓ |
|---|---|---|---|---|---|
| DiGress | 0.0372 | 0.0106 | 75.0 | 0.0013 | 0.0400 |
| GruM | 0.0353 | 0.0062 | 90.0 | 0.0007 | 0.0050 |
| GraphBFN | 0.0294 | 0.0046 | 96.7 | 0.0005 | 0.0053 |
| DualDiff | 0.0275 | 0.0038 | 97.5 | 0.0004 | 0.0042 |
Molecule Generation (ZINC250k, higher FCD / KL is better):
| Method | FCD↑ | KL↑ | Novelty↑ | Validity↑ |
|---|---|---|---|---|
| DiGress | 0.65 | 0.91 | 0.99 | 0.85 |
| GruM | 0.64 | N.A. | 1.00 | 0.99 |
| EDM-SyCo | 0.85 | 0.96 | 1.00 | 0.88 |
| DualDiff | 0.91 | 0.98 | 1.00 | 0.92 |
3D Molecule Generation (QM9): DualDiff achieves 99.3% on Valid & Unique, outperforming GEOLDM (92.7%) and EQUIFM (93.5%).
Ablation Study¶
Ablation of Bidirectional Conditioning (ZINC250k):
| Configuration | FCD↑ | KL↑ |
|---|---|---|
| No conditioning | 0.65 | 0.82 |
| Self-conditioning | 0.72 | 0.89 |
| Only \(Z_l \to Z_g\) | 0.75 | 0.95 |
| Only \(Z_g \to Z_l\) | 0.83 | 0.95 |
| Bidirectional \(Z_g \leftrightarrow Z_l\) | 0.91 | 0.98 |
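To make the size of each step in the ablation easier to read, the short script below computes FCD differences relative to the self-conditioning row. The numbers are copied from the table above; the dictionary keys and variable names are only illustrative.

```python
# FCD values copied from the ablation table (ZINC250k).
ablation_fcd = {
    "No conditioning": 0.65,
    "Self-conditioning": 0.72,
    "Only Z_l -> Z_g": 0.75,
    "Only Z_g -> Z_l": 0.83,
    "Bidirectional": 0.91,
}

baseline = ablation_fcd["Self-conditioning"]

# Absolute FCD change of each configuration relative to self-conditioning,
# rounded to two decimals to match the table's precision.
gains = {name: round(score - baseline, 2) for name, score in ablation_fcd.items()}

for name, gain in gains.items():
    print(f"{name}: {gain:+.2f}")
```

Relative to self-conditioning, local-to-global conditioning alone adds 0.03 FCD, global-to-local alone adds 0.11, and the bidirectional variant adds 0.19.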
Key Findings¶
- Bidirectional exceeds self-conditioning: FCD improves from 0.72 to 0.91, indicating that global-local interaction, rather than self-conditioning alone, is the main source of the quality gain.
- Directional value: \(Z_g \to Z_l\) (global guiding local) reaches a higher FCD than \(Z_l \to Z_g\) (0.83 vs. 0.75, with KL tied at 0.95), though both directions are needed to approximate the joint distribution and reach 0.91.
- Superiority over hierarchical methods: Compared to autoregressive/coarse-to-fine methods (HiGen, PPGN), DualDiff's dynamic interaction captures the joint distribution more effectively.
- Hyperparameter choices: Moderate values of \(p\), a larger \(m\) (which devotes more sampling steps to local details), and a small number of clusters \(K\) yield the best performance.
- Efficiency: Competitive results are achieved in ~200 steps with manageable overhead from bidirectional conditioning.
Highlights & Insights¶
- Probabilistic grounding: The entire design is derived from the expansion of \(p(Z_l, Z_g)\), giving the bidirectional mechanism a clear probabilistic interpretation rather than being just an engineering heuristic.
- Reusing self-conditioning semantics: The "zeroing out" of one branch during training aligns with the robustness design of self-conditioning, allowing seamless integration.
- Federated Learning analogy: Mapping global clusters to "servers" and nodes to "clients" provides an intuitive rationale for the \(m:1\) sampling schedule.
- Generality: The framework is versatile, compatible with EGNN (for SE(3) equivariance in 3D molecules) or GIN/GCN (general graphs).
Limitations & Future Work¶
- Dependence on clustering: Global information relies on K-means or spectral clustering; poor clustering could negatively impact the global branch.
- Hyperparameter sensitivity: \(p, m, K\) require tuning for different datasets.
- Validity gap: While FCD/KL are high, Validity on ZINC250k (0.92) is still lower than GruM (0.99) or MoLeR (1.00).
- Two-stage training: The latent space is fixed after autoencoder pre-training; end-to-end optimization remains unexplored.
Related Work & Insights¶
- Latent Graph Diffusion (GEOLDM / EDM-SyCo): DualDiff extends the "autoencode-then-diffuse" paradigm from a single branch to dual branches.
- Global-informed generation (SubgDiff / Graphusion): These use one-way guidance; DualDiff addresses the lack of joint distribution modeling.
- Self-Conditioning: DualDiff upgrades single-scale self-conditioning to a cross-scale bidirectional mechanism.
- FiLM (Feature Modulation): Used as a tool to inject cluster-level global information into local nodes.
Rating¶
- Novelty: ⭐⭐⭐⭐ — Clear probabilistic derivation for dual-branch interaction.
- Experimental Thoroughness: ⭐⭐⭐⭐ — Comprehensive testing across 8 datasets including general and 2D/3D molecular graphs.
- Writing Quality: ⭐⭐⭐⭐ — Logical flow from motivation to sampling schedules.
- Value: ⭐⭐⭐⭐ — A generalizable multi-scale modeling paradigm for complex graph structures.