
Exploring Variational Graph Autoencoders for Distribution Grid Data Generation

Conference: NeurIPS 2025 arXiv: 2509.02469 Code: GitHub Area: Graph Generation / Energy Systems Keywords: Variational Graph Autoencoder, Distribution Grid Generation, Synthetic Graph Data, Graph Neural Networks, Energy AI

TL;DR

This paper systematically evaluates four variational graph autoencoder (VGAE) decoder architectures for synthesizing distribution grid topologies. The Iterative-GCN decoder is found to adequately reproduce structural and spectral characteristics of real grids on small, homogeneous datasets; however, on large, heterogeneous datasets, all methods exhibit critical failure modes including disconnected components and repetitive patterns.

Background & Motivation

Background: The rapid proliferation of distributed energy resources (PV, batteries, EVs) poses voltage stability and operational challenges in distribution grids, creating urgent demand for data-driven methods to support grid planning and control. However, utilities are reluctant to share real grid data due to security and privacy concerns, severely impeding the application of machine learning in this domain.

Limitations of Prior Work: Existing synthetic grid generation relies primarily on statistical models and heuristic optimization algorithms, both of which introduce excessive simplifying assumptions and fail to capture the diversity, dynamics, and complex interdependencies of real grids. Existing ML-based approaches (e.g., FeederGAN, DeepGDL) suffer from limited generalizability and lack of open-source code, resulting in an absence of reproducible baselines.

Key Challenge: The unavailability of real distribution grid data conflicts with the need for large, diverse training datasets in downstream ML research. A generative model is needed that can produce structurally realistic synthetic grids while adapting to varying scales and topologies.

Goal: (1) Systematically evaluate the feasibility of the VGAE framework for distribution grid topology generation; (2) compare the expressive capacity of different decoders; (3) identify generalization bottlenecks when scaling from small benchmarks to large-scale real-world grids.

Key Insight: VGAE is selected as the backbone framework due to its probabilistic latent space and modular decoder design. By comparing four decoders across two datasets with markedly different structural characteristics, the study quantitatively delineates the upper and lower bounds of generation quality.

Core Idea: Employ VGAE with multiple GNN-based decoders to generate synthetic distribution grid topologies, and expose their limitations in realistic scenarios through structural and spectral evaluation metrics.

Method

Overall Architecture

The inputs consist of adjacency matrices and node features from real distribution grids, which are mapped to a latent space via a graph neural network encoder. Latent samples are decoded to reconstruct adjacency matrices. The training objective is the variational loss comprising reconstruction accuracy and KL divergence regularization. At inference, latent variables are sampled from the prior distribution and decoded to produce new adjacency matrices.
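As a concrete illustration, the pipeline above can be sketched in NumPy, with a single symmetrically normalized propagation step standing in for the paper's GNN encoder (all weights, shapes, and function names here are illustrative, not the authors' implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

def normalize_adj(A):
    # Symmetric normalization D^{-1/2} (A + I) D^{-1/2}, standard for GCN layers.
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def encode(A, X, W_mu, W_logvar):
    # One propagation step yields the mean and log-variance of q(Z | A, X).
    H = normalize_adj(A) @ X
    return H @ W_mu, H @ W_logvar

def reparameterize(mu, logvar):
    # z = mu + sigma * eps keeps sampling differentiable in a real framework.
    eps = rng.standard_normal(mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def inner_product_decode(Z):
    # Edge probabilities sigma(Z Z^T), the simplest decoder considered in the paper.
    return 1.0 / (1.0 + np.exp(-(Z @ Z.T)))

# Toy 4-node path graph with 2-dim features and a 2-dim latent space.
A = np.array([[0, 1, 0, 0], [1, 0, 1, 0], [0, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
X = rng.standard_normal((4, 2))
W_mu, W_logvar = rng.standard_normal((2, 2)), rng.standard_normal((2, 2))

mu, logvar = encode(A, X, W_mu, W_logvar)
Z = reparameterize(mu, logvar)
A_recon = inner_product_decode(Z)
print(A_recon.shape)  # (4, 4) matrix of edge probabilities
```

At inference time, the encoder is dropped and `Z` is instead drawn from the standard Gaussian prior before decoding.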

Key Designs

  1. Comparison of Four Decoder Architectures:

    • Function: Reconstruct or generate graph adjacency matrices from latent representations.
    • Mechanism: (a) Inner Product decoder — directly predicts edge probabilities via \(ZZ^T\); simplest but least expressive. (b) MLP decoder — applies a multilayer perceptron to concatenated node-pair representations. (c) GCN decoder — refines node representations through graph convolutional layers before predicting edges. (d) Iterative-GCN decoder — augments the GCN decoder with iterative refinement loops that progressively sparsify the generated graph.
    • Design Motivation: Progressively increasing decoder expressiveness from simple to complex validates the minimum model complexity required for grid generation. The refinement mechanism in Iterative-GCN suppresses overly dense structures.
  2. Dual-Dataset Evaluation Strategy:

    • Function: Assess model generalizability from benchmark to real-world settings.
    • Mechanism: The ENGAGE dataset comprises small, structurally homogeneous medium- and low-voltage grids (based on SimBench); the DINGO dataset comprises large, highly heterogeneous medium-voltage feeders with 4,500–7,000 nodes and diverse topologies.
    • Design Motivation: Validation solely on small benchmarks overestimates model capability; introducing DINGO exposes generalization bottlenecks in realistic scenarios.
  3. Joint Structural and Spectral Evaluation Metrics:

    • Function: Comprehensively quantify the similarity between synthetic and real graphs.
    • Mechanism: Mean node degree captures local connectivity characteristics; the normalized Laplacian spectrum captures global structural properties; the one-dimensional Wasserstein distance quantifies the discrepancy between the real and synthetic distributions of each statistic.
    • Design Motivation: Relying solely on degree distributions overlooks global topological differences (e.g., connectivity), whereas spectral metrics can detect disconnected components and repetitive patterns.
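The refinement idea behind the Iterative-GCN decoder can be illustrated with a hypothetical thresholding loop that bisects an edge-probability cutoff until a target mean degree is reached. The paper's actual refinement mechanism is learned and differs from this sketch; the code only demonstrates how iteration can suppress overly dense structures:

```python
import numpy as np

def iterative_sparsify(P, target_mean_degree=2.0, steps=8):
    """Hypothetical stand-in for Iterative-GCN refinement: bisect a
    probability threshold so the decoded graph's mean degree approaches
    a target (real distribution grids have mean degree near 2)."""
    n = P.shape[0]
    lo, hi = 0.0, 1.0
    A = np.zeros_like(P)
    for _ in range(steps):
        tau = 0.5 * (lo + hi)
        A = (P > tau).astype(float)
        A = np.minimum(A, A.T)        # keep the edge set symmetric
        np.fill_diagonal(A, 0.0)      # no self-loops
        if A.sum() / n > target_mean_degree:
            lo = tau                  # still too dense: raise the threshold
        else:
            hi = tau                  # too sparse: lower the threshold
    return A

rng = np.random.default_rng(1)
Z = rng.standard_normal((8, 3))
P = 1.0 / (1.0 + np.exp(-(Z @ Z.T)))  # dense inner-product edge probabilities
A = iterative_sparsify(P)
print(A.sum() / 8)                    # mean degree after refinement
```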

Loss & Training

The standard VGAE variational objective is employed: a reconstruction loss (cross-entropy measuring adjacency matrix reconstruction accuracy) plus a KL divergence regularization term (constraining the latent distribution toward a standard Gaussian prior).
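A minimal NumPy sketch of this objective, assuming the standard closed-form KL divergence for a diagonal Gaussian posterior (variable names are illustrative):

```python
import numpy as np

def vgae_loss(A, A_recon, mu, logvar, eps=1e-9):
    """Standard VGAE objective: binary cross-entropy over all node pairs
    plus the closed-form KL divergence to a standard Gaussian prior."""
    n = A.shape[0]
    bce = -np.mean(A * np.log(A_recon + eps) + (1 - A) * np.log(1 - A_recon + eps))
    # KL(q(Z | A, X) || N(0, I)) for a diagonal Gaussian, averaged per node.
    kl = -0.5 / n * np.sum(1 + logvar - mu**2 - np.exp(logvar))
    return bce + kl

# Tiny example: a 2-node graph reconstructed almost perfectly,
# with the posterior already matched to the prior (KL term = 0).
A = np.array([[0.0, 1.0], [1.0, 0.0]])
A_recon = np.array([[0.1, 0.9], [0.9, 0.1]])
mu = np.zeros((2, 2))
logvar = np.zeros((2, 2))
loss = vgae_loss(A, A_recon, mu, logvar)
print(round(loss, 4))  # ≈ 0.1054: pure reconstruction term, KL is zero here
```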

Key Experimental Results

Main Results

| Dataset | Metric                        | Real Grid (Mean±Std) | Synthetic Grid, Iterative-GCN (Mean±Std) | Wasserstein Distance |
|---------|-------------------------------|----------------------|------------------------------------------|----------------------|
| ENGAGE  | Mean Node Degree              | 2.05±0.09            | 2.07±0.31                                |                      |
| ENGAGE  | Normalized Laplacian Spectrum |                      |                                          | 0.10                 |
| DINGO   | Mean Node Degree              | 2.00±0.01            | 2.53±1.47                                |                      |
| DINGO   | Normalized Laplacian Spectrum |                      |                                          | 0.51                 |

Ablation Study

| Decoder       | ENGAGE Performance                   | DINGO Performance   | Notes                          |
|---------------|--------------------------------------|---------------------|--------------------------------|
| Inner Product | Training diverges                    | Training diverges   | Insufficient expressiveness    |
| MLP           | KL converges but poor reconstruction | Poor reconstruction | Limited expressiveness         |
| GCN           | Good                                 | Moderate            | Basic GNN decoding             |
| Iterative-GCN | Best                                 | Still problematic   | Iterative refinement effective |

Key Findings

  • Large Gap Between ENGAGE and DINGO: On the small, homogeneous ENGAGE dataset, the Iterative-GCN achieves a Wasserstein distance of only 0.10; this rises sharply to 0.51 on the large, heterogeneous DINGO dataset, demonstrating that VGAE is effective on simple distributions but cannot scale to realistic complex settings.
  • Two Failure Modes on DINGO Synthetic Graphs: (1) An excess of near-zero Laplacian eigenvalues, indicating weakly connected or disconnected components; (2) over-concentration of mid-frequency eigenvalues, indicating artificially repetitive structural patterns.
  • Complete Failure of Simple Decoders: The Inner Product and MLP decoders already exhibit insufficient expressiveness during training, failing to capture the structural complexity of distribution grids.
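The spectral checks behind these findings can be reproduced in a few lines: each connected component contributes one zero eigenvalue to the normalized Laplacian, and for equal-size samples the 1-D Wasserstein distance reduces to the mean absolute difference of sorted values. The toy graphs below are illustrative, not the paper's data:

```python
import numpy as np

def normalized_laplacian_spectrum(A):
    # L = I - D^{-1/2} A D^{-1/2}; its eigenvalues lie in [0, 2].
    d = A.sum(axis=1)
    d_inv_sqrt = np.where(d > 0, 1.0 / np.sqrt(np.where(d > 0, d, 1.0)), 0.0)
    L = np.eye(A.shape[0]) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]
    return np.sort(np.linalg.eigvalsh(L))

def wasserstein_1d(u, v):
    # For equal-size samples, the 1-D Wasserstein distance is the
    # mean absolute difference of the sorted values.
    return np.mean(np.abs(np.sort(u) - np.sort(v)))

# A connected 4-cycle vs. two disconnected 2-node components.
cycle = np.array([[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 1], [1, 0, 1, 0]], float)
split = np.array([[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], float)

s_cycle = normalized_laplacian_spectrum(cycle)
s_split = normalized_laplacian_spectrum(split)
# Each connected component contributes one (near-)zero eigenvalue.
print(int(np.sum(s_cycle < 1e-8)), int(np.sum(s_split < 1e-8)))  # 1 2
print(round(wasserstein_1d(s_cycle, s_split), 3))  # 0.5
```

An excess of near-zero eigenvalues in a synthetic spectrum therefore directly signals the disconnected-component failure mode described above.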

Highlights & Insights

  • Honestly exposing methodological limitations constitutes the primary contribution of this work. While most studies report only favorable settings, the failure analysis on DINGO provides a clear delineation of VGAE's capability boundary and offers strong guidance for future research directions.
  • Spectral analysis as a diagnostic tool for failure modes: Identifying disconnected components and repetitive patterns via the Laplacian spectrum represents a transferable diagnostic methodology applicable to quality assessment in other graph generation tasks.
  • Open-sourced code and analysis enable this work to serve as a reproducible baseline for distribution grid generation research.

Limitations & Future Work

  • Absence of physical constraints: Generated graphs consider only topological structure without embedding power flow feasibility, load constraints, or other physical conditions; the generated networks are not guaranteed to constitute valid distribution grids.
  • Topology-only evaluation: Node and edge attributes (e.g., line impedance, load magnitude) are not considered; only graph structure is reconstructed.
  • Limited evaluation metrics: Only degree distributions and Laplacian spectra are used; grid-specific constraints such as radial topology preservation rate and connectivity guarantees are not assessed.
  • No comparison with recent graph generation methods: Graph diffusion models (e.g., DiGress) have demonstrated strong performance in molecular generation and may be better suited to the hard-constrained setting of power grid generation.
  • vs. FeederGAN: FeederGAN generates distribution grids via GANs but without open-source code or generalizability validation. This work achieves a more flexible decoder design under the VGAE framework and releases the code publicly.
  • vs. DeepGDL: DeepGDL employs deep generative learning but similarly suffers from reproducibility issues. The systematic comparison across two datasets in this work provides a more reliable baseline.
  • The central insight of this work is that naively applying standard graph generative models to complex constrained settings (such as power grids) is insufficient; more expressive generative families (attention-based decoders, diffusion models) combined with domain-specific constraints are needed.

Rating

  • Novelty: ⭐⭐⭐ The method itself (VGAE + standard decoders) is not novel, but the systematic evaluation and failure mode analysis are valuable.
  • Experimental Thoroughness: ⭐⭐⭐ The four-decoder × two-dataset comparison is reasonable, though comparisons against non-VGAE baselines are absent.
  • Writing Quality: ⭐⭐⭐⭐ Well-structured with honest reporting of negative results.
  • Value: ⭐⭐⭐ Establishes an important baseline for synthetic distribution grid generation and identifies the performance ceiling of VGAE.