# GraphTOP: Graph Topology-Oriented Prompting for Graph Neural Networks
- **Conference:** NeurIPS 2025
- **arXiv:** 2510.22451
- **Code:** https://github.com/xbfu/GraphTOP
- **Area:** Graph Learning / Graph Prompting
- **Keywords:** Graph Prompting, Topology, Edge Rewiring, Pre-training, Gumbel-Softmax
## TL;DR
This paper proposes GraphTOP, the first topology-oriented graph prompting framework: it casts prompting as an edge rewiring problem and relaxes it into a continuous space via Gumbel-Softmax. GraphTOP outperforms six baseline methods across five datasets and four pre-training strategies.
## Background & Motivation
- **Background:** The "pre-train then adapt" paradigm is widely adopted in graph learning. Graph prompting, which adapts frozen pre-trained GNNs by modifying the input graph data, is an efficient adaptation strategy.
- **Limitations of Prior Work:** Existing graph prompting methods are almost exclusively feature-oriented: they operate only on node features or latent representations (e.g., GPF, All-in-one, GraphPrompt) and entirely overlook graph topology, an intrinsic characteristic of graph-structured data, even though graph representations are determined by topological structure as well as by features.
- **Key Challenge:** How can prompting be achieved by modifying graph topology? Edge selection is a discrete optimization problem and thus not directly amenable to gradient-based optimization.
- **Goal:** Design a topology-oriented graph prompting framework that adapts pre-trained GNNs to downstream node classification tasks through topological modification.
- **Key Insight:** Topology prompting can be formulated as an edge rewiring problem and relaxed into a continuous probability space via Bernoulli reparameterization and Gumbel-Softmax.
- **Core Idea:** Learn the existence probability of each edge as a topology prompt, make it differentiably trainable through Gumbel-Softmax (the identity recorded below makes this exact), and employ entropy and sparsity regularization to ensure a tight relaxation and a sparse prompted graph.
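The relaxation rests on a standard property of Gumbel noise, which is what Theorem 1 (quoted in the Method section below) formalizes: the difference of two i.i.d. standard Gumbel variables \(G_1, G_2\) follows a standard logistic distribution, so for any \(p \in (0,1)\),

\[
\Pr\!\left(G_1 - G_2 + \log\frac{p}{1-p} \ge 0\right) = \sigma\!\left(\log\frac{p}{1-p}\right) = p,
\]

where \(\sigma\) is the sigmoid function. Thresholding the noisy logit therefore samples an edge with exactly probability \(p\), while the temperature-softened version used during training stays differentiable.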
## Method
### Overall Architecture
Given a pre-trained GNN \(f_{\theta^*}\) and an input graph \(\mathcal{G} = (\mathbf{A}, \mathbf{X})\), GraphTOP learns a prompted graph \(\tilde{\mathcal{G}} = (\mathbf{S}, \mathbf{X})\), where \(\mathbf{S}\) is a learned adjacency matrix. By substituting \(\mathbf{S}\) for the original \(\mathbf{A}\), the frozen GNN can generate more task-appropriate representations for downstream tasks.
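To make the substitution concrete, here is a minimal PyTorch sketch in which a toy dense GCN stands in for the frozen \(f_{\theta^*}\); the backbone, dimensions, and names are our placeholders, not the authors' code.

```python
import torch
import torch.nn as nn

class DenseGCN(nn.Module):
    """Toy 2-layer GCN over a dense adjacency; a stand-in for the frozen f_theta*."""
    def __init__(self, in_dim, hid_dim, out_dim):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hid_dim)
        self.lin2 = nn.Linear(hid_dim, out_dim)

    def forward(self, A, X):
        A_hat = A + torch.eye(A.size(0), device=A.device)  # add self-loops
        A_norm = A_hat / A_hat.sum(dim=1, keepdim=True)    # row-normalize
        H = torch.relu(A_norm @ self.lin1(X))              # message passing, layer 1
        return A_norm @ self.lin2(H)                       # layer 2 (class logits)

gnn = DenseGCN(in_dim=16, hid_dim=32, out_dim=7)
for param in gnn.parameters():          # the pre-trained GNN stays frozen
    param.requires_grad_(False)

A = (torch.rand(10, 10) < 0.3).float()
A = ((A + A.T) > 0).float().fill_diagonal_(0)  # symmetric, no self-loops
X = torch.randn(10, 16)

S = A.clone()       # the learned adjacency S (initialized from A for illustration)
logits = gnn(S, X)  # prompted forward pass: S replaces A, parameters untouched
```

Because gradients flow through \(\mathbf{S}\) rather than through the GNN weights, any differentiable parameterization of \(\mathbf{S}\) can be tuned this way.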
### Key Designs
- **Edge Rewiring Formulation:** (a code sketch of all three designs follows this list)
  - Function: Models the topology prompt as a binary decision \(s_{ij} \in \{0,1\}\) on edge existence for each node pair.
  - Mechanism: Each \(s_{ij}\) is treated as a Bernoulli random variable with parameter \(p_{ij}\), made differentiably sampleable via Gumbel-Softmax reparameterization; Theorem 1 proves that \(\Pr(G_1 - G_2 + \log(p/(1-p)) \geq 0) = p\) for i.i.d. standard Gumbel variables \(G_1, G_2\).
  - Design Motivation: Relaxes the discrete edge selection problem into continuous probability optimization, enabling gradient-based training.
- **Shared Trainable Projector:**
  - Function: Computes edge probabilities via a 2-layer MLP applied to pre-trained GNN node representations: \(p_{ij} = \sigma(W_2(\text{ReLU}(W_1(h_i + h_j))))\).
  - Mechanism: Rather than learning each \(p_{ij}\) independently (which would cause parameter explosion), a single shared projector infers edge probabilities from node representations.
  - Design Motivation: Substantially reduces parameter count while leveraging the semantic information encoded in pre-trained representations to guide topological modification.
- **Subgraph-Constrained Rewiring:**
  - Function: Restricts edge rewiring to the \(\rho\)-hop local subgraph of each target node (default \(\rho=2\)).
  - Mechanism: Only edges between the target node and other nodes within the subgraph are rewired; all remaining edges are preserved. The candidate-edge complexity drops from \(O(D^{2\rho})\) to \(O(D^{\rho})\).
  - Design Motivation: GNN representations depend primarily on local subgraphs, so global rewiring is unnecessary and computationally expensive.
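A minimal sketch of these ingredients, assuming the formulas above; the module names, the BFS helper, and all hyperparameters are ours, not the paper's:

```python
import torch
import torch.nn as nn

class EdgeProbProjector(nn.Module):
    """Shared 2-layer MLP: p_ij = sigmoid(W2(ReLU(W1(h_i + h_j)))).

    A single projector scores every candidate edge, so the parameter count
    does not grow with the number of node pairs.
    """
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 1))

    def forward(self, h_i, h_j):
        # h_i, h_j: (E, d) endpoint representations of E candidate edges
        return torch.sigmoid(self.mlp(h_i + h_j)).squeeze(-1)  # (E,) probabilities

def gumbel_edge_sample(p, tau, eps=1e-10):
    """Relaxed Bernoulli(p) draw via the binary Gumbel-Softmax trick.

    Theorem 1's identity Pr(G1 - G2 + log(p/(1-p)) >= 0) = p guarantees that
    thresholding this logit recovers an exact Bernoulli sample as tau -> 0.
    """
    u1, u2 = torch.rand_like(p), torch.rand_like(p)
    g1 = -torch.log(-torch.log(u1 + eps) + eps)  # standard Gumbel noise
    g2 = -torch.log(-torch.log(u2 + eps) + eps)
    logit = torch.log(p + eps) - torch.log(1 - p + eps)
    return torch.sigmoid((g1 - g2 + logit) / tau)  # soft edge indicator in (0, 1)

def khop_nodes(A, v, rho=2):
    """Nodes within rho hops of target v on a dense adjacency (BFS by masking)."""
    reach = torch.zeros(A.size(0), dtype=torch.bool, device=A.device)
    reach[v] = True
    for _ in range(rho):
        reach = reach | (A[reach].sum(dim=0) > 0)
    return reach.nonzero(as_tuple=True)[0]
```

As a quick Monte Carlo check of the Theorem 1 identity, `gumbel_edge_sample(torch.full((100_000,), 0.3), tau=1e-3).round().mean()` comes out at about 0.3. Under the subgraph constraint, only pairs \((v, u)\) with \(u\) in `khop_nodes(A, v)` are scored; every other entry of \(\mathbf{S}\) is copied from \(\mathbf{A}\).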
### Loss & Training
- Total loss: \(\mathcal{L} = \mathcal{L}_P + \lambda_1 \mathcal{L}_E + \lambda_2 \mathcal{L}_S\)
- \(\mathcal{L}_P\): Cross-entropy loss for node classification.
- \(\mathcal{L}_E\): Entropy regularization (encourages probabilities toward 0 or 1 to ensure tight relaxation).
- \(\mathcal{L}_S\): Sparsity regularization (controls edge density in the prompted graph, with \(\gamma=0.5\)).
- Temperature annealing: \(\tau\) decays linearly over training, transitioning from a soft probabilistic relaxation to near-deterministic edge decisions (see the training sketch below).
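Putting the pieces together, a hedged sketch of the prompt-tuning loop, reusing `DenseGCN`, `EdgeProbProjector`, and `gumbel_edge_sample` from the sketches above. The exact regularizer forms (binary entropy for \(\mathcal{L}_E\), absolute deviation from \(\gamma\) for \(\mathcal{L}_S\)), the candidate-edge set, and all hyperparameters are our assumptions:

```python
import torch
import torch.nn.functional as F

def entropy_reg(p, eps=1e-10):
    # L_E: mean binary entropy; minimizing it pushes each p_ij toward 0 or 1
    return -(p * torch.log(p + eps) + (1 - p) * torch.log(1 - p + eps)).mean()

def sparsity_reg(p, gamma=0.5):
    # L_S: keep the mean edge density near gamma (assumed functional form)
    return (p.mean() - gamma).abs()

def fit_topology_prompt(gnn, projector, A, X, y, train_mask, epochs=200,
                        lam1=0.1, lam2=0.1, tau_max=1.0, tau_min=0.1):
    """Optimize only the projector; the pre-trained GNN stays frozen throughout."""
    opt = torch.optim.Adam(projector.parameters(), lr=1e-2)
    H = gnn(A, X).detach()  # node representations from the frozen f_theta*
    # For brevity we score all node pairs; GraphTOP restricts candidates to
    # target-incident pairs inside each rho-hop subgraph.
    idx_i, idx_j = torch.triu_indices(A.size(0), A.size(0), offset=1)
    for epoch in range(epochs):
        # linear temperature annealing: soft relaxation -> near-deterministic
        tau = tau_max - (tau_max - tau_min) * epoch / max(epochs - 1, 1)
        p = projector(H[idx_i], H[idx_j])   # edge-existence probabilities
        s = gumbel_edge_sample(p, tau)      # relaxed binary edge mask
        S = torch.zeros_like(A)
        S[idx_i, idx_j] = s                 # assemble the symmetric
        S[idx_j, idx_i] = s                 # prompted adjacency matrix
        loss = (F.cross_entropy(gnn(S, X)[train_mask], y[train_mask])
                + lam1 * entropy_reg(p) + lam2 * sparsity_reg(p))
        opt.zero_grad()
        loss.backward()
        opt.step()
    return projector
```

With the toy `DenseGCN` above, `projector = EdgeProbProjector(dim=7)` matches the representation size fed to it; in practice one would feed the GNN's hidden-layer representations rather than its logits.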
## Key Experimental Results
### Main Results: 5-shot Node Classification
| Pre-training | Method | Cora | PubMed | Amazon | Minesweeper | Flickr |
|---|---|---|---|---|---|---|
| GraphCL | Linear Probe | 55.69 | 67.30 | 23.19 | 67.59 | 29.31 |
| GraphCL | ProNoG (SOTA) | 60.01 | 68.17 | 23.26 | 65.48 | 26.17 |
| GraphCL | GraphTOP | 65.12 | 69.42 | 27.15 | 70.23 | 31.84 |
### Ablation Study
| Configuration | Key Findings |
|---|---|
| w/o entropy regularization \(\mathcal{L}_E\) | Topological instability at inference; performance degrades. |
| w/o sparsity regularization \(\mathcal{L}_S\) | Graph becomes overly dense; performance degrades. |
| Feature prompt vs. topology prompt | Topology prompting and feature prompting are complementary. |
### Key Findings
- GraphTOP consistently outperforms all baselines across four pre-training strategies (GraphCL, SimGRACE, LP-GPPT, LP-GraphPrompt).
- Theorem 2 proves, under the CSBM model, that edge rewiring increases the inter-class representation distance: \(\text{Dist}' = \frac{p+q}{|p-q|} \text{Dist} > \text{Dist}\) (a worked instantiation follows this list).
- The additional computational overhead of topology prompting is small: \(O(D^2 d_l^2)\), versus \(O(KD^2 d_l^2)\) for the \(K\)-layer GNN forward pass itself, i.e., roughly the cost of one GNN layer.
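For intuition about the amplification factor, note that \(p + q \geq |p - q|\) holds for any intra-/inter-class edge probabilities \(p, q \geq 0\), so the factor never shrinks the distance. With illustrative values (our numbers, not the paper's):

\[
p = 0.8,\quad q = 0.2 \;\Rightarrow\; \text{Dist}' = \frac{0.8 + 0.2}{|0.8 - 0.2|}\,\text{Dist} = \frac{5}{3}\,\text{Dist} \approx 1.67\,\text{Dist}.
\]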
## Highlights & Insights
- First exploration of topology-oriented graph prompting: All prior work operates exclusively on features; this paper fills the gap in the topological dimension.
- Elegant combination of Gumbel-Softmax and entropy regularization: Gumbel-Softmax enables differentiable edge selection, while entropy regularization ensures that edge probabilities converge toward deterministic values, yielding stable topology at inference.
- Theoretical guarantees: Based on the CSBM model, the paper formally proves that topology prompting increases inter-class distance, providing rigorous theoretical support.
## Limitations & Future Work
- Node classification only: The current framework focuses on node-level tasks; topology prompt design for graph-level and edge-level tasks remains unexplored.
- 5-shot setting: Strong performance under extremely low annotation budgets is demonstrated, but effectiveness in fully supervised settings has not been validated.
- Information loss from subgraph constraints: The \(\rho=2\) restriction may miss important long-range connections.
- Potential improvements: Joint optimization of topology and feature prompts; exploration of adaptive \(\rho\).
## Related Work & Insights
- vs. GPF/All-in-one: These methods modify only node features, whereas GraphTOP modifies topological structure; the two approaches are orthogonal and complementary.
- vs. Graph Structure Learning (GSL): GSL also modifies graph structure but requires end-to-end GNN training; GraphTOP keeps the GNN frozen, making it more efficient.
## Rating
- Novelty: ⭐⭐⭐⭐⭐ First extension of graph prompting to the topological dimension.
- Experimental Thoroughness: ⭐⭐⭐⭐ Five datasets × four pre-training strategies, though limited to node classification.
- Writing Quality: ⭐⭐⭐⭐ Theoretical analysis is clear; notation is dense but well-organized.
- Value: ⭐⭐⭐⭐ Opens a new direction for graph prompting with practical application potential.