🕸️ Graph Learning

🔬 ICLR2026 · 21 paper notes

A Geometric Perspective on the Difficulties of Learning GNN-based SAT Solvers

This paper proves, from the geometric perspective of graph Ricci curvature, that the bipartite graph representation of random k-SAT instances exhibits inherent negative curvature that decreases as problem difficulty increases. It establishes a theoretical connection between GNN oversquashing and SAT solving difficulty, and validates the theory through test-time graph rewiring.
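To make the curvature notion concrete: a minimal sketch using the simplified Forman–Ricci curvature, one common discrete analogue of Ricci curvature (the paper may use a different variant, e.g. Ollivier–Ricci), on a small clause–variable bipartite graph. Clause and variable names here are illustrative.

```python
from collections import defaultdict

def forman_curvature(edges):
    """Simplified Forman-Ricci curvature of each edge in an unweighted graph:
    F(u, v) = 4 - deg(u) - deg(v). More negative = more bottlenecked."""
    deg = defaultdict(int)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return {(u, v): 4 - deg[u] - deg[v] for u, v in edges}

# Bipartite representation of the 3-SAT formula (x1 v x2 v x3) & (~x1 v x2 v x3):
# each clause node connects to the variable nodes it mentions.
edges = [("c1", "x1"), ("c1", "x2"), ("c1", "x3"),
         ("c2", "x1"), ("c2", "x2"), ("c2", "x3")]
curv = forman_curvature(edges)
print(curv[("c1", "x1")])  # 4 - 3 - 2 = -1
```

Denser formulas (more clauses per variable) raise node degrees, driving edge curvature more negative, which is the qualitative trend the paper ties to oversquashing.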

Are We Measuring Oversmoothing in Graph Neural Networks Correctly?

This paper identifies that the widely adopted Dirichlet energy metric fails to correctly capture oversmoothing in GNNs under practical settings. It proposes the numerical/effective rank (Erank) of the feature representation matrix as an alternative measure. Empirically, Erank achieves an average correlation of 0.91 with accuracy (vs. 0.72 for Dirichlet energy), while on OGB-Arxiv, Dirichlet energy even exhibits an incorrect correlation direction. The paper further provides theoretical proofs that the numerical rank converges to 1 (rank collapse) for a broad family of GNN architectures, and redefines oversmoothing as rank collapse rather than feature vector alignment.
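Both metrics have standard definitions, so they can be sketched directly in numpy (this is an illustrative sketch, not the paper's code): Dirichlet energy sums squared feature differences over edges, while effective rank exponentiates the Shannon entropy of the normalized singular values of the feature matrix.

```python
import numpy as np

def dirichlet_energy(X, edges):
    """Dirichlet energy: sum over edges (i, j) of ||x_i - x_j||^2."""
    return sum(np.sum((X[i] - X[j]) ** 2) for i, j in edges)

def effective_rank(X, eps=1e-12):
    """Effective rank: exp of the entropy of the normalized singular values.
    Collapses to 1 exactly when the feature matrix collapses to rank 1."""
    s = np.linalg.svd(X, compute_uv=False)
    p = s / (s.sum() + eps)
    p = p[p > eps]
    return float(np.exp(-np.sum(p * np.log(p))))

# Toy check: pulling all node features toward their mean (oversmoothing)
# drives the effective rank toward 1.
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 4))
X_smooth = 0.9 * X.mean(axis=0) + 0.1 * X
print(effective_rank(X), effective_rank(X_smooth))
```

A rank-1 feature matrix (all node vectors parallel) has effective rank exactly 1, matching the paper's redefinition of oversmoothing as rank collapse.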

Beyond Simple Graphs: Neural Multi-Objective Routing on Multigraphs

This paper presents GMS, the first neural combinatorial optimization routing method for multigraphs, comprising two variants: GMS-EB, which performs edge-level autoregressive construction directly on the multigraph, and GMS-DH, a dual-head approach that learns to prune the multigraph before performing node-level routing. GMS achieves near-LKH performance on asymmetric multi-objective TSP and CVRP while being tens of times faster.

Cooperative Sheaf Neural Networks

This paper proposes in/out-degree sheaf Laplacians defined on directed graphs for cellular sheaves, and constructs a Cooperative Sheaf Neural Network (CSNN) that enables nodes to independently select information propagation/reception strategies, thereby simultaneously mitigating oversquashing and handling heterophilic tasks.

Embodied Agents Meet Personalization: Investigating Challenges and Solutions Through the Lens of Memory Utilization

This paper systematically evaluates the memory utilization capabilities of LLM-driven embodied agents through the Memento framework. It finds that existing agents can recall simple object semantics but fail to process sequential information in user behavior patterns. A hierarchical knowledge graph-based user profile memory module is proposed to effectively improve performance on personalized assistance tasks.

Entropy-Guided Dynamic Tokens for Graph-LLM Alignment in Molecular Understanding

This paper proposes EDT-Former (Entropy-guided Dynamic Token Transformer), which establishes efficient alignment between a frozen graph encoder and a frozen LLM via an entropy-guided dynamic token generation mechanism. Without fine-tuning the LLM backbone, EDT-Former achieves state-of-the-art performance across multiple benchmarks including molecular question answering, molecular instruction following, and property prediction.

Explore-on-Graph: Incentivizing Autonomous Exploration of LLMs on Knowledge Graphs

This paper proposes Explore-on-Graph (EoG), which leverages SFT followed by two-phase reinforcement learning (outcome reward + path-refined reward) to incentivize LLMs to autonomously explore reasoning paths on knowledge graphs beyond the training distribution, surpassing GPT-5 and Gemini 2.5 Pro on five KGQA benchmarks.

GRAPHITE: Graph Homophily Booster — Reimagining the Role of Discrete Features in Heterophilic Graph Learning

This paper proposes GRAPHITE, a learning-free graph transformation method that directly boosts graph homophily by introducing "feature nodes" as hubs to indirectly connect nodes sharing common features. It is the first approach to address heterophilic graph learning by modifying graph structure rather than redesigning GNN architectures, achieving substantial improvements over 27 state-of-the-art methods on challenging benchmarks such as Actor.
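The hub construction can be sketched in a few lines (a plausible sketch of the described idea, not GRAPHITE's actual code): for each discrete feature value, add one hub node linked to every node carrying that feature, so nodes sharing a feature become 2-hop neighbors regardless of the original, possibly heterophilic, topology.

```python
def add_feature_hubs(num_nodes, edges, node_features):
    """Learning-free transform in the spirit of GRAPHITE (illustrative sketch):
    for each discrete feature, create a hub node and connect it to every
    node that carries that feature."""
    hub_id = {}
    new_edges = list(edges)
    next_id = num_nodes
    for node, feats in node_features.items():
        for f in sorted(feats):
            if f not in hub_id:
                hub_id[f] = next_id
                next_id += 1
            new_edges.append((node, hub_id[f]))
    return new_edges, hub_id

# Nodes 0 and 2 share feature "A" but are not directly connected;
# after the transform they meet through the "A" hub.
edges = [(0, 1), (1, 2)]
feats = {0: {"A"}, 1: {"B"}, 2: {"A"}}
new_edges, hubs = add_feature_hubs(3, edges, feats)
```

Because the transform touches only the graph structure, any off-the-shelf GNN can then be trained on the augmented graph unchanged.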

Graph Tokenization for Bridging Graphs and Transformers

This paper proposes GraphTokenizer, a framework that converts graphs into symbol sequences via invertible frequency-guided serialization, then applies BPE to learn a substructure vocabulary, enabling standard Transformers (e.g., BERT/GTE) to process graph data directly without any architectural modification, achieving state-of-the-art results on 14 benchmarks.

GraphUniverse: Synthetic Graph Generation for Evaluating Inductive Generalization

This paper proposes GraphUniverse, a framework that generates graph families with persistent semantic communities via a hierarchical architecture, enabling for the first time a systematic evaluation of inductive generalization in graph learning models. A key finding is that transductive performance cannot reliably predict inductive generalization ability.

Learning Concept Bottleneck Models from Mechanistic Explanations

This paper proposes Mechanistic CBM (M-CBM), which leverages Sparse Autoencoders to extract concepts from features learned by a black-box model, names and annotates them via a multimodal LLM, and constructs an interpretable Concept Bottleneck Model. Under controlled information leakage, M-CBM substantially outperforms existing CBM approaches.

LogicXGNN: Grounded Logical Rules for Explaining Graph Neural Networks

LogicXGNN proposes a post-hoc framework for extracting interpretable first-order logical rules from trained GNNs. The framework identifies predicates via graph structural hashing and hidden-layer embedding pattern recognition, determines discriminative DNF rule structures using decision trees, and grounds abstract predicates into the input space. The resulting rule-based classifier can serve as a substitute for the original GNN and also functions as a controllable graph generation model.

MolLangBench: A Comprehensive Benchmark for Language-Prompted Molecular Structure Recognition, Editing, and Generation

This paper introduces MolLangBench, a benchmark constructed via automated tools and expert annotation to provide high-quality, unambiguous evaluation datasets for the molecule-language interface. It covers three task types (recognition / editing / generation) and three modalities (SMILES / image / graph), evaluates 16+ commercial LLMs and 5 chemistry-specific models, and reveals that even GPT-5 falls significantly short on basic molecular operations (generation accuracy only 43%).

On the Expressive Power of GNNs for Boolean Satisfiability

This paper rigorously proves, from the perspective of the Weisfeiler-Leman (WL) test, that the complete WL hierarchy cannot distinguish satisfiable from unsatisfiable 3-SAT instances, revealing the theoretical expressiveness limits of GNNs for SAT solving. It also identifies positive instance families—such as planar SAT and random SAT—where GNNs can successfully distinguish satisfiability.

Pairwise is Not Enough: Hypergraph Neural Networks for Multi-Agent Pathfinding

This paper proposes HMAGAT, which replaces the pairwise message passing of GNNs with a directed hypergraph attention network to model group interactions in multi-agent pathfinding, surpassing an 85M-parameter SOTA model using only 1M parameters and 1% of the training data.
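The core difference from pairwise message passing can be illustrated with a generic (unattentive) hypergraph aggregation step, not HMAGAT itself: member-node features are pooled into each hyperedge, and the hyperedge message is sent back to all members, so a whole group of agents exchanges information in one step.

```python
import numpy as np

def hypergraph_step(X, hyperedges):
    """One generic hypergraph message-passing step (illustrative sketch):
    node -> hyperedge mean aggregation, then hyperedge -> node update."""
    out = X.copy()
    for members in hyperedges:
        msg = X[list(members)].mean(axis=0)   # node -> hyperedge
        for m in members:                      # hyperedge -> node
            out[m] = (out[m] + msg) / 2.0
    return out

# Three agents in one potential conflict group; agent 3 is unaffected.
X = np.array([[1.0], [3.0], [5.0], [10.0]])
X1 = hypergraph_step(X, [(0, 1, 2)])
```

A pairwise GNN would need multiple rounds (and quadratically many edges) to mix the same group-level signal that a single hyperedge carries directly.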

RAS: Retrieval-And-Structuring for Knowledge-Intensive LLM Generation

This paper proposes RAS, a framework that dynamically constructs query-specific knowledge graphs at inference time for each input question. Through three stages—iterative retrieval planning, text-to-triple conversion, and graph-augmented answering—RAS achieves structured reasoning and attains improvements of up to 7.0% and 8.7% over prior methods on 7 knowledge-intensive benchmarks for open-source and closed-source LLMs, respectively.

Relational Graph Transformer

This paper proposes RelGT, the first graph Transformer specifically designed for relational databases. Through multi-element tokenization (a 5-tuple of feature/type/hop distance/time/local structure encodings) and a local–global hybrid attention mechanism, RelGT consistently outperforms GNN baselines across all 21 tasks in the RelBench benchmark, with improvements of up to 18%.

Relatron: Automating Relational Machine Learning over Relational Databases

This work systematically compares relational deep learning (RDL/GNN) and deep feature synthesis (DFS) on predictive tasks over relational databases, finding that neither dominates uniformly and performance is highly task-dependent. The authors propose Relatron — a task-embedding-based meta-selector that leverages RDB task homophily and affinity embeddings for automatic architecture selection, achieving up to 18.5% improvement in joint architecture–hyperparameter search.

Revisiting Node Affinity Prediction in Temporal Graphs

This paper analyzes why simple heuristics (persistent forecasting, moving average) consistently outperform complex TGNNs on temporal graph node affinity prediction. It proves that these heuristics are special cases of linear SSMs and that standard RNNs/LSTMs/GRUs cannot express even the most basic persistent forecasting. Based on these findings, the paper proposes NAViS — a linear SSM architecture with a virtual global state and a ranking loss — which surpasses all baselines on TGB benchmarks.
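The "heuristics are linear SSMs" observation is easy to verify for a scalar state h_t = A·h_{t-1} + B·x_t with output y_t = C·h_t: persistent forecasting is (A, B, C) = (0, 1, 1), and an exponential moving average with rate a (one standard moving-average form; the paper's exact definition may differ) is (1 − a, a, 1). A minimal sketch:

```python
def linear_ssm(xs, A, B, C, h0=0.0):
    """Scalar linear state-space model: h_t = A*h_{t-1} + B*x_t, y_t = C*h_t."""
    h, ys = h0, []
    for x in xs:
        h = A * h + B * x
        ys.append(C * h)
    return ys

xs = [1.0, 2.0, 4.0, 8.0]

# Persistent forecasting (predict the last observation): A=0, B=1, C=1.
persist = linear_ssm(xs, A=0.0, B=1.0, C=1.0)

# Exponential moving average with rate a: A=1-a, B=a, C=1.
a = 0.5
ema = linear_ssm(xs, A=1 - a, B=a, C=1.0)
```

The paper's negative result is the converse direction: gated RNNs cannot exactly realize even the (0, 1, 1) case, which is why they underperform the heuristic baselines.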

Structurally Human, Semantically Biased: Detecting LLM-Generated References with Embeddings and GNNs

By constructing paired citation graphs (human vs. GPT-4o-generated vs. random baseline) for 10,000 papers, this work finds that LLM-generated reference lists are nearly indistinguishable from human ones in terms of graph topology (RF accuracy only 60%), yet are effectively detectable via semantic embeddings (RF 83%, GNN 93%). This indicates that LLMs accurately mimic citation topology while leaving detectable semantic fingerprints.

Towards Improved Sentence Representations using Token Graphs

This paper proposes Glot, a lightweight structure-aware pooling module that constructs a latent similarity graph from the token-level hidden states of a frozen LLM, refines them via a GNN, and aggregates them into a sentence representation. Glot matches fine-tuning-based methods on GLUE/MTEB while using 20× fewer parameters and training 100× faster.
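The described pipeline (similarity graph over token states, one GNN refinement step, pooling) can be sketched generically in numpy; this is an illustrative sketch under assumed choices (cosine similarity, a hard threshold `tau`, row-normalized propagation, mean pooling), not Glot's actual design.

```python
import numpy as np

def graph_pool(H, tau=0.5):
    """Sketch of a Glot-style pipeline: build a latent cosine-similarity graph
    over token hidden states, apply one normalized propagation step
    (GCN-like smoothing), then mean-pool into a sentence vector."""
    Hn = H / np.linalg.norm(H, axis=1, keepdims=True)
    A = Hn @ Hn.T                         # cosine similarities
    A = np.where(A > tau, A, 0.0)         # drop weak edges (diagonal survives)
    A = A / A.sum(axis=1, keepdims=True)  # row-normalize
    H_refined = A @ H                     # one message-passing step
    return H_refined.mean(axis=0)         # sentence representation

rng = np.random.default_rng(1)
tokens = rng.normal(size=(7, 16))         # 7 token states from a frozen LLM
sent = graph_pool(tokens)
```

Since every component is a fixed transform plus a small GNN, the module adds almost no parameters on top of the frozen backbone, consistent with the efficiency claims above.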