ICML 2026 Scientific Computing mesh super-resolution semi-supervised regression complementary learning message-passing inductive bias PDE simulation acceleration

Semi-Supervised Neural Super-Resolution for Mesh-Based Simulations

Conference: ICML 2026
arXiv: 2605.09284
Code: https://github.com/jykim-git/SuperMeshNet.git
Area: 3D Vision / Physics Simulation / Graph Neural Networks
Keywords: mesh super-resolution, semi-supervised regression, complementary learning, message-passing inductive bias, PDE simulation acceleration

TL;DR

SuperMeshNet trains two complementary MPNNs: the primary model maps an LR solution to its HR counterpart, while the auxiliary model takes a pair of LR solutions and predicts the difference between their HR solutions. On unpaired LR samples (those without HR labels), each model generates pseudo-labels for the other through mutual supervision. Combined with lightweight inductive biases at the node and message levels, this lets PDE mesh super-resolution surpass fully supervised baselines while using only 10% of the HR data, with consistently lower RMSE across six MPNN architectures.

Background & Motivation

Background: Mesh-based PDE simulations (e.g., FEM, FVM) trade solution accuracy against computational cost through mesh resolution: fine meshes are accurate but expensive. Neural super-resolution aims to predict HR solutions from cheaper LR simulations. Existing methods fall into two categories: CNN-based (which must first interpolate irregular meshes onto regular grids, an inefficient step) and MPNN-based (which operate directly on graphs but require extensive paired HR supervision).

Limitations of Prior Work: Acquiring HR data is inherently expensive because fine-mesh simulations are costly, which makes a "fully supervised" approach self-defeating. Unsupervised methods such as PhySRNet incorporate PDE residuals into the loss but are limited to regular grids because they rely on finite differences. MAgNet performs zero-shot interpolation but suffers from high prediction errors compared to supervised methods.

Key Challenge: HR data scarcity vs. MPNN training demands. Classical semi-supervised regression methods (e.g., Mean Teacher, UCVME, TNNR) assume both models predict the "same target," leading to highly correlated pseudo-labels that reinforce errors, rendering them ineffective for MPNN super-resolution.

Goal: (1) Introduce semi-supervision to mesh-based super-resolution, compatible with any MPNN; (2) Design a mechanism where two models predict "different but related targets," ensuring complementary pseudo-labels and decorrelated errors; (3) Systematically summarize useful MPNN inductive biases for super-resolution.

Key Insight: From a physics perspective, two HR solutions governed by the same PDE but with different parameters \(\mu\) describe the system's response to parameter perturbations. A model dedicated to learning such differences provides pseudo-labels orthogonal to direct HR predictions, breaking pseudo-label collapse.

Core Idea: The primary model \(F_\theta\) learns the inter-resolution map \(u_l \to u_h\), while the auxiliary model \(G_\phi\) learns the intra-resolution difference \((u_l^r, u_l^s) \to (u_h^r - u_h^s)\). These models mutually supervise each other on unpaired LR data.

Method

Overall Architecture

The dataset is split into paired LR–HR samples \(\mathcal{D}_a=\{(u_l^q, u_h^q)\}_{q=1}^{N_h}\) (small, \(N_h \ll N\)) and unpaired LR samples \(\mathcal{D}_b=\{u_l^q\}_{q=N_h+1}^{N}\). Each batch includes two paired LR samples \(\alpha, \beta\) and one unpaired LR sample \(\gamma\). The primary model \(F_\theta(u_l^q)=\hat{u}_h^q\) is used for inference, while the auxiliary model \(G_\phi(u_l^r, u_l^s)=\hat{u}_h^{rs}\) predicts HR differences during training. Supervised training uses true HR labels from \(\alpha, \beta\), while unsupervised training uses pseudo-labels generated by one model to train the other. Both models share an LR encoder for efficiency. The primary model is based on SRGNN, with dual-path fusion via kNN-upsampler and latent-space upsampler.
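
The batch composition described above can be sketched as follows. This is a minimal illustration, not the authors' data loader: the function name `sample_batch`, the use of string placeholders instead of mesh fields, and the choice to draw \(\alpha\) and \(\beta\) as two distinct uniformly sampled pairs are all assumptions made for the example.

```python
import random


def sample_batch(paired, unpaired, rng):
    """Draw two distinct paired samples (alpha, beta) and one unpaired LR sample (gamma).

    Hypothetical helper: uniform sampling without replacement is an assumption.
    """
    alpha, beta = rng.sample(paired, 2)
    gamma = rng.choice(unpaired)
    return alpha, beta, gamma


# Toy stand-ins for the two subsets: strings instead of mesh fields.
N_h, N = 20, 200
paired = [(f"u_l^{q}", f"u_h^{q}") for q in range(1, N_h + 1)]   # D_a: LR-HR pairs
unpaired = [f"u_l^{q}" for q in range(N_h + 1, N + 1)]           # D_b: LR only

rng = random.Random(0)
alpha, beta, gamma = sample_batch(paired, unpaired, rng)
```

With \(N_h = 20\) and \(N = 200\), the paired set holds 20 samples and the unpaired set the remaining 180, matching the 10% HR setting used in the experiments.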

Key Designs

Complementary Dual Models + Mutual Supervision:
- Function: Ensure the two models predict "different but physically related" targets, breaking pseudo-label collapse.
- Mechanism: Supervised losses are \(\mathcal{L}_{F,sup} = \ell(\hat{u}_h^\alpha, u_h^\alpha) + \ell(\hat{u}_h^\beta, u_h^\beta)\) and \(\mathcal{L}_{G,sup} = \ell(\hat{u}_h^{\alpha\beta}, u_h^\alpha - \text{kNN}(u_h^\beta;P_h^\beta\to P_h^\alpha))\). Unsupervised losses: \(\mathcal{L}_{F,unsup}\) uses \(\hat{u}_h^{\gamma\alpha} + u_h^\alpha\) as pseudo-labels for \(F_\theta(u_l^\gamma)\); \(\mathcal{L}_{G,unsup}\) uses \(\hat{u}_h^\gamma - u_h^\alpha\) as pseudo-labels for \(G_\phi(u_l^\gamma, u_l^\alpha)\).
- Design Motivation: Classical methods (e.g., Mean Teacher) use two isomorphic networks predicting the same target, leading to pseudo-label collapse. Here, the two models predict in different spaces (HR solution vs. HR difference), naturally decorrelating errors and providing prior knowledge of parameter sensitivity.
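
The four loss terms above can be sketched with NumPy as follows. This is a simplified illustration under stated assumptions: \(\ell\) is taken to be plain MSE, all samples share one mesh (so the kNN projection is omitted), and `F`, `G`, and `complementary_losses` are hypothetical stand-ins rather than the released code. In a real training loop the pseudo-labels would be treated as constants (no gradient through them).

```python
import numpy as np


def mse(pred, target):
    return float(np.mean((pred - target) ** 2))


def complementary_losses(F, G, lr_a, hr_a, lr_b, hr_b, lr_g):
    """Return (L_F, L_G) for paired samples alpha, beta and unpaired sample gamma."""
    pred_g = F(lr_g)          # primary prediction for the unpaired sample
    diff_ga = G(lr_g, lr_a)   # auxiliary prediction of u_h^gamma - u_h^alpha

    # Supervised terms use true HR labels.
    loss_F_sup = mse(F(lr_a), hr_a) + mse(F(lr_b), hr_b)
    loss_G_sup = mse(G(lr_a, lr_b), hr_a - hr_b)

    # Unsupervised terms: each model is trained on the other's pseudo-label.
    pseudo_for_F = diff_ga + hr_a   # G's difference plus the known HR of alpha
    pseudo_for_G = pred_g - hr_a    # F's prediction minus the known HR of alpha
    loss_F_unsup = mse(pred_g, pseudo_for_F)
    loss_G_unsup = mse(diff_ga, pseudo_for_G)

    return loss_F_sup + loss_F_unsup, loss_G_sup + loss_G_unsup


# Toy problem in which the true HR field is exactly twice the LR field.
def F_exact(u):
    return 2.0 * u


def G_exact(r, s):
    return 2.0 * (r - s)


def F_biased(u):
    return 2.0 * u + 0.1


lr_a = np.array([1.0, 2.0, 3.0])
lr_b = np.array([0.0, 1.0, 4.0])
lr_g = np.array([0.5, 1.5, 2.5])
hr_a, hr_b = 2.0 * lr_a, 2.0 * lr_b

loss_exact = complementary_losses(F_exact, G_exact, lr_a, hr_a, lr_b, hr_b, lr_g)
loss_biased = complementary_losses(F_biased, G_exact, lr_a, hr_a, lr_b, hr_b, lr_g)
```

In this toy setting both losses vanish when the two models are exact, and a constant bias in the primary model shows up in both unsupervised terms, since the two terms measure the same disagreement and differ only in which model receives the gradient.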

kNN Interpolation for Mesh Mismatch:
- Function: Define "HR differences" across different HR meshes.
- Mechanism: Different \(\mu\) values lead to different geometries, with \(P_h^r \ne P_h^s\). kNN interpolation projects one mesh onto the other's nodes \(\text{kNN}(u_h^s; P_h^s \to P_h^r)\) before subtraction. All unsupervised loss terms involving differences use direction-specific kNN projections.
- Design Motivation: Unlike the regular grids that CNNs operate on, simulation meshes are irregular, so the nodes of two HR meshes generally do not coincide. kNN interpolation is a lightweight, differentiable solution akin to PointNet, avoiding the need for additional alignment networks.
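
A minimal NumPy sketch of the projection-then-subtract step is shown below. The weighting scheme is an assumption: this example uses inverse-squared-distance weights over the \(k\) nearest source nodes, which is a common choice but is not confirmed by the text above. The names `knn_interpolate` and `hr_difference` are hypothetical.

```python
import numpy as np


def knn_interpolate(values_src, pos_src, pos_dst, k=3, eps=1e-12):
    """Interpolate node values from a source mesh onto destination node positions.

    values_src: (n_src, c) field values, pos_src: (n_src, d), pos_dst: (n_dst, d).
    Assumed weighting: inverse squared distance over the k nearest source nodes.
    """
    # Pairwise squared distances, shape (n_dst, n_src).
    d2 = ((pos_dst[:, None, :] - pos_src[None, :, :]) ** 2).sum(axis=-1)
    idx = np.argsort(d2, axis=1)[:, :k]            # k nearest source nodes
    near_d2 = np.take_along_axis(d2, idx, axis=1)  # their squared distances
    w = 1.0 / (near_d2 + eps)
    w /= w.sum(axis=1, keepdims=True)              # normalize weights per node
    return (w[..., None] * values_src[idx]).sum(axis=1)


def hr_difference(u_r, pos_r, u_s, pos_s, k=3):
    """u_h^r - kNN(u_h^s; P_h^s -> P_h^r), evaluated on the nodes of mesh r."""
    return u_r - knn_interpolate(u_s, pos_s, pos_r, k=k)


# Two toy "HR meshes" with different node sets.
pos_s = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
pos_r = np.array([[0.5, 0.5], [0.0, 0.0]])
u_s = np.array([[1.0], [2.0], [3.0], [4.0]])
u_r = np.array([[5.0], [5.0]])

projected = knn_interpolate(u_s, pos_s, pos_r)
difference = hr_difference(u_r, pos_r, u_s, pos_s)
```

The difference lives on the nodes of mesh \(r\), which is why the projection is direction-specific: swapping the roles of \(r\) and \(s\) requires projecting in the opposite direction.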

MPNN-Agnostic Inductive Bias: Node-Level / Message-Level Centering:
- Function: Training tricks that improve super-resolution performance across architectures.
- Mechanism: After each node-embedding update, apply \(x_i \leftarrow x_i - \frac{1}{n}\sum_{j=1}^{n} x_j\), where \(n\) is the number of nodes. For architectures with explicit message aggregation (e.g., MGN), also apply \(agg_i \leftarrow agg_i - \frac{1}{n}\sum_{j=1}^{n} agg_j\). This removes global means from intermediate representations.
- Design Motivation: Super-resolution relies on local relative structures rather than absolute means. Like batch normalization (BN), centering smooths the loss landscape, and it particularly suits tasks whose targets do not depend on global means. Ablation studies show consistent RMSE reductions across GCN/SAGE/GAT/GTR/GIN/MGN (e.g., MGN 0.0269→0.0226).
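
The two centering steps can be sketched inside a toy message-passing layer. Everything except the centering itself is an assumption for illustration: the sum aggregation, the `tanh` update, the weight matrices `w_self` and `w_msg`, and the per-feature mean over all nodes of one graph are not taken from the paper.

```python
import numpy as np


def center(h):
    """Subtract the per-feature mean over all nodes (rows)."""
    return h - h.mean(axis=0, keepdims=True)


def message_passing_layer(x, src, dst, w_self, w_msg,
                          center_msg=True, center_node=True):
    """One toy layer: sum-aggregate neighbor features, update, and center."""
    agg = np.zeros_like(x)
    np.add.at(agg, dst, x[src])   # sum messages arriving at each destination node
    if center_msg:
        agg = center(agg)         # message-level centering
    x_new = np.tanh(x @ w_self + agg @ w_msg)
    if center_node:
        x_new = center(x_new)     # node-level centering
    return x_new


rng = np.random.default_rng(0)
n_nodes, n_feat = 5, 4
x = rng.normal(size=(n_nodes, n_feat))
src = np.array([0, 1, 2, 3, 4])   # directed ring graph
dst = np.array([1, 2, 3, 4, 0])
w_self = rng.normal(size=(n_feat, n_feat))
w_msg = rng.normal(size=(n_feat, n_feat))

out = message_passing_layer(x, src, dst, w_self, w_msg)
```

After the layer, every feature column of `out` has zero mean over the nodes, so only relative differences between nodes are passed on to the next layer.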

Loss & Training

The total losses are \(\mathcal{L}_F = \mathcal{L}_{F,sup} + \mathcal{L}_{F,unsup}\) and \(\mathcal{L}_G = \mathcal{L}_{G,sup} + \mathcal{L}_{G,unsup}\), with equal weights and no scheduling. For multi-output tasks (e.g., velocity + pressure), a weighted MSE compensates for scale differences between output channels: the weights are 99:1 for time-dependent PDE datasets and \(10^{-8}:1\) for real geometry datasets. Training uses Adam (\(\text{lr}=10^{-3}\)) with PyTorch AMP on an i9-10920X CPU and an RTX A6000 GPU.
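
A channel-weighted MSE of the kind described above might look like the sketch below. Two details are assumptions: the weights are applied unnormalized, and the order of the channels (which output receives 99 and which receives 1) is arbitrary here because the summary does not specify it. The function name `weighted_mse` is hypothetical.

```python
import numpy as np


def weighted_mse(pred, target, weights):
    """Sum over output channels of weight_c * MSE_c.

    pred, target: (n_nodes, n_channels); weights: one scalar per channel.
    """
    per_channel = np.mean((pred - target) ** 2, axis=0)
    return float(np.dot(np.asarray(weights, dtype=float), per_channel))


# Toy two-channel field with per-channel squared errors of 1 and 4.
pred = np.zeros((2, 2))
target = np.array([[1.0, 2.0], [1.0, 2.0]])

loss_time_dependent = weighted_mse(pred, target, weights=(99.0, 1.0))
loss_real_geometry = weighted_mse(pred, target, weights=(1e-8, 1.0))
```

The very small \(10^{-8}\) weight illustrates how one channel with a much larger numeric range can be kept from dominating the loss.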

Key Experimental Results

Main Results

Dataset 1 (linear elasticity von Mises stress, FEM), RMSE↓ across six MPNNs:

Method	\(N_h\), \(N\)	GCN	SAGE	GAT	GTR	GIN	MGN
Fully Supervised (No Bias)	20, 20	0.0874	0.0876	0.0826	0.0758	0.0819	0.0655
Fully Supervised (No Bias)	200, 200	0.0575	0.0544	0.0512	0.0450	0.0381	0.0228
SuperMeshNet-O (No Bias)	20, 200	0.0613	0.0589	0.0544	0.0451	0.0404	0.0269
SuperMeshNet (With Bias)	20, 200	0.0431	0.0450	0.0457	0.0385	0.0277	0.0226
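
As a quick check on the table, the snippet below recomputes the relative RMSE change of both SuperMeshNet variants (20 HR samples) against the fully supervised model trained on 200 HR samples. The numbers are copied from the rows above; the variable names are illustrative only.

```python
archs = ["GCN", "SAGE", "GAT", "GTR", "GIN", "MGN"]
full_200 = [0.0575, 0.0544, 0.0512, 0.0450, 0.0381, 0.0228]   # Fully Supervised, N_h = 200
no_bias = [0.0613, 0.0589, 0.0544, 0.0451, 0.0404, 0.0269]    # SuperMeshNet-O, N_h = 20
with_bias = [0.0431, 0.0450, 0.0457, 0.0385, 0.0277, 0.0226]  # SuperMeshNet, N_h = 20

# Positive values mean a lower RMSE than the fully supervised reference.
reduction_with_bias = {a: (f - s) / f for a, f, s in zip(archs, full_200, with_bias)}
reduction_no_bias = {a: (f - s) / f for a, f, s in zip(archs, full_200, no_bias)}

hr_fraction = 20 / 200  # share of HR samples used by SuperMeshNet

for a in archs:
    print(f"{a}: with bias {reduction_with_bias[a]:+.1%}, no bias {reduction_no_bias[a]:+.1%}")
```

With the inductive biases, the reduction is positive for all six architectures, although the margin for MGN is small; without them, SuperMeshNet-O stays slightly above the 200-sample fully supervised RMSE in every column.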

Real geometry (motorbike + rider, incompressible Navier-Stokes), drag/lift coefficients (relative error):

Method	\(N_h\), \(N\)	Drag (rel. err)	Lift (rel. err)
Ground Truth HR	—	0.3724	0.0368
SuperMeshNet	40, 200	0.3778 (0.014)	0.0433 (0.177)
Fully Supervised	200, 200	0.3653 (0.019)	0.0380 (0.033)
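
The relative errors in parentheses can be approximately recomputed from the tabulated coefficients, taking relative error as \(|c_{pred} - c_{HR}| / |c_{HR}|\) (an assumed definition, since the summary does not spell it out). Because the coefficients themselves are rounded to four decimals, the recomputed values may differ from the reported ones in the third decimal place.

```python
def relative_error(pred, ref):
    """Absolute deviation from the reference, divided by the reference magnitude."""
    return abs(pred - ref) / abs(ref)


hr_drag, hr_lift = 0.3724, 0.0368  # ground-truth HR coefficients

errors = {
    "SuperMeshNet drag": relative_error(0.3778, hr_drag),
    "SuperMeshNet lift": relative_error(0.0433, hr_lift),
    "Fully Supervised drag": relative_error(0.3653, hr_drag),
    "Fully Supervised lift": relative_error(0.0380, hr_lift),
}

for name, err in errors.items():
    print(f"{name}: {err:.4f}")
```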

Ablation Study

Dataset 1, MGN, \(N_h=20, N=200\), inductive bias ablation:

Configuration	RMSE	Description
No Bias (O)	0.0269	Complementary learning only
+ Node Centering (N)	0.0237	Node centering alone captures most gains
+ Message Centering (M)	0.0247	Message centering alone slightly weaker than N
N + M	0.0226	Best performance with both

Semi-supervised regression baselines (Dataset 1, \(N_h=20, N=200\), MGN):

Method	RMSE	Training Time (s)
Mean-Teacher	0.0325	693.84
TNNR	0.0624	477.48
UCVME	0.0293	1122.62
SuperMeshNet-O	0.0269	503.2
SuperMeshNet	0.0226	421

Key Findings

SuperMeshNet outperforms fully supervised baselines with only 10% of the HR data (\(N_h=20\) vs. 200), cutting HR requirements by 90%. This matters because the cost of generating fine-mesh data grows steeply with resolution.
Among the semi-supervised methods compared, SuperMeshNet achieves the lowest RMSE and the shortest training time (421 s vs. 1122.62 s for UCVME). The speed is attributed to reuse of the shared encoder, whereas the other semi-supervised methods carry redundant computations.
On time-dependent PDE datasets, where the HR and LR meshes differ greatly (128× node ratio), fully supervised methods fail, whereas SuperMeshNet reproduces the HR solutions, which indicates that the auxiliary model supplies a useful learning signal.

Highlights & Insights

The paradigm of two models predicting different physical quantities, coupled through the HR solutions they share, elegantly combines co-training with PDE symmetry and may generalize to other parameterized solution families (e.g., climate, biomechanics, lattice simulations).
Node/message centering is a simple yet robust trick, improving six MPNN architectures uniformly with minimal code changes, highlighting its utility for "relative structure" tasks.
The focus on HR data savings (90%) rather than absolute RMSE reduction reflects a pragmatic approach to addressing bottlenecks in mesh-based super-resolution.

Limitations & Future Work

Complementary learning requires longer training times than fully supervised methods (though shorter than other semi-supervised methods). The approach is cost-effective only for sufficiently fine meshes where HR data generation costs dominate.
Theoretical guarantees for training stability are absent, with only empirical studies in the appendix. When the auxiliary model \(G_\phi\) has high errors, mutual amplification of errors remains possible, especially for highly nonlinear or bifurcating PDEs.
HR sample selection is critical (explored empirically in Appendix I.12) but currently relies on random sampling. Active learning strategies for HR sampling could further reduce \(N_h\).

vs. PhySRNet (Arora, 2022): Fully unsupervised but limited to regular grids using finite differences. This work requires minimal HR data but handles irregular meshes.
vs. MAgNet (Boussif et al., 2022): Zero-shot MPNN interpolation with significantly higher errors than supervised methods. This work achieves lower errors with minimal HR data.
vs. UCVME / Mean Teacher / TNNR: These semi-supervised methods use "same-target dual networks," leading to pseudo-label collapse. This work's "different-target dual networks" fundamentally decorrelate errors.

Rating

Novelty: ⭐⭐⭐⭐ Combines "dual-model different targets + physical differences" for mesh super-resolution, a true first, and MPNN-compatible.
Experimental Thoroughness: ⭐⭐⭐⭐⭐ Comprehensive coverage: six MPNNs × three FEM datasets + three CFD datasets + semi-supervised baselines + inductive bias ablation.
Writing Quality: ⭐⭐⭐⭐ Rigorous use of physics/mathematics, clear pipeline diagrams, and extensive appendices.
Value: ⭐⭐⭐⭐ 90% HR data savings directly address real-world bottlenecks in industrial CAE and climate simulations, with open-source code readily usable.