Integration Matters for Learning PDEs with Backward SDEs

Conference: NeurIPS 2025
arXiv: 2505.01078
Code: https://github.com/sungje-park/heunbsde
Area: Scientific Computing / PDE Solving
Keywords: BSDE, PDE solving, Stratonovich integration, Heun method, discretization bias

TL;DR

This paper identifies the root cause of why standard BSDE methods underperform PINNs: an irreducible discretization bias introduced by Euler-Maruyama integration. It proposes Heun-BSDE, based on the Stratonovich formulation, whose bias vanishes as the step size shrinks, achieving performance competitive with PINNs on high-dimensional PDEs.

Background & Motivation

Background: Two major deep learning approaches exist for solving high-dimensional PDEs: PINNs (directly minimizing PDE residuals) and BSDE methods (reformulating PDEs as forward-backward SDEs and simulating trajectories). BSDE methods have natural advantages for problems with underlying dynamics, such as stochastic optimal control.

Limitations of Prior Work: Empirically, BSDE methods perform significantly worse than PINNs, yet the underlying reason has remained unclear. Prior work [33] proposed a hybrid interpolation loss to narrow the gap, but introduced hyperparameters requiring tuning without explaining the root cause.

Key Challenge: When the standard BSDE method discretizes the one-step consistency loss using Euler-Maruyama (EM) integration, it introduces a bias term that is independent of the step size \(\tau\): \(\text{Bias}(\theta) = \frac{1}{2T}\int_0^T \mathbb{E}\big[\text{tr}\big((H \cdot \nabla^2 u_\theta)^2\big)\big]\,dt\), so the optimization objective deviates from the true solution even as \(\tau \to 0\).

Goal: (1) Identify the root cause of the BSDE vs. PINNs performance gap; (2) Propose a bias-free integration scheme.

Key Insight: Reinterpret the BSDE as a Stratonovich SDE (rather than an Itô SDE) and apply stochastic Heun integration, which converges to the Stratonovich solution, thereby eliminating the bias.

Core Idea: Replace Itô + EM integration with Stratonovich + Heun integration to eliminate the discretization bias in the BSDE loss.
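
The Itô-vs-Stratonovich distinction the paper exploits can be seen on a toy SDE. The following sketch (illustrative only, not the paper's code) integrates \(dX = \sigma X\, dW\) with both schemes: EM converges to the Itô solution (mean \(X_0\)), while stochastic Heun converges to the Stratonovich solution (mean \(X_0 e^{\sigma^2 T/2}\)).

```python
import numpy as np

# Toy example (not from the paper): dX = sigma * X dW.
# EM converges to the Ito solution,    E[X_T] = X_0.
# Heun converges to the Stratonovich solution, E[X_T] = X_0 * exp(sigma^2 T / 2).
rng = np.random.default_rng(0)
sigma, T, n_steps, n_paths = 1.0, 1.0, 100, 200_000
tau = T / n_steps

x_em = np.ones(n_paths)
x_heun = np.ones(n_paths)
for _ in range(n_steps):
    dw = rng.normal(0.0, np.sqrt(tau), n_paths)
    # Euler-Maruyama: one diffusion evaluation per step.
    x_em = x_em + sigma * x_em * dw
    # Stochastic Heun: predictor, then average the diffusion at both points.
    x_pred = x_heun + sigma * x_heun * dw
    x_heun = x_heun + 0.5 * sigma * (x_heun + x_pred) * dw

print(x_em.mean())    # ~1.0 (Ito mean)
print(x_heun.mean())  # ~exp(0.5) (Stratonovich mean)
```

The same drift-free dynamics thus have two different limits depending on the integrator; the paper's point is that only the Stratonovich/Heun pairing makes the BSDE consistency loss unbiased.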

Method

Overall Architecture

The forward-backward SDE system for PDE solving is reformulated in Stratonovich form and discretized using the stochastic Heun method (a two-step predictor-corrector scheme), yielding a one-step consistency loss whose bias vanishes with the step size.

Key Designs

  1. Bias Analysis (Theorem 4.2):

    • Function: Theoretically proves that the bias in the EM one-step loss is irreducible.
    • Mechanism: \(\tau^{-2} \cdot \ell_{\text{EM},\tau}(\theta,x,t) = (R[u_\theta])^2 + \frac{1}{2}\text{tr}((H \cdot \nabla^2 u_\theta)^2) + O(\tau^{1/2})\), where the second term is a \(\tau\)-independent bias that persists even as \(\tau \to 0\).
    • Design Motivation: Explains why reducing the step size fails to improve EM-BSDE.
  2. Stratonovich-Heun BSDE Loss (Theorem 4.4):

    • Function: Proposes an unbiased loss function.
    • Mechanism: \(L_{\text{Heun},\tau}(\theta) = \frac{1}{T}\int_0^T \mathbb{E}[(R[u_\theta])^2]dt + O(\tau^{1/2})\) — the bias term is only \(O(\tau^{1/2})\) and can be eliminated by reducing the step size.
    • Design Motivation: The Heun method is a second-order predictor-corrector scheme that naturally converges to the Stratonovich solution.
  3. Efficient Sub-sampling Implementation:

    • Function: Accelerates training.
    • Mechanism: A full forward SDE trajectory is rolled out (with stop gradient), followed by random sub-sampling of \(B\) time steps to compute the loss, rather than using all \(N\) steps.
    • Design Motivation: Reduces per-step computational cost while maintaining performance.
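
The sub-sampling scheme can be sketched as follows. This is a hedged illustration, not the paper's implementation: the drift, diffusion \(H\), the stand-in for \(u_\theta\), and the per-step loss are all toy placeholders chosen so the sketch is self-contained.

```python
import numpy as np

rng = np.random.default_rng(0)
d, N, B, tau = 10, 200, 16, 0.005  # dimension, total steps, sub-sampled steps

def drift(x):              # toy forward drift; the FBSDE defines its own
    return -x

H = np.eye(d)              # toy (constant) diffusion matrix

def u(x):                  # analytic stand-in for the network u_theta
    return 0.5 * np.sum(x ** 2)

def grad_u(x):
    return x

# 1) Roll out the full forward trajectory once with "stop gradient":
#    states are treated as fixed inputs to the loss (here, plain arrays).
xs = np.empty((N + 1, d))
dws = rng.normal(0.0, np.sqrt(tau), (N, d))
xs[0] = rng.normal(size=d)
for n in range(N):
    xs[n + 1] = xs[n] + drift(xs[n]) * tau + H @ dws[n]

def one_step_loss(x_n, x_next, dw):
    # Placeholder one-step consistency loss (EM-style, zero driver):
    # propagate Y forward and compare with u at the next state.
    y_pred = u(x_n) + grad_u(x_n) @ (H @ dw)
    return (u(x_next) - y_pred) ** 2

# 2) Sub-sample B of the N steps and average the loss only there.
idx = rng.choice(N, size=B, replace=False)
loss = np.mean([one_step_loss(xs[n], xs[n + 1], dws[n]) for n in idx])
```

In a real implementation the rollout is done without gradient tracking and only the \(B\) sampled per-step losses are backpropagated, which is where the speedup comes from.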

Loss & Training

  • Heun discretization requires one additional function evaluation (predictor step + corrector step) but permits larger step sizes.
  • The one-step loss (\(k=1\)) suffices; multi-step loss hyperparameter tuning is unnecessary.
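
The extra function evaluation can be made concrete with a toy single step, contrasting the EM update with the Heun predictor-corrector update of the backward variable. This is a hedged sketch under simplifying assumptions (zero driver \(f\), constant diffusion \(H\), analytic stand-ins for \(u_\theta\) and its gradient), not the paper's exact formulation.

```python
import numpy as np

rng = np.random.default_rng(0)
d, tau = 4, 0.01
H = np.eye(d)                                        # toy constant diffusion

def u(x, t):      return 0.5 * np.sum(x ** 2) + t    # stand-in for u_theta
def grad_u(x, t): return x                           # stand-in for grad u_theta
def mu(x):        return np.zeros_like(x)            # toy forward drift

x, t = rng.normal(size=d), 0.0
dw = rng.normal(0.0, np.sqrt(tau), d)

# Predictor: plain EM proposals for X and Y (one evaluation of the model).
x_tilde = x + mu(x) * tau + H @ dw
y_em    = u(x, t) + grad_u(x, t) @ (H @ dw)
loss_em = (u(x_tilde, t + tau) - y_em) ** 2          # EM one-step loss

# Corrector: average the (state-dependent) terms at (x, t) and
# (x_tilde, t + tau) -- the one extra evaluation Heun pays for.
x_next = x + 0.5 * (mu(x) + mu(x_tilde)) * tau + H @ dw
y_heun = u(x, t) + 0.5 * (grad_u(x, t) + grad_u(x_tilde, t + tau)) @ (H @ dw)
loss_heun = (u(x_next, t + tau) - y_heun) ** 2       # Heun one-step loss
```

Averaging the gradient term at both endpoints is where the Stratonovich correction enters; with \(k=1\) this single consistency check per sampled step is the entire training signal.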

Key Experimental Results

Main Results (one-step loss, relative \(L^2\) error)

| PDE | PINNs | FS-PINNs | EM-BSDE | Heun-BSDE |
| --- | --- | --- | --- | --- |
| 100D HJB | 0.1260 | 0.0737 | 0.3626 | 0.0493 |
| 100D BSB | 1.5066 | 0.0497 | 0.3735 | 0.0535 |
| 10D BZ | 3.8566 | 0.0351 | 0.1903 | 0.0228 |

Ablation Study (varying step count \(k\), 100D HJB)

| Steps \(k\) | EM-BSDE | Heun-BSDE |
| --- | --- | --- |
| \(k=1\) | 0.3626 | 0.0493 |
| \(k=5\) | 0.2117 | 0.0640 |
| \(k=50\) | 0.0858 | 0.0601 |

Key Findings

  • The EM-BSDE bias is indeed the root cause of the performance gap: EM-BSDE is substantially outperformed by Heun-BSDE across all experiments.
  • Heun-BSDE is competitive with PINNs: It surpasses PINNs on HJB and BZ, and is on par with FS-PINNs on BSB.
  • EM-BSDE requires multiple steps to partially mitigate bias, but multi-step formulations introduce optimization difficulties; Heun-BSDE achieves strong results with a single step.
  • Sub-sampling incurs negligible performance loss while substantially accelerating training.

Highlights & Insights

  • A neglected algorithmic detail determines overall method success: The choice of integration scheme (EM vs. Heun) has never been seriously studied in the BSDE literature, yet it is the root cause of the performance gap. This serves as a reminder that implementation details may matter more than methodological innovations.
  • Theory-driven algorithmic improvement: The problem is identified and the solution designed through rigorous bias analysis rather than empirical trial-and-error, with theory and experiments in close agreement.
  • The importance of Stratonovich vs. Itô in numerical implementation: Although the two formulations are equivalent in the continuous limit, the Stratonovich form is more amenable to numerical discretization.

Limitations & Future Work

  • The Heun method requires two function evaluations per step (vs. one for EM), doubling computational cost.
  • Experiments are conducted on only three standard PDE benchmarks and have not been validated on more complex real-world problems.
  • All methods perform poorly on the 100D BZ problem (relative \(L^2\) error > 1.7), indicating that high-dimensional coupled FBSDEs remain highly challenging.
  • Integration with adaptive step-size strategies is not discussed.

Comparison with Alternatives

  • vs. PINNs: PINNs directly minimize PDE residuals and do not suffer from integration bias, but require explicit knowledge of the PDE; BSDE methods can learn from simulation.
  • vs. the interpolation loss of [33]: [33] mitigates the bias by tuning the number of steps; Heun-BSDE eliminates the bias at its root, requiring no such tuning.

Rating

  • Novelty: ⭐⭐⭐⭐ Identifies and explains a previously overlooked yet important issue with an elegant solution.
  • Experimental Thoroughness: ⭐⭐⭐⭐ Theory and experiments are mutually corroborating, with thorough ablations.
  • Writing Quality: ⭐⭐⭐⭐⭐ Mathematically rigorous and clearly presented.
  • Value: ⭐⭐⭐⭐ Makes an important contribution to the BSDE-PDE community.