# Lagrangian neural ODEs: Measuring the existence of a Lagrangian with Helmholtz metrics

**Conference:** NeurIPS 2025 · **arXiv:** 2510.06367 · **Code:** GitHub · **Area:** Physics-Informed Learning / Neural ODE · **Keywords:** Neural ODE, Lagrangian mechanics, Helmholtz conditions, physics regularization, Euler-Lagrange equations
## TL;DR
This paper proposes Helmholtz metrics — differentiable metrics derived from the Helmholtz conditions — to quantify how closely a given ODE approximates the Euler-Lagrange equations. These metrics are incorporated as regularization terms into second-order Neural ODE training, forming Lagrangian Neural ODEs that guide the model toward true physical laws with zero additional inference overhead.
## Background & Motivation
Neural ODEs are powerful tools for learning dynamical systems from data, capable of learning ODEs of the form \(\dot{s} = h_\theta(t, s)\). However, not all ODEs carry physical meaning — the stationary action principle fundamental to physics requires that system trajectories satisfy the Euler-Lagrange equations. Standard Neural ODEs provide no mechanism to ensure that the learned ODE constitutes an Euler-Lagrange equation, potentially yielding unphysical solutions.
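Concretely, a second-order system is learned by stacking position and velocity into the state \(s = (x, \dot{x})\). A minimal sketch of integrating such a system, with a fixed toy acceleration standing in for a trained network:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Hypothetical stand-in for a trained acceleration network:
# here a fixed undamped oscillator, \ddot{x} = -x.
def f_theta(t, x, v):
    return -x

# The second-order ODE is integrated as the first-order system
# s = (x, v) with \dot{s} = h(t, s) = (v, f_theta(t, x, v)).
def h(t, s):
    x, v = s
    return [v, f_theta(t, x, v)]

sol = solve_ivp(h, (0.0, 2 * np.pi), y0=[1.0, 0.0], rtol=1e-8, atol=1e-8)
x_final = sol.y[0, -1]  # after one full period the state returns to x = 1
```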
The core problem has two aspects: (1) the identification problem: how to differentiably quantify how closely an ODE approximates the Euler-Lagrange equations; and (2) the learning problem: how to guide Neural ODEs during training to converge toward true Euler-Lagrange equations.
Existing approaches such as Lagrangian Neural Networks (LNNs) directly predict a Lagrangian and derive the ODE from it, but this requires computing the Euler-Lagrange equations in both forward and backward passes, incurring high computational cost and poor stability. This paper adopts an inverse approach: learn the ODE directly, then verify whether it satisfies the Lagrangian structure via the Helmholtz conditions.
## Method
### Overall Architecture
The model consists of three networks: \(f_{\theta_1}\) models the acceleration \(\ddot{x}\), \(g_{\theta_2}\) learns the Hessian of the Lagrangian with respect to the velocities, and \(\text{NN}_{\theta_3}\) predicts initial velocities from initial positions. Training jointly optimizes the regression loss \(\mathcal{L}_R\) and the Helmholtz metric regularization term \(\mathcal{L}_H\); at inference time, only \(f_{\theta_1}\) and \(\text{NN}_{\theta_3}\) are used.
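A structural sketch of this objective and the train/inference split, with trivial stand-ins for the three networks (the real models are small MLPs):

```python
import numpy as np

# Hypothetical stand-ins for the three networks described above.
f_theta1 = lambda t, x, v: -x                  # acceleration \ddot{x}
g_theta2 = lambda t, x, v: np.array([[1.0]])   # Lagrangian Hessian estimate
nn_theta3 = lambda x0: np.zeros_like(x0)       # initial-velocity predictor

def total_loss(x_pred, x_data, helmholtz_residuals):
    L_R = np.mean((x_pred - x_data) ** 2)    # regression loss
    L_H = np.mean(helmholtz_residuals ** 2)  # Helmholtz regularizer
    return L_R + L_H                         # L_tot = L_R + L_H

# Inference uses only f_theta1 and nn_theta3; g_theta2 is discarded.
x0 = np.array([1.0])
v0 = nn_theta3(x0)
loss = total_loss(np.array([1.0, 2.0]), np.array([1.0, 2.0]), np.zeros(3))
```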
### Key Designs
- **Differentiable Implementation of Helmholtz Metrics:**
    - Function: Turns the Helmholtz conditions into loss functions that can be optimized with neural networks.
    - Mechanism: Defines auxiliary quantities \(\Phi\), parameterizes the Hessian matrix \(g\) with a neural network \(g_{\theta_2}\), and minimizes the mean squared residuals of three Helmholtz conditions. The residuals are normalized by the minimum absolute eigenvalue \(\lambda_{\min}\) of \(g\), preventing the network from "cheating" by driving the eigenvalues toward zero.
    - Design Motivation: A differentiable, trainable metric is needed to quantify whether an ODE originates from a Lagrangian while avoiding degenerate solutions.
- **Multi-Objective Optimization Strategy:**
    - Function: Jointly optimizes the regression loss and the Helmholtz metric.
    - Mechanism: The total loss is \(\mathcal{L}_{\text{tot}} = \mathcal{L}_R + \mathcal{L}_H\). Clipping the regularizer gradient norm \(\|\nabla_{\theta_1} \mathcal{L}_H\|\) to \(c_1 \approx 0.05\) lets the data term dominate early in training, preventing convergence to an Euler-Lagrange equation inconsistent with the data.
    - Design Motivation: Overly strong regularization would drive the model toward physical laws that contradict the data.
- **Zero Additional Inference Overhead:**
    - Function: Helmholtz metrics are used only during training and are entirely absent at inference time.
    - Mechanism: \(g_{\theta_2}\) is evaluated and optimized only during training; at inference, only \(f_{\theta_1}\) is needed to evaluate the right-hand side of the ODE.
    - Design Motivation: This is a core advantage over LNNs, which must compute the Euler-Lagrange equations via automatic differentiation at inference time, incurring significant overhead.
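The paper's full matrix-valued conditions are beyond a short example, but the 1-D case illustrates the mechanism. For \(\ddot{x} = f(t, x, \dot{x})\), a scalar multiplier \(g\) must satisfy \(\dot{g} + g\,\partial f/\partial \dot{x} = 0\) along trajectories, and dividing by \(|g|\) plays the role of the \(\lambda_{\min}\) normalization. A minimal sketch (toy damped oscillator; \(g = e^{ct}\) solves the condition exactly, so the residual vanishes):

```python
import numpy as np

c = 0.3  # hypothetical damping coefficient

def f(t, x, v):
    return -x - c * v          # damped-oscillator acceleration

def g(t):
    return np.exp(c * t)       # time-dependent multiplier candidate

def helmholtz_residual(t, x, v, eps=1e-3):
    # Finite-difference stand-ins for the derivatives in the condition.
    dg_dt = (g(t + eps) - g(t - eps)) / (2 * eps)
    df_dv = (f(t, x, v + eps) - f(t, x, v - eps)) / (2 * eps)
    lam_min = abs(g(t))        # 1-D analogue of the minimum |eigenvalue|
    return (dg_dt + g(t) * df_dv) / lam_min

ts = np.linspace(0.0, 5.0, 50)
loss_H = float(np.mean([helmholtz_residual(t, 1.0, 0.5) ** 2 for t in ts]))
# loss_H is numerically ~0: this g satisfies the condition
```

A time-independent \(g\) would leave a residual of size \(|\partial f/\partial \dot{x}| = c\) here, which is why the damped oscillator fails the test in the time-independent setting.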
### Loss & Training
- Regression loss: \(\mathcal{L}_R = \text{MSE}(x_{\text{pred}}, x_{\text{data}})\)
- Helmholtz regularization: \(\mathcal{L}_H = \text{MSE}(\sum_i \mathcal{R}_i / \lambda_{\min})\)
- Training techniques: progressive time step inclusion (gradually increasing the number of time steps to avoid local minima); the output of \(g_{\theta_2}\) is processed with a \(\sinh\) transformation to handle exponential behavior.
- Network architecture: \(f_{\theta_1}\) (1 layer × 16), \(g_{\theta_2}\) (2 layers × 64), \(\text{NN}_{\theta_3}\) (3 layers × 16), Softplus activation; RAdam optimizer, batch size 128.
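Two of these techniques are easy to sketch in isolation (with hypothetical gradient values; only the threshold \(c_1 = 0.05\) comes from the paper):

```python
import numpy as np

C1 = 0.05  # clipping threshold c_1 for the regularizer gradient

def clip_norm(grad, max_norm=C1):
    """Rescale grad so its L2 norm is at most max_norm."""
    norm = np.linalg.norm(grad)
    return grad * (max_norm / norm) if norm > max_norm else grad

# The L_H gradient is clipped before being added to the L_R gradient,
# so the data term dominates early in training.
g_R = np.array([0.8, -0.6])              # hypothetical grad of L_R
g_H = np.array([3.0, 4.0])               # hypothetical grad of L_H (norm 5)
step = g_R + clip_norm(g_H)              # clipped L_H contributes norm 0.05

# sinh output transform: moderate raw outputs u of g_theta2 map to
# exponentially large Hessian entries, matching multipliers like e^{ct}.
u = np.array([-5.0, 0.0, 5.0])
g_vals = np.sinh(u)
```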
## Key Experimental Results
### Main Results
| System | Helmholtz Metric Behavior | Remarks |
|---|---|---|
| Undamped oscillator | \(\mathcal{L}_H\) decreases significantly | Lagrangian exists |
| Kepler problem | \(\mathcal{L}_H\) decreases significantly | Lagrangian exists |
| Damped oscillator (time-independent \(g\)) | \(\mathcal{L}_H\) fails to decrease | No time-independent Lagrangian exists |
| Damped oscillator (time-dependent \(g\)) | \(\mathcal{L}_H\) decreases significantly | Time-dependent Lagrangian exists |
| Non-Lagrangian ODE | \(\mathcal{L}_H\) improves only marginally | Correctly identified as having no Lagrangian |
### Ablation Study
Comparison of 40 pairs of regularized vs. unregularized models, using the MSE ratio \(R = \exp(l_{\text{reg}} - l_{\text{unreg}})\), where \(l_{\text{reg}}\) and \(l_{\text{unreg}}\) are the log-MSEs of the paired models; \(R < 1\) means the regularized model achieves lower error:
| Evaluation Dimension | MSE Ratio \(R\) | Significance |
|---|---|---|
| Position \(x\) (in-distribution) | < 1 | Significant (Welch's t-test) |
| Velocity \(\dot{x}\) | << 1 | Highly significant |
| Acceleration \(\ddot{x}\) | << 1 | Highly significant |
| Extrapolation (2× training time) | << 1 | Highly significant |
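For illustration, the ratio and significance test can be computed as follows (with synthetic log-MSEs, not the paper's numbers; scipy's `ttest_ind` with `equal_var=False` is Welch's t-test):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical log-MSEs for 40 paired regularized / unregularized runs.
l_unreg = rng.normal(loc=-2.0, scale=0.3, size=40)
l_reg = l_unreg - 1.0 + rng.normal(scale=0.1, size=40)

R = np.exp(l_reg - l_unreg)  # per-pair MSE ratio; R < 1 favors regularization
t_stat, p_value = stats.ttest_ind(l_reg, l_unreg, equal_var=False)  # Welch
significant = p_value < 0.05 and np.median(R) < 1.0
```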
### Key Findings
- Helmholtz metrics accurately distinguish between Lagrangian and non-Lagrangian systems.
- The learned \(g\) closely matches the analytical Lagrangian Hessian (Kepler problem: median error \(3.7 \times 10^{-4}\)).
- Regularization substantially improves learning accuracy for velocity and acceleration, with particularly notable gains in extrapolation performance.
## Highlights & Insights
- Elegant inverse approach: Rather than directly modeling the Lagrangian as in LNNs, this work learns the ODE and subsequently verifies the existence of a Lagrangian, avoiding the overhead of forward Euler-Lagrange computation.
- Physical diagnostic capability: Beyond improving learning, the framework can diagnose whether a system is physical — the Helmholtz metric fails to converge for damped systems under time-independent settings, correctly reflecting the non-fundamental nature of damping.
- Solid theoretical foundation: Grounded in Douglas's classical Helmholtz condition theory (1939/1941), the work bridges century-old mathematical tools with modern deep learning.
## Limitations & Future Work
- Validation is limited to low-dimensional (2D) toy systems; scalability to high-dimensional and more complex systems remains untested.
- No systematic quantitative comparison against LNNs or Hamiltonian Neural Networks is provided.
- Numerical stability may become problematic in higher dimensions (eigenvalue computation, robustness of gradient clipping).
- The expressive capacity of \(g_{\theta_2}\) may be insufficient when the system's Lagrangian takes a highly complex form.
## Related Work & Insights
- Lagrangian Neural Networks (LNNs): A forward approach — predicts the Lagrangian to derive the ODE; the present work takes the inverse approach.
- Hamiltonian Neural Networks: An analogous idea formulated within the equivalent Hamiltonian framework.
- Physics-Informed Neural Networks (PINNs): A broader paradigm for learning with physical constraints.
- Implications for ML in the physical sciences: Helmholtz metrics can serve as a general diagnostic tool for learning physical systems.
## Rating
- Novelty: ⭐⭐⭐⭐ Innovatively applies classical Helmholtz conditions as regularization for Neural ODEs.
- Experimental Thoroughness: ⭐⭐⭐ Validation systems are relatively simple; comparisons with competing methods are absent.
- Writing Quality: ⭐⭐⭐⭐ Mathematical derivations are clear; physical intuition is richly conveyed.
- Value: ⭐⭐⭐⭐ Introduces a new regularization paradigm for Physics-Informed ML.