Multifidelity Simulation-based Inference for Computationally Expensive Simulators

Conference: ICLR 2026
OpenReview: https://openreview.net/forum?id=bj0dcKp9t6
Code: To be confirmed
Area: Probabilistic Methods / Simulation-based Inference
Keywords: Multi-fidelity, Neural Posterior Estimation, Transfer Learning, Active Learning, Computational Neuroscience

TL;DR

The authors propose MF-(TS)NPE: pre-training a neural density estimator using cheap low-fidelity simulations followed by fine-tuning with a small number of expensive high-fidelity simulations, reducing the required high-fidelity simulation budget for Bayesian inference by up to two orders of magnitude.

Background & Motivation

Background: Scientific modeling often relies on stochastic simulators (neuron models, climate models, turbulence, etc.) to understand mechanisms. Simulation-based inference (SBI) approximates parameter posteriors through forward simulations without an analytical likelihood. Amortized Neural Posterior Estimation (NPE) fits \(p(\theta|x)\) directly using neural density estimators, while its truncated sequential variant (TSNPE) further improves stability and simulation efficiency for single observations.
Limitations of Prior Work: When the simulator is computationally expensive or the parameter space is high-dimensional, state-of-the-art NPE/TSNPE often requires a massive simulation budget to obtain reliable posteriors. For a high-fidelity neuron model, a single simulation may take several minutes, making 100,000 runs impractical.
Key Challenge: High-fidelity models are accurate but expensive, while low-fidelity models (simplified domain knowledge, dimensionality reduction, surrogate models) are cheap but inaccurate. The posteriors of the two typically differ, meaning low-fidelity results cannot directly replace high-fidelity inference.
Goal: To efficiently infer the parameter posterior of a high-fidelity model under extremely limited budgets and without analytical likelihoods, leveraging low-fidelity simulations.
Core Idea: [Transfer Learning + Active Learning] First, pre-train a neural density estimator on a large volume of low-fidelity simulations to learn general features, then fine-tune it using sparse high-fidelity simulations. In sequential scenarios, an acquisition function targeting predictive uncertainty is used to adaptively select high-fidelity parameter points.

Method

Overall Architecture

MF-(TS)NPE injects multi-fidelity logic into neural posterior estimation. It densely samples from the prior to run a low-fidelity simulator and train a density estimator. This estimator’s weights are then used as initialization for fine-tuning with sparse high-fidelity simulations under the same prior. Both amortized (MF-NPE) and non-amortized sequential (MF-TSNPE) paradigms are supported; the sequential variant can incorporate an acquisition function (MF-TSNPE-AF) for active learning.

flowchart LR
    A["Dense Sampling from Prior p(θ)"] --> B["Low-fidelity Simulator pL: Cheap"]
    B --> C["Train Density Estimator q_ψ(θ|xL) with NLL Loss"]
    C -->|Weight ψ Initialization| D["Fine-tune q_φ(θ|x) with Sparse High-fidelity Simulations"]
    D --> E["High-fidelity Posterior p(θ|xo)"]
    F["Acquisition Function: Maximize V_φ|D"] -.MF-TSNPE-AF.-> D

Key Designs

1. Transfer Learning with Low-fidelity Pre-training: MF-NPE adopts fine-tuning-based transfer. It first trains a low-fidelity estimator by minimizing the negative log-likelihood \(L(\psi)=\mathbb{E}[-\log q_\psi(\theta|x_L)]\) on \(N\) low-fidelity pairs \((\theta, x_L)\). The parameters \(\psi\) then initialize the high-fidelity estimator \(q_\phi\), which is further trained by minimizing the analogous loss \(L(\phi)=\mathbb{E}[-\log q_\phi(\theta|x)]\) on \(M \ll N\) high-fidelity pairs \((\theta, x)\). The intuition is that the features learned by the low- and high-fidelity estimators overlap substantially: once the network has learned task-relevant features, the sample complexity of a related task drops. Neural Spline Flows (NSF) serve as the density estimator, and training follows the validation-set early-stopping criterion of the SBI package to prevent overfitting. The design also accommodates mismatched parameter counts: parameters absent from the low-fidelity model are treated as dummy variables during pre-training, so the network effectively learns their prior. It also extends to stacking more than two fidelity levels.
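
The pre-train-then-fine-tune recipe can be sketched on a toy problem. Everything in the block is an illustrative assumption rather than the authors' code: the two linear simulators, the linear-Gaussian estimator \(q(\theta|x)=\mathcal{N}(wx+b, e^{2s})\) standing in for the NSF, plain gradient descent in place of the SBI package's training loop, and all names (`simulate`, `train`, `mean_nll`).

```python
import math
import random

def simulate(theta, slope, offset, rng):
    """Toy simulator (hypothetical): x = slope * theta + offset + Gaussian noise."""
    return slope * theta + offset + 0.5 * rng.gauss(0.0, 1.0)

def make_pairs(n, slope, offset, rng):
    """Draw theta from the prior Normal(0, 1) and simulate one x per theta."""
    pairs = []
    for _ in range(n):
        theta = rng.gauss(0.0, 1.0)
        pairs.append((theta, simulate(theta, slope, offset, rng)))
    return pairs

def mean_nll(params, pairs):
    """Average -log q(theta | x), additive constant dropped."""
    w, b, s = params
    total = 0.0
    for theta, x in pairs:
        z = (theta - (w * x + b)) / math.exp(s)
        total += 0.5 * z * z + s
    return total / len(pairs)

def train(params, pairs, steps, lr=0.01):
    """Full-batch gradient descent on the NLL, starting from `params`."""
    w, b, s = params
    n = len(pairs)
    for _ in range(steps):
        sigma2 = math.exp(2.0 * s)
        gw = gb = gs = 0.0
        for theta, x in pairs:
            r = theta - (w * x + b)      # residual of the predicted mean
            gw += -r * x / sigma2
            gb += -r / sigma2
            gs += 1.0 - r * r / sigma2
        w, b, s = w - lr * gw / n, b - lr * gb / n, s - lr * gs / n
    return (w, b, s)

rng = random.Random(0)
low_pairs = make_pairs(1000, 1.0, 0.0, rng)   # N cheap low-fidelity pairs
high_pairs = make_pairs(30, 1.5, 0.5, rng)    # M << N expensive high-fidelity pairs
high_test = make_pairs(500, 1.5, 0.5, rng)    # held-out pairs, used for evaluation only

psi = train((0.0, 0.0, 0.0), low_pairs, steps=500)  # pre-train on low fidelity
phi = train(psi, high_pairs, steps=300)             # fine-tune, initialized at psi

nll_pretrained = mean_nll(psi, high_test)
nll_finetuned = mean_nll(phi, high_test)
print(f"held-out high-fidelity NLL: pre-trained {nll_pretrained:.3f}, fine-tuned {nll_finetuned:.3f}")
```

The only multi-fidelity ingredient is the second `train` call, which starts from `psi` instead of a fresh initialization; the loss and the optimizer are unchanged between the two stages.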

2. Sequential Multi-fidelity MF-TSNPE: For non-amortized inference on a fixed observation \(x_o\), MF-NPE serves as the first round of TSNPE. The high-fidelity estimator is initialized from the low-fidelity network; each subsequent round then samples high-fidelity parameters from a truncated prior, i.e. the prior restricted to the support of the current posterior estimate. Truncated priors avoid the instability and posterior leakage that arise when flexible density estimators are trained sequentially, giving a simpler loss function and more stable training without loss of performance. Compared with standard TSNPE, the benefit is largest when the high-fidelity budget is small.
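
One common way to realize a truncated prior is rejection sampling against a high-probability region of the current posterior estimate. The sketch below assumes a 1D uniform prior, a Gaussian stand-in for the current estimate \(q_\phi(\theta|x_o)\), and a hypothetical mass threshold `eps`; the function names are illustrative and not taken from the paper or the SBI package.

```python
import math
import random

def log_q(theta, mean, std):
    """Log density of the current posterior estimate (Gaussian stand-in)."""
    z = (theta - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)

def hpr_threshold(mean, std, rng, eps=1e-3, n=10_000):
    """Log-density level below which roughly a fraction eps of posterior samples fall."""
    logps = sorted(log_q(rng.gauss(mean, std), mean, std) for _ in range(n))
    return logps[int(eps * n)]

def sample_truncated_prior(n, low, high, mean, std, threshold, rng):
    """Rejection-sample the uniform prior, keeping draws inside the high-probability region."""
    kept = []
    while len(kept) < n:
        theta = rng.uniform(low, high)
        if log_q(theta, mean, std) >= threshold:
            kept.append(theta)
    return kept

rng = random.Random(1)
mean, std = 0.4, 0.2                      # assumed current posterior estimate
threshold = hpr_threshold(mean, std, rng)
proposals = sample_truncated_prior(200, -2.0, 2.0, mean, std, threshold, rng)
print(f"prior support [-2, 2] truncated to [{min(proposals):.2f}, {max(proposals):.2f}]")
```

Because accepted draws still follow the prior inside the retained region, the next round can reuse the same NLL loss without importance weights, which is the "simpler loss function" referred to above.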

3. Acquisition Function for Epistemic Uncertainty: In addition to the proposal samples \(\theta^{(i)}_{\text{prop}}\) drawn in each round, MF-TSNPE-AF selects the top-\(B\) active samples \(\theta^{(i)}_{\text{active}}\) ranked by an acquisition function. The acquisition function targets the variance of the posterior estimate induced by epistemic uncertainty over the network parameters: \(\theta^*=\arg\max_\theta \mathbb{V}_{\phi|D}[q_\phi(\theta|x_o)]\), approximated by the sample variance across an ensemble of independently trained estimators. This epistemic uncertainty steers high-fidelity sampling within the simulator's domain rather than toward OOD samples, concentrating the budget on the most uncertain and informative parameter regions.
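
The ensemble-variance acquisition can be illustrated in a few lines. The five ensemble members are hypothetical Gaussian posteriors given as (mean, std) pairs, standing in for independently trained estimators evaluated at \(x_o\); the candidate grid and the batch size are likewise assumptions made for the example.

```python
import math

def gaussian_pdf(theta, mean, std):
    """Density q(theta | x_o) of one ensemble member (Gaussian stand-in)."""
    z = (theta - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def ensemble_variance(theta, ensemble):
    """Sample variance of q(theta | x_o) across the ensemble members."""
    values = [gaussian_pdf(theta, m, s) for m, s in ensemble]
    mu = sum(values) / len(values)
    return sum((v - mu) ** 2 for v in values) / (len(values) - 1)

def select_active(candidates, ensemble, batch_size):
    """Return the batch_size candidates on which the ensemble disagrees most."""
    ranked = sorted(candidates, key=lambda t: ensemble_variance(t, ensemble), reverse=True)
    return ranked[:batch_size]

# Hypothetical ensemble: (mean, std) of each member's posterior at x_o.
ensemble = [(0.0, 0.5), (0.1, 0.5), (-0.1, 0.5), (0.0, 0.6), (0.05, 0.45)]
candidates = [-3.0 + 0.1 * i for i in range(61)]   # grid on [-3, 3]
active = select_active(candidates, ensemble, batch_size=5)
print("selected parameters:", [round(t, 2) for t in active])
```

Far in the tails every member assigns almost zero density, so the variance there is negligible. The selected points therefore lie where the members place mass but disagree, which is consistent with the remark that the acquisition does not chase OOD samples.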

4. When Transfer is Effective — Mutual Information + Representation Consistency: The authors empirically characterize the benefits of pre-training based on two factors: Mutual Information (MI) between low- and high-fidelity simulators and Representation Consistency (similarity in how task-relevant information is encoded). By applying controlled perturbations to OU processes, they demonstrate that the lower bound of transfer error decreases as MI increases. When MI is very low, MF-NPE degrades to performance comparable to standard NPE with the same high-fidelity budget—providing an actionable criterion for selecting low-fidelity models.
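
As a rough illustration of how MI between fidelities responds to a controlled perturbation, the sketch below uses the closed form for jointly Gaussian scalars, \(I = -\tfrac{1}{2}\log(1-\rho^2)\). The scalar "summaries", the additive-noise perturbation, and the Gaussian assumption are all hypothetical simplifications; the paper's OU experiment and its MI estimator may differ.

```python
import math
import random

def correlation(xs, ys):
    """Pearson correlation of two equally long samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)

def gaussian_mi(xs, ys):
    """Mutual information in nats, assuming (x, y) are jointly Gaussian."""
    rho = correlation(xs, ys)
    return -0.5 * math.log(1.0 - rho * rho)

rng = random.Random(2)
high = [rng.gauss(0.0, 1.0) for _ in range(5000)]   # high-fidelity summary (toy)

mi_by_noise = {}
for noise in (0.1, 1.0, 10.0):
    # Low-fidelity summary = high-fidelity summary + perturbation of growing size.
    low = [h + noise * rng.gauss(0.0, 1.0) for h in high]
    mi_by_noise[noise] = gaussian_mi(high, low)

for noise, mi in mi_by_noise.items():
    print(f"perturbation scale {noise:>4}: MI ~ {mi:.3f} nats")
```

A diagnostic of this kind can be computed from paired low- and high-fidelity outputs before committing a large high-fidelity budget; a value near zero would suggest little to gain over single-fidelity NPE.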

Key Experimental Results

Main Results (6 Tasks)

4 benchmarks (SIR, SLCP, OU process, high-dimensional Gaussian Blob) + 2 expensive neuroscience tasks. Metrics are C2ST and MMD where a ground-truth posterior is available, and NLTP and NRMSE where it is not; results are averaged across 10 observations and 10 initializations.

Task Type	Results
4 Benchmarks	In low-budget regimes (50–10³ high-fi simulations), MF-NPE consistently outperforms NPE; MF-TSNPE(-AF) outperforms TSNPE. Benefits increase with more low-fi samples, though OU/SLCP reach saturation at 10⁴→10⁵ samples.
Comparison with MF-ABC	MF-NPE significantly outperforms multi-fidelity ABC-based methods.
Multi-compartment Neuron (L5PC, 8-comp high-fi vs 1-comp low-fi)	Total computational cost is 4.44±0.06 times lower than standard NPE for equivalent performance; posterior predictive matches empirical data better; TARP/SBC are well-calibrated.
Recurrent Spiking Network (4096E+1024I, 24-param high-fi / 12-param low-fi mean-field)	The ratio of posterior samples falling within the target firing rate range increased by nearly 30%; high-fi takes ~5 mins/run vs. near-instant mean-field.
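
For readers unfamiliar with the MMD metric used on the benchmarks, the sketch below computes a biased (V-statistic) estimate of squared MMD between reference posterior samples and estimated posterior samples. The Gaussian kernel, the bandwidth of 1.0, and the synthetic 1D samples are assumptions made for illustration; the paper's exact kernel and settings are not stated in this note.

```python
import math
import random

def rbf(a, b, bandwidth):
    """Gaussian (RBF) kernel between two scalars."""
    return math.exp(-((a - b) ** 2) / (2.0 * bandwidth ** 2))

def mmd_squared(xs, ys, bandwidth=1.0):
    """Biased estimate of squared MMD: mean k(x,x') + mean k(y,y') - 2 mean k(x,y)."""
    kxx = sum(rbf(a, b, bandwidth) for a in xs for b in xs) / len(xs) ** 2
    kyy = sum(rbf(a, b, bandwidth) for a in ys for b in ys) / len(ys) ** 2
    kxy = sum(rbf(a, b, bandwidth) for a in xs for b in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2.0 * kxy

rng = random.Random(3)
reference = [rng.gauss(0.0, 1.0) for _ in range(200)]   # samples from the true posterior
good = [rng.gauss(0.0, 1.0) for _ in range(200)]        # estimate matching the reference
bad = [rng.gauss(1.0, 1.0) for _ in range(200)]         # estimate with a shifted mean

mmd_good = mmd_squared(reference, good)
mmd_bad = mmd_squared(reference, bad)
print(f"MMD^2 (matching estimate): {mmd_good:.4f}")
print(f"MMD^2 (shifted estimate):  {mmd_bad:.4f}")
```

Lower is better: an estimate that matches the reference posterior gives a value close to zero, while a shifted one gives a clearly larger value.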

Ablation Study

Acquisition Function: MF-TSNPE-AF outperforms MF-TSNPE on OU processes but shows no significant gain for SLCP/SIR. The overhead of ensemble training makes it cost-effective only when simulation costs far exceed training costs.
Parameter Mismatch: Extra parameters in high-fidelity models increase inference complexity and decrease MF-NPE performance, yet it still outperforms NPE and MF-ABC. MF-NPE also outperforms NPE when the low-fidelity model has more parameters.
Transfer Validity: Empirical results support the MI + Representation Consistency hypothesis; MF-NPE ≈ NPE when MI is low.

Key Findings

On neuroscience tasks, MF-(TS)NPE reduces the required high-fidelity simulations by up to two orders of magnitude while maintaining comparable performance.
There is an upper bound on low-fidelity pre-training benefits: marginal gains diminish after a certain low-fidelity budget.

Highlights & Insights

Systematically ports the standard deep-learning practice of "transfer learning as fine-tuning" into SBI. The method is simple, adds almost no hyperparameters, and saves orders of magnitude in simulation costs.
Unifies amortized, non-amortized, and active learning paradigms, while naturally handling parameter space mismatches—a common occurrence in scientific scenarios (e.g., mean-field vs. full network).
Beyond just providing a method, it attempts to answer "what kind of low-fidelity model is worth using" through MI and representation consistency criteria, offering more guidance than simple benchmarking.

Limitations & Future Work

The effectiveness of transfer is currently characterized only empirically, without formal convergence-rate guarantees (existing transfer-learning theory often relies on linear networks, which does not fully cover MF-NPE).
The active learning variant MF-TSNPE-AF is slower to train due to ensemble overhead, and its gains are inconsistent; it is only worth using when simulations are significantly more expensive than training.
It assumes low- and high-fidelity simulators share the same observation domain and that low-fidelity parameters are a subset of high-fidelity ones, relying on expert-designed low-fidelity models. Automated construction of low-fidelity surrogates remains an open problem.
Gains decrease as the parameter space discrepancy grows, offering limited help in scenarios with a large "semantic gap" between fidelities.

Related Work & Inspiration

Multi-fidelity Inference: Multi-fidelity methods under the ABC framework (MF-ABC) scale poorly to high-dimensional spaces. Concurrent works on multi-fidelity SBI include response distillation, multi-level Monte Carlo, and transfer learning in cosmology.
Transfer Learning + Simulators: Transfer learning has been used to reduce budgets in CO2 prediction, surrogate modeling, and PINN inversions, but its application in SBI was previously underexplored.
Simulation-efficient SBI: Single-fidelity routes include active learning for parameter selection, signature features, ensemble models, and self-consistency objectives. MF-(TS)NPE differs by explicitly using expert-designed low-fidelity simulators combined with transfer and active learning.
Inspiration: The "pre-train, fine-tune" paradigm can be systematically migrated to Bayesian inference for any expensive forward model. The "MI + Representation Consistency" criteria can be generalized as a diagnostic tool for multi-fidelity/transfer feasibility.

Rating

Novelty: ⭐⭐⭐⭐ Systematically introduces transfer learning into neural posterior estimation and unifies amortized/sequential/active learning, complemented by the MI diagnostic analysis. Clear and fills a gap in expensive high-fidelity SBI.
Experimental Thoroughness: ⭐⭐⭐⭐ 4 benchmarks + 2 real neuroscience tasks, multi-metrics (C2ST/MMD/NLTP/NRMSE), calibration checks, and comprehensive ablations on parameter mismatch and acquisition functions.
Writing Quality: ⭐⭐⭐⭐ Smooth logic in motivation and methodology. Figure 1 is clear. The "When Pre-training is Effective" section adds real insight beyond the benchmark results.
Value: ⭐⭐⭐⭐ Directly useful for fields with expensive simulators like computational neuroscience and systems biology; the order-of-magnitude savings make previously infeasible inference possible.