Simulation-Based Inference for Neutrino Interaction Model Parameter Tuning

Conference: NeurIPS 2025 · arXiv: 2510.07454 · Code: GitHub (open source) · Area: Physics · Keywords: simulation-based inference, neutrino scattering, neural posterior estimation, GENIE, parameter tuning

TL;DR

This work presents the first application of simulation-based inference (SBI) to neutrino interaction model parameter tuning. Using neural posterior estimation (NPE), the method learns the posterior distribution of 4 physical parameters from 200K GENIE-simulated 58-bin histograms, and accurately recovers the ground-truth parameter values on mock data from the MicroBooNE Tune.

Background & Motivation

Background: Neutrino experiments require precise simulations of neutrino–nucleus interactions; however, the underlying theoretical understanding remains incomplete, and simulators rely on semi-empirical approximations. Experimental collaborations typically tune the physical parameters of simulators such as GENIE to reference data in order to obtain reliable predictions.

Limitations of Prior Work: (a) Conventional tuning methods employ simple likelihood fits, but MicroBooNE encountered pathological results in initial attempts and was forced to discard inter-bin correlations in T2K data; (b) direct MCMC is infeasible — a single GENIE simulation can take days to months; (c) next-generation experiments such as DUNE will face larger parameter spaces and more complex datasets.

Key Challenge: Accurate probabilistic inference with uncertainty quantification is required, yet physical simulators are expensive and the parameter space is high-dimensional.

Goal: To validate whether SBI combined with NPE can replace traditional likelihood fitting and achieve amortized inference at low training cost.

Key Insight: The MicroBooNE Tune — a 4-parameter tuning problem with known results — serves as the test scenario, with SBI correctness verified on mock data.

Core Idea: An embedding network compresses the 58-dimensional histogram into a 24-dimensional summary feature, which is then fed into a Masked Autoregressive Flow for NPE; a single training run supports unlimited fast inference.

Method

Overall Architecture

The GENIE simulator takes 4 physical parameters as input → NUISANCE generates a 58-bin histogram → an embedding network compresses it to 24 dimensions → NPE (MAF architecture) learns the inverse mapping from histogram to parameters → after training, inference is completed in seconds.
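The data-generating half of this pipeline is conceptually simple. The sketch below is illustrative only: run_genie_nuisance is a hypothetical stand-in for the expensive GENIE + NUISANCE chain (here returning random placeholder histograms so the snippet runs), and the parameter bounds anticipate those listed under Key Designs.

```python
import numpy as np

# Prior bounds: MaCCQE [GeV], NormCCMEC, XSecShape_CCMEC, RPA_CCQE.
LOW = np.array([0.961, 1.0, 0.0, 0.0])
HIGH = np.array([1.39, 3.0, 1.0, 1.0])

def run_genie_nuisance(theta, rng):
    """Stand-in for the expensive black box: configure GENIE with the 4
    parameters, generate events, and histogram them with NUISANCE into the
    58-bin T2K Analysis I format."""
    return rng.normal(size=58)  # placeholder so the sketch is runnable

rng = np.random.default_rng(0)
n_train = 200_000
thetas = rng.uniform(LOW, HIGH, size=(n_train, 4))  # uniform draws over the box
histograms = np.stack([run_genie_nuisance(t, rng) for t in thetas])  # (n_train, 58)
```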

Key Designs

  1. Data Generation:

    • Function: Construct a large-scale training set covering the parameter space.
    • Mechanism: Four parameters — MaCCQE \(\in [0.961, 1.39]\) GeV, NormCCMEC \(\in [1.0, 3.0]\), XSecShape_CCMEC \(\in [0.0, 1.0]\), RPA_CCQE \(\in [0.0, 1.0]\) — are uniformly sampled; GENIE and NUISANCE generate corresponding 58-bin histograms in T2K Analysis I format.
    • Scale: 200K training samples plus 1K held-out test samples, covering the parameter region around the MicroBooNE Tune.
  2. Embedding Network:

    • Function: Dimensionality reduction and informative summary extraction.
    • Mechanism: A 3-layer neural network compresses the 58-bin histogram to a 24-dimensional summary feature. The choice of 24 dimensions (rather than fewer) is motivated by the observation that excessively low dimensionality leads to overconfident posteriors; 24 dimensions are found to be a stable choice that maintains calibration.
    • Design Motivation: Direct NPE on the 58-dimensional raw input is less efficient; summary features capture the most informative statistics.
  3. Neural Posterior Estimation (NPE with MAF):

    • Function: Learn the posterior distribution \(p(\theta | x)\).
    • Mechanism: Masked Autoregressive Flow with 6 transformation layers and 55 hidden features per layer; the embedding network and the MAF are trained jointly end-to-end (a setup sketch follows this list).
    • Design Motivation: MAF can model complex multi-modal posteriors and inter-parameter correlations; joint training ensures the embedding is optimal for the inference task.
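A minimal setup sketch using the open-source sbi package. The MAF settings (6 transforms, 55 hidden features) and the 58 → 24 embedding follow the text above; the embedding's hidden width and the use of sbi itself are assumptions, since the paper's exact code may differ.

```python
import torch
from torch import nn
from sbi.inference import SNPE
from sbi.utils import BoxUniform, posterior_nn

# Uniform prior over the 4 parameters (bounds from the Data Generation step).
prior = BoxUniform(
    low=torch.tensor([0.961, 1.0, 0.0, 0.0]),
    high=torch.tensor([1.39, 3.0, 1.0, 1.0]),
)

# 3-layer embedding MLP: 58-bin histogram -> 24-dim summary.
# The hidden width (48) is an assumption; the paper fixes only depth and sizes.
embedding_net = nn.Sequential(
    nn.Linear(58, 48), nn.ReLU(),
    nn.Linear(48, 48), nn.ReLU(),
    nn.Linear(48, 24),
)

# MAF density estimator, trained jointly with the embedding network.
density_estimator = posterior_nn(
    model="maf",
    num_transforms=6,
    hidden_features=55,
    embedding_net=embedding_net,
)

inference = SNPE(prior=prior, density_estimator=density_estimator)
```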

Loss & Training

  • Objective: Negative log-likelihood.
  • Training configuration: batch size = 512, lr = 1e-2, 90/10 train/val split.
  • Early stopping: patience = 45 epochs; convergence at ~150 epochs on average.
  • Training time: ~10 minutes (CPU).
  • Inference time: seconds (amortized inference — train once, infer unlimited times).
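Continuing the sketch from Key Designs, training with the configuration above and then drawing amortized posterior samples might look as follows; x_obs is a placeholder for an observed histogram, and the keyword names are from recent versions of sbi.

```python
# theta: (200_000, 4) parameter draws; x: (200_000, 58) simulated histograms.
theta = torch.as_tensor(thetas, dtype=torch.float32)
x = torch.as_tensor(histograms, dtype=torch.float32)

inference.append_simulations(theta, x)
inference.train(
    training_batch_size=512,
    learning_rate=1e-2,
    validation_fraction=0.1,  # 90/10 train/validation split
    stop_after_epochs=45,     # early-stopping patience
)

# Amortized inference: train once, then condition on any observation in seconds.
posterior = inference.build_posterior()
x_obs = torch.zeros(58)  # placeholder for a real observed 58-bin histogram
samples = posterior.sample((10_000,), x=x_obs)  # (10_000, 4) posterior draws
```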

Key Experimental Results

Main Results — Posterior Coverage and Parameter Recovery

| Metric | Result |
| --- | --- |
| Residual center | Centered at 0 for all 4 parameters; no systematic bias |
| Residual width | Narrow distributions, low variance |
| \(\theta_1\) (MaCCQE) coverage | Within the 10% tolerance band |
| \(\theta_2\) (NormCCMEC) coverage | Within the 10% tolerance band |
| \(\theta_3\) (XSecShape_CCMEC) coverage | Within the 20% tolerance band (slightly overconfident) |
| \(\theta_4\) (RPA_CCQE) coverage | Within the 20% tolerance band |

MicroBooNE Tune Parameter Recovery

| Parameter | MicroBooNE Ground Truth | SBI Inferred Value (\(1\sigma\)) | Match |
| --- | --- | --- | --- |
| MaCCQE | MicroBooNE reported value | Near-perfect match | Excellent |
| NormCCMEC | MicroBooNE reported value | Near-perfect match | Excellent |
| XSecShape_CCMEC | MicroBooNE reported value | Near-perfect match | Excellent |
| RPA_CCQE | MicroBooNE reported value | Near-perfect match | Excellent |

Key Findings

  • All 4 posteriors are unbiased: Residuals over the 1,000 test samples are centered at 0, demonstrating no systematic estimation bias.
  • Weak inter-parameter correlations: Per-sample posteriors show near-independence among the 4 parameters, though mild correlations appear across the full test set.
  • \(\theta_3\) is slightly overconfident: Coverage tests reveal that the predicted credible intervals are too narrow for this parameter, while the remaining parameters are slightly underconfident (more conservative); a sketch of such a coverage test follows this list.
  • Key validation passed: Mock data from the MicroBooNE Tune can be accurately recovered, establishing the foundation for application to real experimental data.
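The calibration findings above come from coverage tests. Below is a minimal sketch of one common form of such a test, assuming posterior samples have already been drawn for each of the 1,000 test cases; the function and array layout are illustrative, not the paper's code.

```python
import numpy as np

def empirical_coverage(posterior_samples, truths, level=0.68):
    """Fraction of test cases whose ground truth falls inside the central
    `level` credible interval, computed per parameter.

    posterior_samples: (n_test, n_draws, n_params) posterior draws per test case
    truths:            (n_test, n_params) ground-truth parameters
    """
    lo_q, hi_q = (1.0 - level) / 2.0, 1.0 - (1.0 - level) / 2.0
    lo = np.quantile(posterior_samples, lo_q, axis=1)  # (n_test, n_params)
    hi = np.quantile(posterior_samples, hi_q, axis=1)
    inside = (truths >= lo) & (truths <= hi)
    return inside.mean(axis=0)  # empirical coverage per parameter

# Reading the result: coverage below `level` means overconfident intervals
# (too narrow, as found for theta_3); above `level` means conservative ones.
```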

Highlights & Insights

  • First application of SBI to neutrino interaction model tuning: Establishes a methodological precedent and paves the way for next-generation experiments such as DUNE.
  • Practical value of amortized inference: 10 minutes of training buys second-scale inference, a roughly \(10^6\)-fold efficiency gain over MCMC coupled to GENIE (months per run); see the rough estimate after this list.
  • No need to discard data correlations: The original MicroBooNE tuning discarded inter-bin correlations; SBI naturally avoids this workaround.
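As a rough back-of-envelope check on the quoted speedup (an illustrative estimate, not a figure from the paper): one month is about \(2.6 \times 10^6\) seconds, so replacing a months-long simulation campaign per fit with posterior samples delivered in seconds is indeed a gain of order \(10^6\).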

Limitations & Future Work

  • Only a 4-dimensional parameter space: Next-generation tuning may involve tens of parameters; scalability has not been validated.
  • Mock data validation only: The method has not yet been applied to real experimental data (T2K measurements); noise and systematic uncertainties in real data may introduce new challenges.
  • Overconfidence issue for \(\theta_3\): Improved calibration methods are needed, potentially via ensembling or better network architectures.
  • Incomplete uncertainty treatment: Full correlated uncertainty propagation on inputs and outputs is not addressed.
Comparison to Related Work

  • vs. the original MicroBooNE tuning: MicroBooNE used simple likelihood fitting while discarding bin correlations; SBI provides a full posterior distribution with uncertainty quantification.
  • vs. JUNO SBI (Gavrikov 2025): JUNO applied SBI to detector response tuning; this work is the first to target physical interaction model parameters.
  • vs. collider physics SBI: SBI is already mature in the collider domain (Higgs potential, CP violation); neutrino physics represents a new application area.

Rating

  • Novelty: ⭐⭐⭐⭐ Novel as a first application to this domain, though the SBI+NPE methodology itself is well established.
  • Experimental Thoroughness: ⭐⭐⭐ Coverage tests and MicroBooNE validation are adequate, but limited to mock data.
  • Writing Quality: ⭐⭐⭐⭐ Problem motivation is clear, with a well-balanced presentation of neutrino physics background and ML methodology.
  • Value: ⭐⭐⭐⭐ Directly useful to the neutrino experimental community; lays groundwork for DUNE and beyond.