⚛️ Physics

🧠 NeurIPS 2025 · 20 paper notes

AstroCo: Self-Supervised Conformer-Style Transformers for Light-Curve Embeddings

This paper proposes AstroCo, a self-supervised encoder that introduces the Conformer architecture (attention + depthwise separable convolution + gating) for irregular astronomical light curves. On the MACHO dataset, AstroCo reduces reconstruction error by 61–70% compared to Astromer v1/v2 and improves few-shot classification macro-F1 by approximately 7%.
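The convolution half of a Conformer block can be sketched in a few lines of numpy. The `depthwise_separable_conv` helper and its kernel shapes below are illustrative only, not the paper's implementation (which also adds attention and gating):

```python
import numpy as np

def depthwise_separable_conv(x, dw_kernels, pw_weights):
    """Depthwise separable convolution, the core of a Conformer conv module:
    each channel is convolved with its own 1-D kernel (depthwise), then the
    channels are mixed with a 1x1 projection (pointwise)."""
    # x: (time, channels); dw_kernels: (channels, k); pw_weights: (channels, channels_out)
    t, c = x.shape
    dw = np.stack(
        [np.convolve(x[:, i], dw_kernels[i], mode="same") for i in range(c)],
        axis=1,
    )
    return dw @ pw_weights  # pointwise 1x1 channel mixing

rng = np.random.default_rng(0)
x = rng.normal(size=(100, 4))                       # toy light-curve features
out = depthwise_separable_conv(x, rng.normal(size=(4, 3)), rng.normal(size=(4, 8)))
print(out.shape)  # (100, 8)
```

The factorization is what makes the conv module cheap: a depthwise pass costs O(t·c·k) instead of O(t·c²·k) for a full convolution.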

Dynamic Diffusion Schrödinger Bridge in Astrophysical Observational Inversions

This paper proposes Astro-DSB, a Diffusion Schrödinger Bridge-based framework for modeling astrophysical inverse problems. It directly learns a probabilistic mapping from observables to true physical distributions, requires only 25% of the training cost of conditional DDPM, demonstrates significant generalization advantages in out-of-distribution (OOD) evaluation, and is successfully applied to real observational data from Taurus B213.

Exoplanet Formation Inference Using Conditional Invertible Neural Networks

A conditional invertible neural network (cINN) trained on 15,777 synthetic planets infers planet formation parameters (disk mass, turbulent \(\alpha\), dust-to-gas ratio) from observables (planet mass, orbital distance), achieving probabilistic parameter retrieval ~10⁶× faster than physical simulations. Multi-planet system data is shown to yield more robust inference than single-planet data.
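Invertible networks such as cINNs are built from affine coupling layers, which are exactly invertible by construction. A minimal sketch, where the toy `s`/`t` functions stand in for the learned (conditional) subnetworks:

```python
import numpy as np

def coupling_forward(x, s, t):
    """Affine coupling: split x, transform the second half conditioned on the
    first. s and t are callables standing in for learned subnetworks."""
    d = x.shape[-1] // 2
    x1, x2 = x[..., :d], x[..., d:]
    y2 = x2 * np.exp(s(x1)) + t(x1)   # elementwise affine, trivially invertible
    return np.concatenate([x1, y2], axis=-1)

def coupling_inverse(y, s, t):
    d = y.shape[-1] // 2
    y1, y2 = y[..., :d], y[..., d:]
    x2 = (y2 - t(y1)) * np.exp(-s(y1))
    return np.concatenate([y1, x2], axis=-1)

# Toy "subnetworks": any functions of the untouched half keep the map invertible.
s = lambda h: np.tanh(h)
t = lambda h: 0.5 * h

x = np.random.default_rng(0).normal(size=(4, 6))
y = coupling_forward(x, s, t)
x_rec = coupling_inverse(y, s, t)
print(np.allclose(x, x_rec))  # exact inversion up to float error
```

Because the inverse is closed-form, sampling the posterior over formation parameters reduces to pushing latent samples backward through the stack of couplings.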

FAIR Universe HiggsML Uncertainty Dataset and Competition

This work provides a standardized dataset of 280 million simulated LHC collision events and a competition platform featuring six parameterized systematic biases (detector calibration + background composition) alongside an asymmetric coverage penalty metric. Participants are required to construct robust 68.27% confidence intervals for the Higgs signal strength \(\mu\). The winning solutions, based on profile-free surrogate modeling, achieve confidence intervals approximately 20% narrower than conventional binned methods.
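To make the evaluation idea concrete, here is a toy asymmetric interval score. The functional form is hypothetical (the competition's actual metric is more involved): narrow intervals are rewarded, but missing the 68.27% coverage target from below is penalized more heavily than over-covering:

```python
import numpy as np

def interval_score(lo, hi, mu_true, target=0.6827, lam_under=10.0, lam_over=1.0):
    """Toy asymmetric interval score (hypothetical form, not the FAIR Universe
    metric): mean width inflated by a coverage penalty, with under-coverage
    weighted more heavily than over-coverage."""
    cov = ((lo <= mu_true) & (mu_true <= hi)).mean()   # empirical coverage
    width = (hi - lo).mean()
    gap = cov - target
    penalty = lam_under * max(-gap, 0.0) + lam_over * max(gap, 0.0)
    return width * (1.0 + penalty)

mu_true = 1.0
over = interval_score(np.full(100, 0.8), np.full(100, 1.2), mu_true)  # wide, always covers
miss = interval_score(np.full(100, 1.2), np.full(100, 1.3), mu_true)  # narrow, never covers
print(over, miss)  # the narrow-but-wrong intervals score worse
```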

FEAT: Free Energy Estimators with Adaptive Transport

This paper proposes the FEAT framework, which employs stochastic interpolants to learn transport maps between two thermodynamic systems. Building on the escorted Jarzynski equality and the controlled Crooks theorem, FEAT provides consistent, minimum-variance free energy difference estimators along with variational upper and lower bounds, thereby unifying equilibrium and non-equilibrium approaches.
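The plain Jarzynski estimator that FEAT's escorted variant generalizes can be written directly; for Gaussian-distributed work the free-energy difference is known in closed form, which gives a quick sanity check:

```python
import numpy as np

def jarzynski_free_energy(work):
    """Free-energy difference (in units of kT) from nonequilibrium work samples
    via the Jarzynski equality  exp(-dF) = < exp(-W) >.
    Uses a log-sum-exp reduction for numerical stability."""
    work = np.asarray(work, dtype=float)
    n = work.size
    return -(np.logaddexp.reduce(-work) - np.log(n))

# Gaussian work W ~ N(mu, sigma^2) gives analytically dF = mu - sigma^2 / 2.
rng = np.random.default_rng(0)
mu, sigma = 2.0, 0.5
W = rng.normal(mu, sigma, size=200_000)
dF = jarzynski_free_energy(W)
print(dF)  # close to 2.0 - 0.125 = 1.875
```

The well-known weakness of this raw estimator — exponential variance growth with dissipation — is exactly what learned transport maps like FEAT's are meant to tame.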

From Simulations to Surveys: Domain Adaptation for Galaxy Observations

This work constructs a domain adaptation pipeline from simulated galaxies (TNG50) to real survey observations (SDSS) via feature-level alignment using Euclidean distance, optimal transport, and a top-\(k\) soft-matching loss with trainable weight scheduling, improving target-domain morphology classification accuracy from 46.8% (no adaptation) to 87.3%, and Macro F1 from 0.298 to 0.626.
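A toy version of a top-k soft-matching alignment loss conveys the idea (hypothetical form; the paper's trainable weight scheduling is omitted): each source feature is pulled toward a softmax-weighted barycenter of its k nearest target features:

```python
import numpy as np

def topk_soft_match_loss(src, tgt, k=3, tau=1.0):
    """Toy top-k soft-matching loss (illustrative, not the paper's exact
    objective): each source feature is matched to a softmax-weighted average
    of its k nearest target features in Euclidean distance."""
    d2 = ((src[:, None, :] - tgt[None, :, :]) ** 2).sum(-1)      # pairwise sq. distances
    idx = np.argsort(d2, axis=1)[:, :k]                          # k nearest targets
    dk = np.take_along_axis(d2, idx, axis=1)
    w = np.exp(-(dk - dk.min(axis=1, keepdims=True)) / tau)      # stabilized soft weights
    w /= w.sum(axis=1, keepdims=True)
    matched = (w[..., None] * tgt[idx]).sum(axis=1)              # weighted target barycenter
    return ((src - matched) ** 2).sum(-1).mean()

rng = np.random.default_rng(0)
tgt = rng.normal(size=(50, 8))                  # "survey" features
aligned = tgt[:20] + 0.01 * rng.normal(size=(20, 8))   # nearly aligned source
shifted = aligned + 5.0                                 # large domain shift
loss_aligned = topk_soft_match_loss(aligned, tgt)
loss_shifted = topk_soft_match_loss(shifted, tgt)
print(loss_aligned, loss_shifted)  # shift drives the loss up
```

Minimizing such a loss over the encoder's parameters drags the simulated-galaxy features onto the observed-feature manifold.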

Knowledge is Overrated: A Zero-Knowledge ML and Cryptographic Hashing-Based Framework for Verifiable, Low Latency Inference at the LHC

This paper proposes PHAZE, a framework that combines cryptographic hashing (Rabin fingerprinting) and zero-knowledge machine learning (zkML) to enable verifiable early-exit inference at LHC trigger latency, achieving a theoretical online latency of ~152–253 ns while providing built-in anomaly detection capability.
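A Rabin-style polynomial fingerprint is just a rolling polynomial evaluation modulo a large prime. The sketch below shows the determinism that makes verification possible; PHAZE's concrete irreducible-polynomial choice and hardware mapping are not reproduced here:

```python
def rabin_fingerprint(data, base=256, mod=(1 << 61) - 1):
    """Polynomial (Rabin-style) fingerprint of a byte string: interpret the
    bytes as polynomial coefficients evaluated at `base`, reduced modulo a
    large Mersenne prime. Equal inputs always agree; distinct inputs collide
    with probability ~len/mod. Illustrative choice of base and modulus."""
    h = 0
    for byte in data:
        h = (h * base + byte) % mod
    return h

a = rabin_fingerprint(b"trigger-event-001")
b = rabin_fingerprint(b"trigger-event-001")
c = rabin_fingerprint(b"trigger-event-002")
print(a == b, a == c)  # True False
```

The update is a single multiply-add per byte, which is why this family of hashes is attractive at nanosecond-scale trigger latencies.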

Latent Representation Learning in Heavy-Ion Collisions with MaskPoint Transformer

This work introduces a masked point cloud Transformer autoencoder to heavy-ion collision analysis. Through a two-stage paradigm of self-supervised pre-training followed by supervised fine-tuning, the model learns nonlinear latent representations substantially stronger than those of PointNet—reducing PC1 distribution overlap from 2.42% to 0.27%—providing a general feature learning framework for studying QGP properties.
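The overlap figure can be reproduced in spirit as a histogram-intersection overlap between two 1-D score distributions (e.g. PC1 projections of two event classes). The `histogram_overlap` convention below is illustrative and may differ in detail from the paper's definition:

```python
import numpy as np

def histogram_overlap(a, b, bins=100):
    """Overlap of two 1-D score distributions, in percent: the integral of the
    pointwise minimum of the two normalized histograms."""
    lo, hi = min(a.min(), b.min()), max(a.max(), b.max())
    ha, _ = np.histogram(a, bins=bins, range=(lo, hi), density=True)
    hb, _ = np.histogram(b, bins=bins, range=(lo, hi), density=True)
    w = (hi - lo) / bins
    return 100.0 * np.minimum(ha, hb).sum() * w

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 20_000)
y = rng.normal(6.0, 1.0, 20_000)    # well-separated classes -> small overlap
sep = histogram_overlap(x, y)
full = histogram_overlap(x, x)      # identical distributions -> 100%
print(sep, full)
```

A lower overlap along the leading latent direction means the two physics classes are more linearly separable in the learned representation.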

Multi-Modal Masked Autoencoders for Learning Image-Spectrum Associations for Galaxy Evolution and Cosmology

This paper applies a Multimodal Masked Autoencoder (MMAE) to jointly model galaxy images (HSC-PDR2, five bands) and spectra (DESI-DR1), constructing a cross-modal dataset GalaxiesML-Spectra of 134,533 galaxies. Under a 75% masking ratio, the model reconstructs major spectral emission lines and image morphology. When spectra are entirely absent at inference, the model achieves \(\sigma_{\text{NMAD}}=0.016\) for redshift prediction using images alone, outperforming AstroCLIP while extending the redshift range to \(z \sim 4\) for the first time.
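\(\sigma_{\text{NMAD}}\) is the standard photo-z scatter statistic; a short sketch, assuming the common \(1.4826 \times \text{MAD}\) convention on the \((1+z)\)-normalized residuals:

```python
import numpy as np

def sigma_nmad(z_pred, z_true):
    """Normalized median absolute deviation of photo-z residuals:
    dz = (z_pred - z_true) / (1 + z_true),
    sigma_NMAD = 1.4826 * median(|dz - median(dz)|)."""
    z_pred, z_true = np.asarray(z_pred), np.asarray(z_true)
    dz = (z_pred - z_true) / (1.0 + z_true)
    return 1.4826 * np.median(np.abs(dz - np.median(dz)))

# Synthetic check: Gaussian scatter of 0.016 * (1 + z) is recovered.
rng = np.random.default_rng(0)
z_true = rng.uniform(0.1, 4.0, size=10_000)
z_pred = z_true + 0.016 * (1 + z_true) * rng.standard_normal(10_000)
scatter = sigma_nmad(z_pred, z_true)
print(scatter)  # ~0.016 by construction
```

The 1.4826 factor makes the statistic equal to the Gaussian standard deviation on clean data while staying robust to catastrophic outliers.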

Neural Deprojection of Galaxy Stellar Mass Profiles

A neural network approach is proposed to map Nuker galaxy profile parameters to analytically deprojectable Multi-Gaussian Expansion (MGE) components, enabling stellar mass modeling of galaxies without optical imaging. The method is integrated into the differentiable dynamical modeling pipeline SuperMAGE for Bayesian inference of supermassive black hole (SMBH) masses.
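For the spherical, face-on case the MGE deprojection is a one-line formula, which is why the representation is so convenient: a 2D Gaussian surface density maps analytically to a 3D Gaussian density. A sketch assuming circular Gaussians (the inclined/flattened case needs the oblate formulae):

```python
import numpy as np

def mge_deproject(surf, sigma):
    """Spherical MGE deprojection: Sigma0 * exp(-R^2 / 2 sigma^2) on the sky
    corresponds to rho0 * exp(-r^2 / 2 sigma^2) in 3D,
    with rho0 = Sigma0 / (sqrt(2 pi) * sigma)."""
    surf, sigma = np.asarray(surf, float), np.asarray(sigma, float)
    return surf / (np.sqrt(2.0 * np.pi) * sigma)

def mge_density(r, rho0, sigma):
    """3D density of the deprojected MGE at radius r (sum over components)."""
    r = np.atleast_1d(r)[:, None]
    return (rho0 * np.exp(-r**2 / (2.0 * sigma**2))).sum(axis=1)

# Consistency check: the line-of-sight integral at R = 0 recovers sum(Sigma0).
sigma = np.array([0.5, 2.0])
surf = np.array([3.0, 1.0])
rho0 = mge_deproject(surf, sigma)
z = np.linspace(-50.0, 50.0, 200_001)
column = mge_density(z, rho0, sigma).sum() * (z[1] - z[0])
print(column, surf.sum())  # should match
```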

POLARIS: A High-contrast Polarimetric Imaging Benchmark Dataset for Exoplanetary Disk Representation Learning

This work introduces POLARIS, the first ML benchmark dataset for exoplanetary polarimetric imaging (921 VLT/SPHERE/IRDIS polarimetric images + 75,910 preprocessed exposures), and proposes the Diff-SimCLR framework (diffusion-augmented contrastive learning), achieving 93% accuracy on the reference-star vs. target-star classification task with fewer than 10% manual annotations.
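Diff-SimCLR builds on the standard SimCLR NT-Xent objective, which can be written compactly in numpy (the diffusion augmentation itself is not reproduced here):

```python
import numpy as np

def nt_xent(z1, z2, tau=0.5):
    """NT-Xent contrastive loss (the SimCLR objective). z1, z2: embeddings of
    two augmented views, shape (n, d); positives are the row-aligned pairs."""
    z = np.concatenate([z1, z2])
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine-similarity space
    sim = z @ z.T / tau
    n = len(z1)
    np.fill_diagonal(sim, -np.inf)                     # exclude self-similarity
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(n)])  # each row's positive
    logZ = np.log(np.exp(sim).sum(axis=1))
    return (logZ - sim[np.arange(2 * n), pos]).mean()

rng = np.random.default_rng(0)
a = rng.normal(size=(32, 16))
loss_pos = nt_xent(a, a + 0.05 * rng.normal(size=(32, 16)))  # views agree
loss_rand = nt_xent(a, rng.normal(size=(32, 16)))            # views unrelated
print(loss_pos, loss_rand)  # agreement drives the loss down
```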

Quantum Doubly Stochastic Transformers

This paper proposes QDSFormer (Quantum Doubly Stochastic Transformer), replacing softmax with a variational quantum circuit QontOT to generate doubly stochastic attention matrices. Both theoretical analysis and experiments demonstrate that quantum-circuit-generated DSMs are more diverse and better at preserving information, consistently outperforming standard ViT and Sinkformer on multiple small-scale visual recognition tasks.
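For contrast with the quantum construction, the classical (Sinkformer-style) route to a doubly stochastic attention matrix is Sinkhorn normalization of the attention logits:

```python
import numpy as np

def sinkhorn(logits, n_iter=50):
    """Classical Sinkhorn normalization: alternately normalize the rows and
    columns of exp(logits) so the matrix approaches doubly stochastic form.
    This is the baseline that QDSFormer replaces with a quantum circuit."""
    M = np.exp(logits - logits.max())      # positive matrix, overflow-safe
    for _ in range(n_iter):
        M /= M.sum(axis=1, keepdims=True)  # rows sum to 1
        M /= M.sum(axis=0, keepdims=True)  # columns sum to 1
    return M

rng = np.random.default_rng(0)
A = sinkhorn(rng.normal(size=(6, 6)))
print(A.sum(axis=1), A.sum(axis=0))  # both ~1 after convergence
```

Softmax alone only makes rows sum to one; forcing columns to sum to one as well prevents any token from monopolizing the attention mass.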

Simulation-Based Inference for Neutrino Interaction Model Parameter Tuning

This work presents the first application of simulation-based inference (SBI) to neutrino interaction model parameter tuning. Using neural posterior estimation (NPE), the method learns the posterior distribution of 4 physical parameters from 200K GENIE-simulated 58-bin histograms, and accurately recovers the ground-truth parameter values on mock data from the MicroBooNE Tune.

The Pareto Frontier of Resilient Jet Tagging

This work systematically evaluates the AUC–resilience trade-off across multiple architectures (DNN/PFN/EFN/ParT) for LHC jet tagging tasks, revealing that more complex models achieve higher AUC but exhibit stronger Monte Carlo model dependence. A Pareto frontier is constructed, and a case study demonstrates that low-resilience classifiers introduce bias in downstream parameter estimation even after calibration.
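Extracting the Pareto frontier from (AUC, resilience) pairs is a simple dominance sweep; the numbers below are hypothetical and higher is taken as better on both axes:

```python
def pareto_frontier(points):
    """Return the (auc, resilience) points not dominated by any other point
    (no other point is at least as good on both axes and strictly better on
    one). Sort by AUC descending, then sweep keeping resilience records."""
    pts = sorted(points, key=lambda p: (-p[0], -p[1]))
    frontier, best_res = [], float("-inf")
    for auc, res in pts:
        if res > best_res:          # strictly improves resilience
            frontier.append((auc, res))
            best_res = res
    return frontier

# Hypothetical model scores: (AUC, resilience)
models = [(0.95, 0.2), (0.90, 0.5), (0.85, 0.8), (0.88, 0.4), (0.80, 0.7)]
print(pareto_frontier(models))  # [(0.95, 0.2), (0.90, 0.5), (0.85, 0.8)]
```

The dominated points — e.g. (0.88, 0.4), beaten on both axes by (0.90, 0.5) — are exactly the taggers one should never deploy regardless of how the AUC-vs-resilience trade-off is weighted.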

The Platonic Universe: Do Foundation Models See the Same Sky?

This paper validates the Platonic Representation Hypothesis (PRH) in an astronomical setting. Using JWST, HSC, Legacy Survey, and DESI spectroscopic data, it measures representation alignment across six foundation models (ViT/ConvNeXt/DINOv2/IJEPA/AstroPT/Specformer) and finds that both intra-modal and cross-modal MKNN scores consistently increase with model scale (\(p = 3.31 \times 10^{-5}\)), supporting the hypothesis that models of different architectures and modalities converge toward a shared representation.
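An MKNN-style alignment score compares neighborhood structure across two representations of the same objects. The convention below (mean fraction of shared k-nearest neighbors) is a common one and may differ in detail from the paper's:

```python
import numpy as np

def mknn_alignment(X, Y, k=5):
    """Mutual k-NN alignment between two embeddings of the same n objects:
    the mean fraction of each object's k nearest neighbors (in X-space) that
    are also among its k nearest neighbors in Y-space. 1.0 = identical
    neighborhood structure."""
    def knn(Z):
        d = ((Z[:, None] - Z[None]) ** 2).sum(-1)
        np.fill_diagonal(d, np.inf)          # exclude self as a neighbor
        return np.argsort(d, axis=1)[:, :k]
    nx, ny = knn(X), knn(Y)
    return float(np.mean([len(set(a) & set(b)) / k for a, b in zip(nx, ny)]))

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 32))
ident = mknn_alignment(X, 2.0 * X + 1.0)                  # same geometry -> 1.0
rand_align = mknn_alignment(X, rng.normal(size=(100, 32)))  # unrelated -> near chance
print(ident, rand_align)
```

Because the score depends only on neighbor sets, it is invariant to rotations and rescalings of either embedding, which is what makes cross-architecture and cross-modal comparisons meaningful.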

TITAN: A Trajectory-Informed Technique for Adaptive Parameter Freezing in Large-Scale VQE

This paper proposes TITAN, a framework that employs deep learning models to predict "frozen parameters" in VQE—parameters that remain inactive throughout training—enabling 40–60% of parameters to be frozen at initialization, achieving up to 3× convergence speedup and 40–60% reduction in circuit evaluations, while matching or surpassing baseline accuracy on molecular systems of up to 30 qubits.
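TITAN trains a model to predict the frozen parameters; a much cruder trajectory-variance heuristic conveys the underlying idea (a hypothetical stand-in, not the paper's method):

```python
import numpy as np

def freeze_mask(param_trajectory, frac=0.5):
    """Freeze the parameters whose early-training movement is smallest.
    param_trajectory: (steps, n_params) parameter values from a short warm-up
    run. Hypothetical heuristic in the spirit of TITAN, which instead trains
    a predictor so the mask is available at initialization."""
    movement = param_trajectory.std(axis=0)   # per-parameter variability
    n_freeze = int(frac * movement.size)
    order = np.argsort(movement)              # least-moving parameters first
    mask = np.zeros(movement.size, dtype=bool)
    mask[order[:n_freeze]] = True             # True = frozen
    return mask

# Toy trajectory: parameters 0-4 never move, 5-9 follow a random walk.
rng = np.random.default_rng(0)
steps, n = 20, 10
traj = np.cumsum(rng.normal(0, 1, size=(steps, n)) * (np.arange(n) >= 5), axis=0)
mask = freeze_mask(traj, frac=0.5)
print(mask)  # the five static parameters are the ones frozen
```

Every frozen parameter removes its gradient-evaluation circuits from each VQE step, which is where the reported reduction in circuit evaluations comes from.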

Toward Complete Merger Identification at Cosmic Noon with Deep Learning

A ResNet18 is trained on simulated HST CANDELS images generated from IllustrisTNG50, demonstrating for the first time that deep learning can successfully identify galaxy mergers at high redshift \(1<z<1.5\), including minor mergers (\(\mu \geq 1/10\)) and low-mass galaxies (\(M_\star > 10^8 M_\odot\)), achieving an overall accuracy of ~73%. Model behavior is further analyzed through Grad-CAM and UMAP.

Transfer Learning Beyond the Standard Model

This work investigates whether neural networks pre-trained on the standard cosmological model (ΛCDM) can transfer to beyond-standard-model scenarios (massive neutrinos, modified gravity, primordial non-Gaussianity). The study finds that a dummy-node architecture can reduce simulation requirements by an order of magnitude, but negative transfer emerges when parameters exhibit strong physical degeneracies (e.g., \(\sigma_8\)–\(M_\nu\)).

Unsupervised Discovery of High-Redshift Galaxy Populations with Variational Autoencoders

A variational autoencoder (VAE) is applied to unsupervised clustering of 2,743 JWST high-redshift (\(z>4\)) galaxy spectra, uncovering 12 distinct astrophysical categories and more than doubling the known sample sizes of rare populations including post-starburst galaxies, Lyman-α emitters, extreme emission line galaxies, and Little Red Dots.

Vision Transformers for Cosmological Fields: Application to Weak Lensing Mass Maps

This work presents the first systematic application of Vision Transformers (ViT and Swin Transformer) to constraining cosmological parameters (\(\Omega_m\) and \(S_8\)) from weak lensing convergence maps, comparing attention-based architectures against CNNs within a simulation-based inference framework.