Encoding and Understanding Astrophysical Information in Large Language Model-Generated Summaries¶
Conference: NeurIPS 2025 arXiv: 2511.14685 Code: None Area: Physics / LLM Scientific Reasoning Keywords: LLM embeddings, astrophysics, sparse autoencoders, X-ray astronomy, physical encoding
TL;DR¶
This work investigates whether LLM embeddings encode physically meaningful quantities derived from X-ray astronomical observations—specifically hardness ratios, power-law indices, and variability indices. Results show that structured prompt design improves clustering purity of physical attributes by 5.9%–57.5%, and sparse autoencoders reveal that LLMs infer physical parameters not explicitly stated by recognizing object types.
Background & Motivation¶
Background: LLMs have demonstrated cross-domain generalization capabilities, yet it remains unclear whether their embeddings encode physical attributes directly measured from scientific observations—rather than merely textual descriptions. Astrophysics provides an ideal testbed: X-ray sources are characterized by precise physical measurements (from the Chandra Source Catalog) and rich literature descriptions.
Limitations of Prior Work: (a) The scientific reasoning capabilities of LLMs are difficult to quantitatively assess; (b) it is unclear how prompt design affects the encoding of physical information; (c) the semantic pathways through which physical concepts are represented in LLM internal representations remain opaque.
Key Challenge: Do LLMs genuinely "understand" physics, or have they merely learned statistical correlations? How can physical information be disentangled from embedding spaces?
Goal: (1) Does prompt design affect how LLMs encode physical quantities? (2) Which aspects of language are most important for encoding physical information?
Key Insight: GPT-4o-mini is used to generate textual summaries for 4,000 X-ray sources from NASA ADS papers; ada-002 is used to produce embeddings; the clustering quality of physical attributes in embedding space is then measured.
Core Idea: KNN purity is used to measure the encoding quality of physical attributes in embeddings, while sparse autoencoders are employed to trace semantic pathways—revealing that LLMs infer physical parameters through object-type recognition.
Method¶
Overall Architecture¶
A three-step pipeline: (1) structured source descriptions are generated from astronomical papers using GPT-4o-mini; (2) embeddings are encoded via ada-002; (3) physical attribute clustering is evaluated using KNN purity, and semantic features are analyzed using a sparse autoencoder (SAE).
Key Designs¶
- Prompt Engineering (Baseline vs. Optimized):
- Function: Compare the effect of different prompt designs on physical encoding.
- Mechanism: The baseline prompt simply requests a summary of physical attributes; the optimized prompt includes structured formatting instructions, rules for handling missing information, and explicit guidance to avoid uninformative text.
- Design Motivation: Different prompts direct the LLM's attention to different information, directly influencing the physical content encoded in the resulting embeddings.
- KNN Purity Evaluation:
- Function: Quantify the clustering quality of physical attributes in embedding space.
- Mechanism: For each source, \(K=10\) nearest neighbors are identified, and the proportion of neighbors sharing similar physical attributes is computed. Higher purity indicates better encoding of physical attributes.
- Design Motivation: Directly measures whether physical information is preserved in the embeddings.
- Sparse Autoencoder (SAE) Feature Analysis:
- Function: Reveal the semantic pathways through which physical information is encoded in LLM embeddings.
- Mechanism: A pretrained SAE extracts monosemantic features from ada-002 embeddings; the analysis identifies which tokens activate which features; Claude/Gemini is used to annotate the semantic meaning of the most strongly activated features.
- Design Motivation: SAE provides an interpretable feature decomposition that exposes the physical reasoning mechanisms of LLMs.
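As a concrete sketch, the KNN purity metric described above can be implemented as follows. This is a minimal illustrative version, assuming the continuous physical attributes (hardness ratio, power-law index, variability index) have first been discretized into class labels; the function name and toy data are hypothetical, not the authors' code.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def knn_purity(embeddings, labels, k=10):
    """Fraction of each point's k nearest neighbors that share its label,
    averaged over all points. Higher purity = better attribute clustering."""
    nn = NearestNeighbors(n_neighbors=k + 1).fit(embeddings)
    _, idx = nn.kneighbors(embeddings)      # idx[:, 0] is each point itself
    neighbor_labels = labels[idx[:, 1:]]    # drop self, keep the k neighbors
    return float((neighbor_labels == labels[:, None]).mean())

# Toy check: two well-separated clusters should give purity near 1.0
rng = np.random.default_rng(0)
emb = np.vstack([rng.normal(0, 0.1, (50, 8)), rng.normal(5, 0.1, (50, 8))])
lab = np.array([0] * 50 + [1] * 50)
print(knn_purity(emb, lab, k=10))  # → 1.0
```

With \(K=10\) as in the paper, a purity of 0.9994 for the variability index means that essentially all of each source's ten nearest neighbors in embedding space share its variability class.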
Loss & Training¶
The SAE used for feature analysis is pretrained with a standard reconstruction-plus-sparsity objective; the main experiments involve no additional model training, as the pipeline is purely evaluative.
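For reference, a minimal forward pass and loss of the kind of sparse autoencoder described here can be sketched as follows. The weights are random placeholders standing in for a pretrained SAE, and the feature count of 4096 is illustrative; 1536 is the actual dimensionality of ada-002 embeddings.

```python
import numpy as np

rng = np.random.default_rng(0)
d_embed, d_feat = 1536, 4096  # ada-002 dimension; feature count is illustrative

# Encoder/decoder weights (a trained SAE would learn these).
W_enc = rng.normal(0, 0.01, (d_embed, d_feat))
b_enc = np.zeros(d_feat)
W_dec = rng.normal(0, 0.01, (d_feat, d_embed))

def sae_forward(x):
    """ReLU encoder -> sparse feature activations -> linear reconstruction."""
    f = np.maximum(x @ W_enc + b_enc, 0.0)  # nonnegative feature activations
    x_hat = f @ W_dec
    return f, x_hat

def sae_loss(x, l1_coef=1e-3):
    """Reconstruction error plus an L1 penalty that encourages sparsity."""
    f, x_hat = sae_forward(x)
    recon = ((x - x_hat) ** 2).mean()
    sparsity = np.abs(f).mean()
    return recon + l1_coef * sparsity

x = rng.normal(0, 1, (4, d_embed))  # a batch of stand-in embeddings
f, _ = sae_forward(x)
print("active features per embedding:", (f > 0).sum(axis=1))
```

In the paper's analysis, the interpretable step happens after this decomposition: the most strongly activated features per summary are mapped back to the tokens that drive them and then labeled semantically by Claude/Gemini.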
Key Experimental Results¶
Main Results¶
| Physical Quantity | Baseline Prompt Purity | Optimized Prompt Purity | Gain |
|---|---|---|---|
| Hardness Ratio | 0.7998 | 0.8468 | +5.9% |
| Power-Law Index \(\gamma\) | 0.8185 | 0.9418 | +15.1% |
| Variability Index | 0.6346 | 0.9994 | +57.5% |
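The Gain column is the relative improvement in purity, which can be verified directly from the table values:

```python
baseline = {"hardness_ratio": 0.7998, "power_law_index": 0.8185, "variability": 0.6346}
optimized = {"hardness_ratio": 0.8468, "power_law_index": 0.9418, "variability": 0.9994}

# Relative gain: (optimized / baseline - 1) * 100, in percent
for key in baseline:
    gain = (optimized[key] / baseline[key] - 1) * 100
    print(f"{key}: +{gain:.1f}%")
# → hardness_ratio: +5.9%, power_law_index: +15.1%, variability: +57.5%
```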
SAE Feature Analysis¶
| SAE Feature ID | Semantic Label | Associated Physical Quantity |
|---|---|---|
| Feature 1212 | "Scientific inference and likelihood" | Source classification |
| Feature 839 | "X-ray binary burst periodicity" | Variability |
| Feature 2047 | "Non-thermal emission spectrum" | Power-law index |
Key Findings¶
- Physical encoding without explicit numerical values: The high clustering purity of the power-law index \(\gamma\) (0.94) arises from object-type descriptions rather than numerical values—the model encodes physical properties through a causal reasoning chain of "QSO-type → non-thermal spectrum → specific \(\gamma\) range."
- Substantial impact of prompt design: The variability index purity increases from 0.6346 to 0.9994 (+57.5%), demonstrating that the degree of prompt structuring directly controls how much physical information is retained in the embeddings.
- SAE reveals compositional reasoning: Rather than directly memorizing physical values, the model infers them indirectly through semantic pathways such as object-type recognition and spectral feature association.
Highlights & Insights¶
- Semantic pathways for physical encoding: LLMs encode physical information through hierarchical reasoning of the form "object type → physical characteristics → parameter inference," which goes beyond simple statistical correlation.
- Prompts as physical information selectors: Structured prompt design is equivalent to selecting which physical dimensions the LLM "attends to."
- SAE as a tool for scientific interpretability: This work introduces a new instrument for understanding the scientific reasoning processes of LLMs.
Limitations & Future Work¶
- The study is limited to X-ray astrophysics; generalization to other scientific domains (e.g., particle physics, chemistry) has not been validated.
- It is not possible to definitively distinguish between "understanding physics" and "learning statistical correlations present in training data."
- Evaluation relies on a single embedding model (ada-002); other models may exhibit different behavior.
Related Work & Insights¶
- vs. Traditional Scientific ML: Conventional approaches directly model physical equations, whereas this work explores the implicit physical encoding capabilities of LLMs.
- Insights: The SAE + KNN evaluation framework is generalizable to assessing domain-specific knowledge encoding in LLMs across any scientific field.
Rating¶
- Novelty: ⭐⭐⭐⭐ First systematic evaluation of astrophysical information encoding in LLM embeddings; SAE-based analysis is novel.
- Experimental Thoroughness: ⭐⭐⭐ Dataset of 4,000 sources is moderate in scale, but limited to a single scientific domain.
- Writing Quality: ⭐⭐⭐⭐ Cross-disciplinary exposition is clear; figures and tables are intuitive.
- Value: ⭐⭐⭐⭐ Introduces a new methodology for evaluating LLM knowledge encoding in the context of AI for Science.