Towards Effective In-context Cross-domain Knowledge Transfer via Domain-invariant-neurons-based Retrieval
Conference: ACL 2026 | arXiv: 2604.05383 | Code: GitHub | Area: LLM Reasoning | Keywords: Cross-domain knowledge transfer, domain-invariant neurons, in-context learning retrieval, reasoning structure alignment, mathematical-logical reasoning
TL;DR
This paper proposes DIN-Retrieval, which identifies domain-invariant neurons (DINs) in LLMs that exhibit consistent activation polarity across domains, constructs a domain-robust representational subspace for retrieving structurally compatible cross-domain demonstrations, and provides the first systematic evidence that cross-domain ICL examples can improve LLM reasoning performance, achieving an average gain of 1.8% on math–logic reasoning transfer.
Background & Motivation
State of the Field: In-context learning (ICL) enables LLMs to adapt to new tasks without parameter updates. However, existing ICL research assumes access to in-domain expert-annotated demonstrations, limiting applicability in domains with scarce expert knowledge (e.g., specialized mathematical reasoning, formal logic, legal analysis).
Limitations of Prior Work: (1) Zero-shot LLMs tend to exhibit three failure modes during reasoning—missing intermediate links, incomplete branch integration, and neglected blocking conditions; (2) Although reasoning tasks across domains differ substantially in surface semantics, they share many reusable implicit logical structures (e.g., chain reasoning, conditional branching); (3) Manually selecting structurally aligned cross-domain demonstrations is impractical given the large structural variation across tasks.
Root Cause: Cross-domain examples can restore correct reasoning topology (as demonstrated by prior work), yet no mechanism exists for automatically retrieving structurally compatible demonstrations.
Paper Goals: Develop an automated retrieval method capable of identifying ICL demonstrations from other domains that are structurally compatible with a given target query.
Starting Point: Drawing on the concept of domain-invariant features from domain adaptation theory, the authors identify neurons in LLM hidden layers whose activation polarity remains consistent across domains, on the hypothesis that such neurons encode domain-agnostic reasoning-structure information.
Core Idea: Discover domain-invariant neurons (DINs) within LLMs, construct domain-robust retrieval representations from their activations, and retrieve structurally aligned cross-domain demonstrations via cosine similarity.
Method
Overall Architecture
DIN-Retrieval proceeds in four steps: (A) DIN Identification—compute the z-score of each neuron over source- and target-domain samples and select neurons with consistent cross-domain polarity; (B) DIN Vector Construction—aggregate DIN activations across multiple layers into a compact representation; (C) Cross-domain Retrieval—retrieve top-\(k\) source-domain examples using cosine similarity combined with MMR in the DIN space; (D) Cross-domain CoT Inference—concatenate the retrieved source-domain examples with the target query as few-shot demonstrations for ICL inference.
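Steps (A)–(C) are sketched in code after the Key Designs list below. Step (D) reduces to standard few-shot prompt assembly; a minimal sketch follows, with a purely illustrative template (the paper's exact prompt format is not reproduced here):

```python
# Minimal sketch of step (D): concatenating retrieved source-domain CoT
# examples with the target query as few-shot demonstrations. The template
# is an illustrative assumption, not the paper's actual prompt format.
def build_cot_prompt(examples, query):
    """examples: (question, chain_of_thought, answer) triples retrieved
    from the source domain; query: the target-domain question."""
    blocks = [f"Q: {q}\nA: {cot} The answer is {a}." for q, cot, a in examples]
    blocks.append(f"Q: {query}\nA:")
    return "\n\n".join(blocks)

# Toy usage with one math demonstration for a logic query.
demo = [("2 + 3 = ?", "Adding 2 and 3 gives 5.", "5")]
print(build_cot_prompt(demo, "If all A are B and x is A, is x B?"))
```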
Key Designs
- Domain-Invariant Neuron (DIN) Identification (first sketch after this list):
- Function: Discover neurons in LLMs that encode domain-agnostic reasoning structures.
- Mechanism: For each neuron \(k\) in each layer, compute z-scores \(z_k^S\) and \(z_k^T\) over the source and target domains respectively. Select neurons whose polarity is consistent across both domains and exceeds threshold \(\tau\): \(\mathcal{I} = \{k | z_k^S > \tau \wedge z_k^T > \tau\} \cup \{k | z_k^S < -\tau \wedge z_k^T < -\tau\}\). If the budget \(K\) is exceeded, the top-\(K\) neurons are selected by \(|z_k^S| + |z_k^T|\).
- Design Motivation: Neurons with consistent cross-domain polarity are insensitive to domain shift and encode abstract features shared across domains. Pruning experiments confirm their functional importance: removing DINs causes a far greater increase in perplexity than random pruning.
- DIN Vector Representation (second sketch after this list):
- Function: Compress high-dimensional hidden states into a domain-robust compact representation.
- Mechanism: For each layer's hidden state of input \(x\), compute the token-averaged representation \(\bar{h}^{(l)}(x)\), retain only the DIN dimensions, and concatenate across layers: \(\mathbf{v}_{\text{DIN}}(x) = \bigoplus_{l \in \mathcal{L}} \bar{h}^{(l)}(x)_{\mathcal{I}^{(l)}}\).
- Design Motivation: Full hidden states contain substantial domain-specific information that interferes with cross-domain similarity computation; DIN-filtered representations focus on reasoning structure rather than surface semantics.
- MMR Diversity-Aware Retrieval (third sketch after this list):
- Function: Retrieve cross-domain demonstrations that are both structurally aligned and diverse.
- Mechanism: \(\text{Score}(i) = \lambda \cdot \cos(\mathbf{v}_q, \mathbf{v}_i) - (1-\lambda) \cdot \max_{j \in \mathcal{S}} \cos(\mathbf{v}_i, \mathbf{v}_j)\), balancing relevance to the query against redundancy among already-selected examples. The default setting retrieves \(k=2\) source-domain examples.
- Design Motivation: Prevents retrieval of structurally redundant examples—demonstrations covering diverse reasoning patterns provide LLMs with more comprehensive structural cues.
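First, a minimal sketch of DIN identification (step A), assuming the per-neuron z-score is the mean activation over a domain's samples standardized across neurons within the layer; the paper's exact statistic may differ, and all names and toy data are illustrative:

```python
# Minimal sketch of DIN identification. "z-score" is assumed to be the
# per-neuron mean activation over a domain's samples, standardized across
# neurons within the layer; the paper's exact statistic may differ.
import numpy as np

def neuron_zscores(acts):
    """acts: (num_samples, num_neurons) activations for one layer.
    Returns one z-score per neuron, standardized across neurons."""
    mu = acts.mean(axis=0)                      # per-neuron mean over samples
    return (mu - mu.mean()) / (mu.std() + 1e-8)

def select_dins(acts_src, acts_tgt, tau=1.0, budget=None):
    """Keep neurons whose z-scores exceed tau with the SAME sign in both
    domains; if more than `budget` qualify, rank by |z_S| + |z_T|."""
    z_s, z_t = neuron_zscores(acts_src), neuron_zscores(acts_tgt)
    pos = (z_s > tau) & (z_t > tau)             # consistently high in both
    neg = (z_s < -tau) & (z_t < -tau)           # consistently low in both
    idx = np.flatnonzero(pos | neg)
    if budget is not None and idx.size > budget:
        strength = np.abs(z_s[idx]) + np.abs(z_t[idx])
        idx = idx[np.argsort(-strength)[:budget]]
    return idx

# Toy usage: random activations stand in for real hidden-state statistics.
rng = np.random.default_rng(0)
acts_src = rng.normal(size=(64, 4096))          # 64 source-domain samples
acts_tgt = rng.normal(size=(64, 4096))          # 64 target-domain samples
dins = select_dins(acts_src, acts_tgt, tau=1.5, budget=256)
print(f"{dins.size} domain-invariant neurons selected")
```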
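Second, a minimal sketch of DIN vector construction (step B), assuming per-layer hidden states are already extracted; shapes and the layer/index choices are illustrative:

```python
# Minimal sketch of DIN vector construction: token-average each selected
# layer's hidden states, keep only that layer's DIN dimensions, then
# concatenate across layers.
import numpy as np

def din_vector(hidden_states, din_idx_per_layer):
    """hidden_states: list of (seq_len, hidden_dim) arrays, one per layer.
    din_idx_per_layer: dict {layer_index: array of that layer's DINs}."""
    parts = []
    for layer, idx in sorted(din_idx_per_layer.items()):
        h_bar = hidden_states[layer].mean(axis=0)   # token-averaged h̄^(l)(x)
        parts.append(h_bar[idx])                    # keep DIN dimensions only
    return np.concatenate(parts)                    # ⊕ across layers

# Toy usage: three layers of fake hidden states for a 10-token input.
rng = np.random.default_rng(0)
hs = [rng.normal(size=(10, 4096)) for _ in range(3)]
dins = {0: np.arange(5), 2: np.arange(8)}
print(din_vector(hs, dins).shape)                   # (13,) = 5 + 8
```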
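Third, a minimal sketch of the MMR selection rule (step C); \(k=2\) follows the paper's default, while \(\lambda=0.7\) is an illustrative assumption:

```python
# Minimal sketch of MMR retrieval in DIN space, implementing
# Score(i) = λ·cos(v_q, v_i) − (1−λ)·max_{j∈S} cos(v_i, v_j).
# k=2 follows the paper's default; λ=0.7 is an illustrative assumption.
import numpy as np

def cos(a, b):
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8)

def mmr_retrieve(v_query, v_pool, k=2, lam=0.7):
    """Greedily pick k pool indices, trading query relevance (first term)
    against redundancy with already-selected examples (second term)."""
    relevance = [cos(v_query, v) for v in v_pool]
    selected, candidates = [], list(range(len(v_pool)))
    while candidates and len(selected) < k:
        def score(i):
            redundancy = max((cos(v_pool[i], v_pool[j]) for j in selected),
                             default=0.0)
            return lam * relevance[i] - (1 - lam) * redundancy
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# Toy usage: retrieve 2 of 50 fake source-domain DIN vectors.
rng = np.random.default_rng(0)
pool = rng.normal(size=(50, 64))
print(mmr_retrieve(rng.normal(size=64), pool))      # two selected indices
```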
Loss & Training
DIN-Retrieval requires no training. DIN identification relies on activation statistics, and retrieval is based on cosine similarity. Experiments are conducted with LLaMA-3.1-8B, Gemma-3-12B/27B, and Qwen2.5/Qwen3 models ranging from 7B to 32B.
Key Experimental Results
Main Results
Cross-domain reasoning accuracy (%, averaged over four transfer directions)
| Method | Qwen2.5-7B | Qwen3-8B | Gemma-3-27B |
|---|---|---|---|
| Zero-shot | 84.6 | 91.8 | 88.75 |
| X-ICL (embedding retrieval) | 83.4 | 91.2 | — |
| DIN-Retrieval | 86.8 | 93.1 | 90.3 |
Ablation Study
DIN vs. random neuron selection (accuracy, %, GSM8K→FOLIO)
| Model | DIN Acc. | Random Acc. | Diff. |
|---|---|---|---|
| LLaMA-3.1-8B | 62.7 | 60.3 | +2.4 |
| Qwen2.5-7B | 62.8 | 59.5 | +3.3 |
| Qwen3-8B | 85.5 | 84.0 | +1.5 |
Key Findings
- DIN-Retrieval consistently outperforms both zero-shot and embedding-based cross-domain ICL across all models and transfer directions.
- DIN pruning causes a substantially larger perplexity increase than random pruning, confirming the functional importance of DINs.
- This work provides the first systematic evidence that cross-domain ICL demonstrations can improve LLM reasoning—challenging the assumption that ICL must rely on in-domain examples.
- Bidirectional transfer is effective: both GSM8K→FOLIO (math→logic) and FOLIO→GSM8K (logic→math) yield gains.
- Although the improvement margin is modest (average 1.8%), it is statistically significant and consistent.
Highlights & Insights
- The insight that "different domains share reasoning structures" carries broad implications—reasoning capabilities are not domain-specific but transferable across domains.
- The discovery of DINs offers a new perspective on understanding reasoning representations inside LLMs, revealing the existence of specialized neurons that encode domain-agnostic reasoning patterns.
- The method is elegant and lightweight—requiring no training, relying solely on activation statistics and cosine similarity.
Limitations & Future Work
- The average gain of 1.8% is modest; for strong zero-shot baselines, headroom for improvement is already limited.
- Validation is confined to bidirectional math–logic transfer; extension to broader domain pairs (e.g., legal→medical) remains unexplored.
- DIN identification requires unlabeled samples from both source and target domains to compute z-scores, making the approach not fully zero-resource.
- The selection of threshold \(\tau\) and neuron ratio \(k_{\text{ratio}}\) lacks an adaptive mechanism.
Related Work & Insights
- vs. X-ICL (embedding retrieval): X-ICL retrieves using full hidden-state embeddings, which contain domain-specific noise; DIN filtering focuses the representation on structural information.
- vs. in-domain ICL: In-domain demonstrations are generally superior when available, but this work demonstrates that cross-domain examples remain effective when in-domain annotations are unavailable.
- vs. domain adaptation (DANN, etc.): Classical domain adaptation methods require training; DIN-Retrieval is entirely training-free—transferring the domain-invariant feature concept from training-time adaptation to inference-time retrieval.
Rating
- Novelty: ⭐⭐⭐⭐ First systematic study of cross-domain ICL; the discovery of DINs carries theoretical significance.
- Experimental Thoroughness: ⭐⭐⭐⭐ Multiple models × multiple transfer directions, DIN existence validation, and statistical significance testing.
- Writing Quality: ⭐⭐⭐⭐ The motivational chain from failure mode analysis to method design is clear and well-structured.
- Value: ⭐⭐⭐⭐ Provides a new direction for ICL in domains with scarce expert knowledge.