Discovering a Shared Logical Subspace: Steering LLM Logical Reasoning via Alignment of Natural-Language and Symbolic Views

Conference: ACL 2026
arXiv: 2604.19716
Code: https://github.com/lei-nlp-lab/logical_subspace_acl_2026
Area: Human Understanding / LLM Reasoning
Keywords: logical reasoning, multi-view subspace, activation steering, CCA alignment, training-free inference

TL;DR

This work identifies a shared logical subspace within LLMs that simultaneously aligns natural-language and symbolic-logic reasoning representations. Steering activations along this subspace at inference time improves logical reasoning accuracy by up to 11 percentage points without any model training.

Background & Motivation

State of the Field: LLMs continue to struggle with complex multi-step logical reasoning. Existing approaches fall into two camps: (1) natural-language-based methods that optimize chain-of-thought reasoning via prompting or fine-tuning, and (2) neuro-symbolic methods that attach external symbolic solvers or verifiers.

Limitations of Prior Work: The first category optimizes reasoning chains solely in natural-language form, neglecting the structured information offered by symbolic logic. The second category relies on external symbolic components, increasing system complexity and maintenance overhead. Neither camp investigates whether a unified internal representation of logical reasoning exists within LLMs.

Root Cause: The same logical reasoning problem can be expressed through two complementary representations—natural-language proofs and symbolic proofs—yet existing methods either focus on one representation or require external tools to bridge the two.

Paper Goals: To determine whether a shared logical subspace exists inside LLMs that aligns both natural-language and symbolic views of reasoning, and to leverage this subspace to enhance reasoning capability.

Starting Point: Residual-stream activations from paired natural-language and symbolic proofs are used to learn a low-dimensional shared subspace via Canonical Correlation Analysis (CCA).

Core Idea: A low-dimensional logical subspace exists in the residual stream of LLMs, capturing logical reasoning capabilities shared across natural-language and symbolic representations. Amplifying the projection of activations onto this subspace at inference time enhances reasoning without modifying model weights.

Method

Overall Architecture

The approach consists of two stages. (1) Learning the multi-view logical subspace: residual activations are collected from paired NL/symbolic reasoning chains, and PCA followed by CCA yields a low-dimensional subspace that maximizes cross-view correlation. (2) Inference-time steering: during the forward pass, each token's activation projection onto the learned subspace is amplified, biasing generation toward logical reasoning.

Key Designs

PCA+CCA Subspace Learning:
- Function: Learns a shared logical subspace from paired NL and symbolic reasoning activations.
- Mechanism: PCA is first applied for denoising and compression (retaining 98% of variance), followed by CCA to identify the \(k=32\) directions of maximum correlation between the NL and symbolic activation spaces. An orthonormal basis \(U^{(\ell)} \in \mathbb{R}^{D \times k}\) is obtained via QR decomposition.
- Design Motivation: CCA maximizes cross-view correlation, ensuring the subspace captures logical structure shared across surface forms rather than information specific to either language modality.
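
The PCA+CCA step above can be sketched in NumPy as follows. This is a minimal illustration, not the authors' implementation: the function names `pca_fit` and `learn_shared_subspace` are hypothetical, each row of the input matrices is assumed to be one paired activation vector (how activations are pooled per proof is not specified here), and building \(U\) from the NL-view canonical directions is an assumption.

```python
import numpy as np

def pca_fit(X, var_keep=0.98):
    """Return (mean, components, stds) keeping `var_keep` of the variance."""
    mean = X.mean(axis=0)
    # SVD of the centered data: rows of Vt are the principal directions.
    _, s, Vt = np.linalg.svd(X - mean, full_matrices=False)
    cum = np.cumsum(s ** 2) / np.sum(s ** 2)
    r = min(int(np.searchsorted(cum, var_keep)) + 1, len(s))
    # Standard deviation of each retained principal-component score.
    return mean, Vt[:r].T, s[:r] / np.sqrt(len(X) - 1)

def learn_shared_subspace(H_nl, H_sym, k=32, var_keep=0.98):
    """Classical CCA on PCA-whitened views.

    H_nl, H_sym: (N, D) paired activations (row i of each view is the same proof).
    Returns U (D, k') with orthonormal columns and the canonical correlations.
    """
    assert len(H_nl) == len(H_sym), "views must be paired row by row"
    mu_n, V_n, s_n = pca_fit(H_nl, var_keep)
    mu_s, V_s, s_s = pca_fit(H_sym, var_keep)
    # Whitened PCA scores: identity covariance within each view.
    Z_n = (H_nl - mu_n) @ V_n / s_n
    Z_s = (H_sym - mu_s) @ V_s / s_s
    # Singular values of the cross-covariance are the canonical correlations.
    C = Z_n.T @ Z_s / (len(H_nl) - 1)
    A, rho, _ = np.linalg.svd(C, full_matrices=False)
    k = min(k, len(rho))
    # Map the top-k NL-view canonical directions back to activation space
    # (ASSUMPTION: the basis is built from the NL view).
    W_n = (V_n / s_n) @ A[:, :k]
    # QR gives an orthonormal basis spanning the same directions.
    U, _ = np.linalg.qr(W_n)
    return U, rho[:k]
```

Note that the returned basis can have fewer than `k` columns when PCA retains fewer than `k` components in either view.
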
Inference-Time Activation Steering:
- Function: Enhances CoT reasoning without modifying model weights.
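- Mechanism: At a selected layer \(\ell^*\), the residual vector is replaced by \(\tilde{h}^{(\ell^*)}_t = h^{(\ell^*)}_t + \lambda \frac{P^{(\ell^*)} h^{(\ell^*)}_t}{\|P^{(\ell^*)} h^{(\ell^*)}_t\|_2} \|h^{(\ell^*)}_t\|_2\), where \(P^{(\ell^*)}\) denotes the projector onto the learned subspace (\(U^{(\ell^*)} U^{(\ell^*)\top}\) for an orthonormal basis \(U^{(\ell^*)}\)). The added perturbation points along the unit-normalized projection of the activation and is scaled by \(\lambda\) and the norm of the original activation.
- Design Motivation: After a one-time subspace estimation, steering needs only a single matrix–vector multiplication per token, so inference overhead is negligible (throughput drops from 179 to 176 tok/s).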
Compatibility with Inference Schemes:
- Function: Can be stacked on top of few-shot CoT and self-consistency.
- Mechanism: The same subspace, steering layer, and \(\lambda\) are reused directly without re-tuning.
- Design Motivation: LSS operates at the activation level and is orthogonal to prompt-level and sampling-level methods, enabling straightforward composition.
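
The steering update described under Inference-Time Activation Steering can be written in a few lines. The sketch below is only an illustration: the function name `steer` is hypothetical, and it assumes the projector is \(P = U U^\top\) for an orthonormal basis \(U\).

```python
import numpy as np

def steer(h, U, lam, eps=1e-12):
    """Apply h + lam * (P h / ||P h||) * ||h|| to each row of h.

    h:   (T, D) residual vectors, one row per token.
    U:   (D, k) orthonormal basis of the subspace (ASSUMPTION: P = U U^T).
    lam: steering strength (lambda in the text).
    """
    proj = (h @ U) @ U.T                                   # P h for every token
    proj_norm = np.linalg.norm(proj, axis=-1, keepdims=True)
    h_norm = np.linalg.norm(h, axis=-1, keepdims=True)
    direction = proj / np.maximum(proj_norm, eps)          # unit vector along P h
    return h + lam * direction * h_norm
```

In a real model this function would be applied to the residual stream at layer \(\ell^*\) during the forward pass, for example through a forward hook in the framework of choice. Because the update only adds a vector inside the subspace, the component of each activation orthogonal to the subspace is left unchanged.
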

Loss & Training

This is a training-free method. Subspace learning requires only a single PCA+CCA estimation on gold-standard proofs. The steering strength \(\lambda\) and steering layer \(\ell^*\) are selected on a validation set.
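
Selecting \(\lambda\) and \(\ell^*\) on a validation set amounts to a small grid search. The sketch below assumes a hypothetical callable `evaluate(layer, lam)` that runs steered inference on the validation split and returns accuracy; neither the callable nor the example grid values come from the paper.

```python
from itertools import product

def select_steering_config(evaluate, layers, lambdas):
    """Return (best_layer, best_lambda, best_accuracy) over a validation grid.

    `evaluate(layer, lam)` is a HYPOTHETICAL callable returning validation
    accuracy for steering at `layer` with strength `lam`.
    """
    best = None
    for layer, lam in product(layers, lambdas):
        acc = evaluate(layer, lam)
        # Keep the first configuration that reaches the highest accuracy.
        if best is None or acc > best[2]:
            best = (layer, lam, acc)
    return best
```

The selected pair would then be fixed and reused on the test set.
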

Key Experimental Results

Main Results

Model	Benchmark	Greedy-CoT	LSS-CoT	Gain (pp)
Llama-3.1-8B	FOLIO	51.7%	61.1%	+9.4
Llama-3.1-8B	PrOntoQA (5-hop)	70.6%	75.4%	+4.8
Phi-3-Mini	PrOntoQA (5-hop)	59.6%	70.6%	+11.0
Gemma-2-9B	PrOntoQA (5-hop)	87.4%	90.2%	+2.8
Gemma-2-9B	PW-CWA (3-hop)	71.4%	73.8%	+2.4

Stacking with Inference Schemes (Llama-3.1-8B, PrOntoQA)

Method	Accuracy
Greedy-CoT	70.6%
3-shot-CoT + LSS	74.6% (+2.2 pp over 3-shot-CoT)
SC-3 + LSS	81.0% (+2.0 pp over SC-3)
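
For the SC-3 rows, self-consistency aggregates several sampled reasoning chains by majority vote; stacking with LSS would mean each chain is sampled with steering active. The aggregation step can be sketched as below, where the function name and the tie-breaking rule are assumptions made for illustration.

```python
from collections import Counter

def self_consistency_vote(answers):
    """Return the most frequent answer; ties go to the earliest-seen answer."""
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(answers)
    top = max(counts.values())
    # Scan in sampling order so that ties are broken deterministically.
    for answer in answers:
        if counts[answer] == top:
            return answer
```

With three samples (SC-3), an answer wins as soon as two chains agree on it.
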

Ablation Study

Configuration	Key Metric	Note
Steering with random orthogonal directions	No gain / performance drop	Confirms gains stem from the learned logical subspace, not arbitrary activation amplification
\(\lambda\) sensitivity	Optimal \(\lambda\) varies by model	Logical subspace directions yield robust gains; random directions show no consistent improvement
Qwen3-4B (reasoning-specialized model)	87.2 → 93.2 (+6.0)	Even strong base models benefit from LSS
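
The random-direction control in the first ablation row needs an orthonormal basis of the same shape as the learned one. One common way to draw such a basis, shown here as a sketch rather than the paper's exact procedure, is to orthonormalize a Gaussian matrix with QR; the function name is hypothetical.

```python
import numpy as np

def random_orthonormal_basis(D, k, seed=0):
    """Draw a (D, k) matrix with orthonormal columns from a Gaussian sample."""
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.normal(size=(D, k)))  # reduced QR: Q is (D, k)
    return Q
```

Steering with this basis in place of the learned \(U\), at the same layer and \(\lambda\), isolates the effect of the learned directions from the effect of simply amplifying activations.
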

Key Findings

- The logical subspace encodes both semantic and logical structural information.
- Alignment between the NL and symbolic views is stronger in the higher layers of LLMs.
- Projection energy \(E^{(\ell)}(r)\) is positively correlated with reasoning correctness.
- Steering causes models to use more logical connectives (e.g., "since", "so") and fewer vague reasoning verbs (e.g., "think", "know", "assume").
- LSS functions as a stabilizer for weaker models: SC-3 even degrades performance on Llama-3.2-3B, whereas LSS yields consistent improvements.
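
The exact definition of the projection energy \(E^{(\ell)}(r)\) is not given in this summary. One plausible reading, used here purely as an assumption, is the average fraction of each token activation's squared norm that lies inside the learned subspace for a reasoning chain \(r\). Under that assumption, and with a hypothetical function name:

```python
import numpy as np

def projection_energy(H, U, eps=1e-12):
    """Mean fraction of squared norm inside span(U).

    H: (T, D) layer activations for the tokens of one reasoning chain.
    U: (D, k) orthonormal basis of the subspace.
    ASSUMPTION: this ratio is only one possible reading of E^(l)(r).
    """
    coeffs = H @ U                               # subspace coordinates per token
    inside = np.sum(coeffs ** 2, axis=1)         # ||P h||^2 when U is orthonormal
    total = np.sum(H ** 2, axis=1)               # ||h||^2
    return float(np.mean(inside / np.maximum(total, eps)))
```

The value lies between 0 (activations orthogonal to the subspace) and 1 (activations entirely inside it), so it can be compared across correct and incorrect chains.
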

Highlights & Insights

- This is the first work to identify and exploit a logical subspace shared across natural-language and symbolic representations within LLMs, offering important insights into the internal mechanisms of LLM reasoning.
- The method is extremely lightweight: no training, no external tools, negligible inference overhead, requiring only a single matrix–vector multiplication per token.
- A third paradigm for enhancing LLM reasoning is proposed: rather than extending context length or sampling budget, internal representations are directly aligned at the activation level.
- Orthogonality with few-shot CoT and self-consistency enables straightforward composition, demonstrating broad methodological compatibility.

Limitations & Future Work

- Paired NL and symbolic proofs are required to learn the subspace, which limits applicability to tasks without a symbolic formalization (for FOLIO, NL–FOL alignment is used as a substitute).
- The optimal steering layer and strength vary across model–task pairs, so validation-set tuning is needed for each pair.
- The subspace dimensionality is fixed at \(k=32\); adaptive dimensionality selection has not been explored.
- Future work may investigate cross-task transfer, integration with reasoning-oriented training, and applicability to broader reasoning types.

Comparison with related approaches:
- vs. RepE / Activation Engineering: These are general-purpose activation steering methods, whereas this work specializes in logical reasoning and exploits NL–symbolic alignment to learn more precise steering directions.
- vs. Neural-Symbolic Methods: Traditional approaches attach external symbolic solvers, whereas this work fuses the two views directly at the level of internal representations.
- vs. Self-Consistency: SC improves reasoning through majority voting over multiple samples; this work achieves comparable effects via single-pass steering at substantially lower computational cost.

Rating

Novelty: ⭐⭐⭐⭐⭐ First to discover and exploit a multi-view logical subspace within LLMs; conceptually original.
Experimental Thoroughness: ⭐⭐⭐⭐ Four benchmarks, five models, extensive ablations and analyses.
Writing Quality: ⭐⭐⭐⭐⭐ Clear motivation, rigorous mathematical derivations, and in-depth analysis.
Value: ⭐⭐⭐⭐ Introduces a new paradigm for enhancing LLM reasoning with both theoretical and practical significance.