# Induce, Align, Predict: Zero-Shot Stance Detection via Cognitive Inductive Reasoning
Conference: AAAI 2026 arXiv: 2506.13470 Code: None Area: Interpretability Keywords: zero-shot stance detection, cognitive schema, first-order logic, graph kernel, low-resource
## TL;DR
This paper proposes the CIRF framework, which abstracts transferable reasoning patterns from LLM-generated first-order logic via unsupervised schema induction (USI), and performs explainable zero-shot stance reasoning through structural alignment using a schema-enhanced graph kernel model (SEGKM). The method achieves state-of-the-art performance on three benchmarks while requiring only 30% of labeled data.
## Background & Motivation
Background: Zero-shot stance detection (ZSSD) requires inferring the stance of text toward targets unseen during training, which is critical for analyzing rapidly emerging polarized social media topics.
Limitations of Prior Work:
- LLM zero-shot prompting underperforms on complex reasoning and generalizes poorly (GPT-3.5 achieves only 69.8 F1 on SEM16)
- LLM-augmented fine-tuning methods (KAI, FOLAR, etc.) still require substantial labeled data and remain at instance-level pattern matching
- Both paradigms lack explainability and cross-target reasoning generalization
Key Challenge: Stance detection requires abstract reasoning beyond surface-level lexical matching (e.g., "increasing health risks" and "undermining economic stability" both instantiate the reasoning pattern "negative consequence → opposition"), yet existing methods either perform surface-level matching or rely heavily on annotations.
Key Insight: Schema theory in cognitive science — humans induce generalizable reasoning patterns (schemas) from concrete experiences and apply them to new contexts. This cognitive capability is formalized as unsupervised induction of first-order logic patterns with graph kernel alignment.
## Method
### Overall Architecture
CIRF consists of two core modules: (1) USI (Unsupervised Schema Induction): LLM generates FOL reasoning → interpretation abstraction → clustering into schema graphs; (2) SEGKM (Schema-Enhanced Graph Kernel Model): constructs FOL graphs from input → subgraph kernel matching against schema templates → hierarchical graph representation → stance prediction.
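The paper releases no code, so the wiring below is a purely illustrative Python skeleton of this two-module flow; every stage is an injected placeholder callable, not the authors' API.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class CIRFSkeleton:
    """Illustrative wiring of CIRF's two modules; each stage is an injected
    callable so the skeleton stays runnable without the (unreleased) internals."""
    gen_fol: Callable[[str, str], Any]          # LLM: (sentence, target) -> FOL chain
    induce_schemas: Callable[[list], Any]       # USI: FOL chains -> schema graphs
    parse_graph: Callable[[Any], Any]           # FOL chain -> FOL graph G_f
    kernel_features: Callable[[Any, Any], Any]  # SEGKM: (G_f, schemas) -> Phi(G_f)
    classify: Callable[[Any], str]              # head: Phi(G_f) -> Favor/Against/None

    def fit_schemas(self, corpus: list[tuple[str, str]]) -> Any:
        # Offline, unsupervised: induce schema graphs from source-target data.
        return self.induce_schemas([self.gen_fol(s, t) for s, t in corpus])

    def predict(self, sentence: str, target: str, schemas: Any) -> str:
        # Online: build the input FOL graph and align it with induced schemas.
        g_f = self.parse_graph(self.gen_fol(sentence, target))
        return self.classify(self.kernel_features(g_f, schemas))
```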
### Key Designs
1. Unsupervised Schema Induction (USI)
- Function: Induces abstract, cross-target transferable reasoning patterns from raw text without supervision
- Design Motivation: Instance-level FOL rules cannot generalize across domains; higher-level reasoning abstraction is needed
- Mechanism (four-stage pipeline):
- FOL Reasoning Generation: For each sentence-target pair, the LLM is prompted to generate a first-order logic reasoning chain
- FOL Interpretation and Abstraction: The LLM is prompted to analyze the internal logic of FOL, generate logically equivalent but structurally diverse variants, and then summarize them into generalized templates. Example:
∀x, (is_robot(x) → (helps_humans(x) → must_be_safe(x))) is abstracted to ∀x, ((is_target(x) ∧ meets_condition(x)) → entails_consequence(x))
- Schema Clustering and Hierarchical Abstraction: FOL templates are clustered by semantic and reasoning-pattern similarity; large clusters are processed via a hierarchical strategy (sub-cluster splitting → intermediate chaining → merging into schema templates); a minimal clustering sketch follows this list
- Schema Graph Construction: Induced schemas serve as nodes in a multi-relational graph, with edges representing logical relations such as causality, contrast, and entailment
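The paper does not pin down the clustering machinery in stage three, so the following is a minimal sketch assuming sentence-transformer embeddings as the template encoder and agglomerative clustering with a cosine threshold as the grouping step; both choices (and the threshold) are assumptions, not the authors' setup.

```python
from sentence_transformers import SentenceTransformer  # assumed encoder choice
from sklearn.cluster import AgglomerativeClustering

def cluster_fol_templates(templates: list[str],
                          distance_threshold: float = 0.35) -> dict[int, list[str]]:
    """Group abstracted FOL templates by semantic similarity. The encoder,
    algorithm, and threshold are stand-ins; the paper leaves them unspecified."""
    encoder = SentenceTransformer("all-MiniLM-L6-v2")   # hypothetical model name
    emb = encoder.encode(templates, normalize_embeddings=True)
    clusterer = AgglomerativeClustering(
        n_clusters=None, distance_threshold=distance_threshold,
        metric="cosine", linkage="average",
    )
    labels = clusterer.fit_predict(emb)
    clusters: dict[int, list[str]] = {}
    for lab, tpl in zip(labels, templates):
        clusters.setdefault(int(lab), []).append(tpl)
    # Large clusters would then be split, chained, and merged hierarchically
    # into schema templates, per the paper's stage-three strategy.
    return clusters
```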
2. Schema-Enhanced Graph Kernel Model (SEGKM)
- Function: Leverages schema knowledge to enhance the representation of input reasoning structures for explainable zero-shot inference
- Design Motivation: Standard GNNs rely on local message passing and struggle to capture reusable high-order reasoning motifs; graph kernels enable better generalization through explicit structural matching
- Mechanism:
- FOL Graph Construction: For each input pair \((x, q)\), a FOL reasoning chain is generated and parsed into a FOL graph \(G_f=(V_f, E_f)\), where nodes are predicates and edges are logical relations
- Schema Subgraph Filters: \(k\)-hop subgraphs centered at each node are extracted from schema graph \(G^{(j)}\) to form a filter pool \(\mathcal{H} = \bigcup_j H^{(j)}\)
- Relation-Aware Node Embedding: Nodes and edges are initialized with BERT embeddings; edge semantics are fused via relational projection: \(x' = \text{ReLU}(x + \text{Proj}(e))\)
- Deep Graph Kernel Response: The \(p\)-step random walk kernel between an input subgraph and a schema filter is computed as: \(\phi_{1,i}(v) = K_p(G_v^f, H_i^{(j)}) = \mathbf{s}^\top W A^{\times p} \mathbf{s}\), where \(\mathbf{s} = \text{vec}(X_{G_v^f}' \cdot (X_{H_i^{(j)}}')^\top)\)
- Schema Graph-Level Selection: Kernel responses across all nodes are aggregated to select the top-\(g\) schema graphs: \(S^{(j)} = \sum_{v \in V_f} \frac{1}{|H^{(j)}|} \sum_{H_i^{(j)} \in H^{(j)}} \phi_{1,i}(v)\)
- Hierarchical Graph Representation: Multi-layer stacking of kernel feature extraction yields the final graph representation: \(\Phi(G_f) = \text{Concat}(\sum_{v \in V_f} \phi_l(v) \mid l=0,1,\dots,L)\) (a toy rendering of the kernel and selection steps follows this list)
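To make the kernel and selection equations concrete, here is a toy NumPy rendering; the learnable weight \(W\) defaults to identity and Proj is a plain linear map, both simplifications of the paper's learned components.

```python
import numpy as np

def relation_aware(x: np.ndarray, e: np.ndarray, proj: np.ndarray) -> np.ndarray:
    """x' = ReLU(x + Proj(e)): fuse edge semantics into node embeddings
    (Proj rendered as a plain linear map here)."""
    return np.maximum(x + e @ proj, 0.0)

def rw_kernel(Xg, Ag, Xh, Ah, p: int, W=None) -> float:
    """K_p(G, H) = s^T W A_x^p s, with s = vec(Xg @ Xh^T) the node-pair
    similarity vector and A_x the direct-product (Kronecker) adjacency."""
    s = (Xg @ Xh.T).reshape(-1)             # row-major vec, matching np.kron ordering
    A_x = np.kron(Ag, Ah)                   # a walk in A_x = simultaneous walks in G and H
    walk = np.linalg.matrix_power(A_x, p)   # weighted count of common p-step walks
    W = np.eye(s.size) if W is None else W  # learnable in the paper; identity here
    return float(s @ W @ walk @ s)

def select_schemas(node_subgraphs, schema_filter_pools, p: int = 2, top_g: int = 4):
    """S^(j): sum kernel responses over all input-node subgraphs, averaged over
    each schema graph's filter pool H^(j); keep the top-g schema graphs."""
    scores = [
        sum(rw_kernel(Xv, Av, Xh, Ah, p)
            for (Xv, Av) in node_subgraphs       # k-hop subgraphs G_v^f of the input
            for (Xh, Ah) in pool) / max(len(pool), 1)
        for pool in schema_filter_pools          # one pool H^(j) per schema graph
    ]
    return np.argsort(scores)[::-1][:top_g]      # indices of the selected schemas
```

Counting walks on the direct-product graph is what realizes the explicit structural matching that, per the authors, local GNN message passing lacks.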
3. Stance Prediction
The final graph representation is fed into a fully connected ReLU layer for three-class classification (Favor/Against/None), trained end-to-end with cross-entropy loss.
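A minimal PyTorch sketch of this head; the hidden width is an assumption, as the paper only specifies a fully connected ReLU layer and three output classes.

```python
import torch.nn as nn

class StanceHead(nn.Module):
    """FC + ReLU head over the hierarchical representation Phi(G_f);
    three classes: Favor / Against / None. Hidden width is an assumption."""
    def __init__(self, feat_dim: int, hidden: int = 256, n_classes: int = 3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, n_classes),  # logits for cross-entropy training
        )

    def forward(self, phi):
        return self.net(phi)
```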
### Loss & Training
- Loss function: Cross-entropy loss
- Optimizer: AdamW, batch size 32, learning rate \(5 \times 10^{-4}\)
- Early stopping (patience = 10), up to 20 epochs, validated every 0.2 epochs (see the training-loop sketch after this list)
- Schema induction is fully unsupervised; SEGKM is trained on source targets
- Hardware: Single 40GB A100 GPU
- Default LLM: GPT-3.5; GPT-4o and DeepSeek-v3 are also evaluated
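For reference, the reported recipe rendered as a hedged PyTorch training loop; the evaluation helper is injected, and the 0.2-epoch cadence is approximated by step count.

```python
import torch

def train(model, train_loader, val_loader, evaluate_fn,
          epochs: int = 20, patience: int = 10, lr: float = 5e-4):
    """AdamW at lr 5e-4 (batch size 32 is set in the DataLoader), cross-entropy,
    validation roughly every 0.2 epochs, early stopping with patience 10."""
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    loss_fn = torch.nn.CrossEntropyLoss()
    eval_every = max(len(train_loader) // 5, 1)  # ~0.2 epochs between validations
    best, stale = float("-inf"), 0
    for _ in range(epochs):
        for step, (x, y) in enumerate(train_loader):
            opt.zero_grad()
            loss_fn(model(x), y).backward()
            opt.step()
            if (step + 1) % eval_every == 0:
                score = evaluate_fn(model, val_loader)  # e.g. dev macro-F1
                if score > best:
                    best, stale = score, 0
                else:
                    stale += 1
                    if stale >= patience:
                        return best              # early stop
    return best
```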
## Key Experimental Results
### Main Results
Zero-shot stance detection results (Macro-F1) on SEM16, VAST, and COVID-19:
| Method | Type | SEM16 HC | SEM16 FM | SEM16 LA | VAST All | COVID AF | COVID WA |
|---|---|---|---|---|---|---|---|
| JointCL | BERT fine-tune | 54.4 | 54.0 | 50.0 | 71.2 | 57.6 | 63.1 |
| GPT-3.5 | LLM prompting | 78.9 | 68.3 | 62.3 | 65.1 | 69.2 | 57.8 |
| COLA | LLM prompting | 81.7 | 63.4 | 71.0 | 73.0 | 65.7 | 73.9 |
| KAI | LLM-augmented | 76.4 | 73.7 | 69.4 | 76.3 | - | - |
| LCDA | LLM-augmented | 79.8 | 70.0 | 69.4 | 80.3 | - | - |
| FOLAR | LLM-augmented | 81.9 | 71.2 | 69.9 | 77.2 | 69.5 | 73.1 |
| LogiMDF | LLM-augmented | 75.1 | 67.9 | 68.0 | 76.7 | 70.4 | 75.4 |
| CIRF | Schema | 80.1 | 74.7 | 73.9 | 80.9 | 74.1 | 81.0 |
| CIRF (GPT-4o) | Schema | 83.2 | 80.4 | 78.2 | 82.8 | 84.9 | 89.4 |
Average Macro-F1: 76.2 on SEM16 (+1.9 over FOLAR), 80.9 on VAST (+0.6 over LCDA), +3.7 over LogiMDF on COVID-19.
### Ablation Study
| Variant | Effect |
|---|---|
| w/o Schema | Largest performance drop; removing cognitive schemas severely degrades cross-target generalization |
| w/o SEGKM | Large performance drop; graph kernel alignment is critical for leveraging schema knowledge |
| w/o SE (edge semantics) | Moderate drop; relational information benefits structural matching |
| w/o USI (replace with simple clustering) | Moderate drop; LLM-driven semantic induction outperforms simple clustering |
Performance gaps across all ablated components are more pronounced under the VAST (10%) low-resource setting, indicating that each component becomes increasingly critical when labeled data is scarce.
### Key Findings
- State-of-the-art across all three benchmarks, with statistical significance (\(p < 0.05\))
- 30% data matches full-data baselines: with 10% of COVID-19 data, CIRF surpasses LogiMDF by 2.8 points; with 20% of SEM16 data, it surpasses FOLAR by 0.6 points
- LLM scalability: Upgrading from GPT-3.5 to GPT-4o improves VAST from 80.9 to 82.8 and COVID WA from 81.0 to 89.4
- FOL-based knowledge outperforms natural language knowledge: CIRF and FOLAR (both using FOL) generally outperform KAI (which uses natural language)
- Schema count has minimal impact on performance (variation < 1 point), suggesting that reasoning can be sufficiently abstracted by a small number of schemas
- Performance remains stable as the top-\(g\) selection size varies from 2 to 16, indicating low sensitivity to this hyperparameter
## Highlights & Insights
- Successful transfer from cognitive science to NLP: Schema theory is formalized as FOL induction + graph kernel alignment, yielding both theoretical depth and practical effectiveness
- The four-stage USI pipeline is elegantly designed: generation → interpretation abstraction → clustering → graph construction, progressively moving from instances to abstractions
- Choosing graph kernels over GNNs is a well-motivated design decision: GNN local message passing struggles to capture reusable high-order reasoning motifs
- 30% data matching full-data performance demonstrates that the induced schemas capture genuinely transferable reasoning structures rather than overfitting to the training distribution
## Limitations & Future Work
- Schema induction depends on LLM quality; scalability to noisy or very large corpora remains unverified
- FOL representations may not capture implicit stance expressions such as rhetoric, irony, and metaphor
- Applicability to multilingual and cross-cultural settings is unexplored
- The computational cost of schema induction (requiring multiple LLM calls) is not quantitatively analyzed
## Related Work & Insights
- vs. FOLAR: Both use FOL knowledge, but FOLAR operates on instance-level FOL rules, while CIRF induces cross-target transferable schemas; CIRF surpasses FOLAR by 1.9 points on SEM16
- vs. LogiMDF: LogiMDF also employs logical reasoning but operates at the predicate/word level without modeling relational structure; CIRF surpasses it by 3.7 points on COVID-19
- vs. KAI: KAI augments with natural language knowledge; CIRF demonstrates that structured knowledge via FOL + schemas is more effective
- vs. pure LLM prompting: GPT-3.5 direct prompting achieves only 65.1 on VAST, while CIRF reaches 80.9, showing that schema-guided reasoning far exceeds surface-level prompting
## Rating
- Novelty: ⭐⭐⭐⭐ Introducing cognitive schema theory into ZSSD is a pioneering cross-disciplinary integration; the USI + SEGKM framework design is original
- Experimental Thoroughness: ⭐⭐⭐⭐⭐ Comprehensive comparison across three benchmarks, complete ablation, low-resource analysis, LLM scalability, hyperparameter sensitivity analysis, and case studies
- Writing Quality: ⭐⭐⭐⭐ The derivation from cognitive motivation to formal methodology is clear and notation is consistent
- Value: ⭐⭐⭐⭐ A practical method for low-resource ZSSD; the transferability of schemas offers a new paradigm for zero-shot NLP