
Induce, Align, Predict: Zero-Shot Stance Detection via Cognitive Inductive Reasoning

Conference: AAAI 2026 arXiv: 2506.13470 Code: None Area: Interpretability Keywords: zero-shot stance detection, cognitive schema, first-order logic, graph kernel, low-resource

TL;DR

This paper proposes the CIRF framework, which abstracts transferable reasoning patterns from LLM-generated first-order logic via unsupervised schema induction (USI), and performs explainable zero-shot stance reasoning through structural alignment using a schema-enhanced graph kernel model (SEGKM). The method achieves state-of-the-art performance on three benchmarks while requiring only 30% of labeled data.

Background & Motivation

Background: Zero-shot stance detection (ZSSD) requires inferring the stance of text toward targets unseen during training, which is critical for analyzing rapidly emerging polarized social media topics.

Limitations of Prior Work:

  • LLM zero-shot prompting underperforms on complex reasoning with limited generalization (GPT-3.5 achieves only 69.8 F1 on SEM16)
  • LLM-augmented fine-tuning methods (KAI, FOLAR, etc.) still require substantial labeled data and remain at instance-level pattern matching
  • Both paradigms lack explainability and cross-target reasoning generalization

Key Challenge: Stance detection requires abstract reasoning beyond surface-level lexical matching (e.g., "increasing health risks" and "undermining economic stability" both instantiate the reasoning pattern "negative consequence → opposition"), yet existing methods either perform surface-level matching or rely heavily on annotations.

Key Insight: Schema theory in cognitive science — humans induce generalizable reasoning patterns (schemas) from concrete experiences and apply them to new contexts. This cognitive capability is formalized as unsupervised induction of first-order logic patterns with graph kernel alignment.

Method

Overall Architecture

CIRF consists of two core modules: (1) USI (Unsupervised Schema Induction): LLM generates FOL reasoning → interpretation abstraction → clustering into schema graphs; (2) SEGKM (Schema-Enhanced Graph Kernel Model): constructs FOL graphs from input → subgraph kernel matching against schema templates → hierarchical graph representation → stance prediction.

Key Designs

1. Unsupervised Schema Induction (USI)

  • Function: Induces abstract, cross-target transferable reasoning patterns from raw text without supervision
  • Design Motivation: Instance-level FOL rules cannot generalize across domains; higher-level reasoning abstraction is needed
  • Mechanism (four-stage pipeline):
    • FOL Reasoning Generation: For each sentence-target pair, the LLM is prompted to generate a first-order logic reasoning chain
    • FOL Interpretation and Abstraction: The LLM is prompted to analyze the internal logic of FOL, generate logically equivalent but structurally diverse variants, and then summarize them into generalized templates. Example: ∀x, (is_robot(x) → (helps_humans(x) → must_be_safe(x))) is abstracted to ∀x, ((is_target(x) ∧ meets_condition(x)) → entails_consequence(x))
    • Schema Clustering and Hierarchical Abstraction: FOL templates are clustered by semantic and reasoning pattern similarity; large clusters are processed via a hierarchical strategy (sub-cluster splitting → intermediate chaining → merging into schema templates)
    • Schema Graph Construction: Induced schemas serve as nodes in a multi-relational graph, with edges representing logical relations such as causality, contrast, and entailment
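
The clustering stage above can be sketched as a greedy grouping of FOL-template embeddings by cosine similarity. This is a simplified stand-in for the paper's semantic-and-reasoning-pattern clustering: the threshold, the greedy strategy, and the embedding source are all assumptions for illustration.

```python
import numpy as np

def cluster_templates(embeddings, threshold=0.8):
    """Greedily group FOL-template embeddings by cosine similarity.

    Each template joins the most similar existing cluster whose centroid
    similarity exceeds `threshold`, otherwise it starts a new cluster.
    A stand-in for the paper's semantic clustering step, not its method.
    """
    clusters = []   # lists of template indices
    centroids = []  # running mean embedding per cluster
    for i, v in enumerate(embeddings):
        v = v / np.linalg.norm(v)
        best, best_sim = None, threshold
        for c, mu in enumerate(centroids):
            sim = float(v @ (mu / np.linalg.norm(mu)))
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append([i])
            centroids.append(v.copy())
        else:
            clusters[best].append(i)
            n = len(clusters[best])
            centroids[best] = (centroids[best] * (n - 1) + v) / n
    return clusters
```

In the full pipeline, oversized clusters would then be split and re-merged hierarchically before each cluster is summarized into a schema template.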

2. Schema-Enhanced Graph Kernel Model (SEGKM)

  • Function: Leverages schema knowledge to enhance the representation of input reasoning structures for explainable zero-shot inference
  • Design Motivation: Standard GNNs rely on local message passing and struggle to capture reusable high-order reasoning motifs; graph kernels enable better generalization through explicit structural matching
  • Mechanism:
    • FOL Graph Construction: For each input pair \((x, q)\), a FOL reasoning chain is generated and parsed into a FOL graph \(G_f=(V_f, E_f)\), where nodes are predicates and edges are logical relations
    • Schema Subgraph Filters: \(k\)-hop subgraphs centered at each node are extracted from schema graph \(G^{(j)}\) to form a filter pool \(\mathcal{H} = \bigcup_j H^{(j)}\)
    • Relation-Aware Node Embedding: Nodes and edges are initialized with BERT embeddings; edge semantics are fused via relational projection: \(x' = \text{ReLU}(x + \text{Proj}(e))\)
    • Deep Graph Kernel Response: The \(p\)-step random walk kernel between an input subgraph and a schema filter is computed as: \(\phi_{1,i}(v) = K_p(G_v^f, H_i^{(j)}) = \mathbf{s}^\top W A^{\times p} \mathbf{s}\), where \(\mathbf{s} = \text{vec}(X_{G_v^f}' \cdot (X_{H_i^{(j)}}')^\top)\)
    • Schema Graph-Level Selection: Kernel responses across all nodes are aggregated to select the top-\(g\) schema graphs: \(S^{(j)} = \sum_{v \in V_f} \frac{1}{|H^{(j)}|} \sum_{H_i^{(j)} \in H^{(j)}} \phi_{1,i}(v)\)
    • Hierarchical Graph Representation: Multi-layer stacking of kernel feature extraction yields the final graph representation: \(\Phi(G_f) = \text{Concat}(\sum_{v \in G_f} \phi_l(v) \mid l=0,1,...,L)\)
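
The kernel response above can be sketched in NumPy: with the direct-product-graph adjacency \(A_\times = A_G \otimes A_H\), the \(p\)-step response is \(\mathbf{s}^\top A_\times^p \mathbf{s}\). The learnable weight \(W\) is replaced by the identity here for clarity, so this is a minimal sketch of the kernel computation, not the trained model.

```python
import numpy as np

def random_walk_kernel(Xg, Ag, Xh, Ah, p=2):
    """p-step random-walk kernel response between an input subgraph
    (node embeddings Xg, adjacency Ag) and a schema filter (Xh, Ah).

    s = vec(Xg @ Xh.T) stacks cross-graph node similarities; the
    Kronecker product gives the product-graph adjacency, and walking
    p steps on it counts similarity-weighted common walks.
    """
    s = (Xg @ Xh.T).reshape(-1)          # vec of cross-graph similarities
    Ax = np.kron(Ag, Ah)                 # product-graph adjacency
    return float(s @ np.linalg.matrix_power(Ax, p) @ s)
```

For two identical two-node graphs with identity node features, the p=2 response counts the walks returning to matched node pairs; in SEGKM this scalar response per (node, filter) pair is what gets aggregated into the schema-selection score \(S^{(j)}\).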

3. Stance Prediction

The final graph representation is fed into a fully connected ReLU layer for three-class classification (Favor/Against/None), trained end-to-end with cross-entropy loss.
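A minimal sketch of this prediction head, assuming placeholder weight shapes (the hidden width and parameters here are illustrative, not the paper's):

```python
import numpy as np

def stance_head(phi, W1, b1, W2, b2):
    """Fully connected ReLU layer over the graph representation Phi(G_f),
    then a softmax over the three stance classes (Favor / Against / None).
    Weights are placeholders; training would minimize cross-entropy."""
    h = np.maximum(0.0, phi @ W1 + b1)   # fully connected + ReLU
    logits = h @ W2 + b2                 # 3-way classification logits
    z = logits - logits.max()            # numerically stable softmax
    probs = np.exp(z) / np.exp(z).sum()
    return probs
```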

Loss & Training

  • Loss function: Cross-entropy loss
  • Optimizer: AdamW, batch size 32, learning rate \(5 \times 10^{-4}\)
  • Early stopping (patience = 10), up to 20 epochs, validated every 0.2 epochs
  • Schema induction is fully unsupervised; SEGKM is trained on source targets
  • Hardware: Single 40GB A100 GPU
  • Default LLM: GPT-3.5; GPT-4o and DeepSeek-v3 are also evaluated

Key Experimental Results

Main Results

Zero-shot stance detection results (Macro-F1) on SEM16, VAST, and COVID-19:

| Method | Type | SEM16 HC | SEM16 FM | SEM16 LA | VAST All | COVID AF | COVID WA |
|---|---|---|---|---|---|---|---|
| JointCL | BERT fine-tune | 54.4 | 54.0 | 50.0 | 71.2 | 57.6 | 63.1 |
| GPT-3.5 | LLM prompting | 78.9 | 68.3 | 62.3 | 65.1 | 69.2 | 57.8 |
| COLA | LLM prompting | 81.7 | 63.4 | 71.0 | 73.0 | 65.7 | 73.9 |
| KAI | LLM-augmented | 76.4 | 73.7 | 69.4 | 76.3 | - | - |
| LCDA | LLM-augmented | 79.8 | 70.0 | 69.4 | 80.3 | - | - |
| FOLAR | LLM-augmented | 81.9 | 71.2 | 69.9 | 77.2 | 69.5 | 73.1 |
| LogiMDF | LLM-augmented | 75.1 | 67.9 | 68.0 | 76.7 | 70.4 | 75.4 |
| CIRF | Schema | 80.1 | 74.7 | 73.9 | 80.9 | 74.1 | 81.0 |
| CIRF (GPT-4o) | Schema | 83.2 | 80.4 | 78.2 | 82.8 | 84.9 | 89.4 |

Average Macro-F1: 76.2 on SEM16 (+1.9 over FOLAR), 80.9 on VAST (+0.6 over LCDA), +3.7 over LogiMDF on COVID-19.

Ablation Study

| Variant | Effect |
|---|---|
| w/o Schema | Largest performance drop; removing cognitive schemas severely degrades cross-target generalization |
| w/o SEGKM | Large performance drop; graph kernel alignment is critical for leveraging schema knowledge |
| w/o SE (edge semantics) | Moderate drop; relational information benefits structural matching |
| w/o USI (replace with simple clustering) | Moderate drop; LLM-driven semantic induction outperforms simple clustering |

Performance gaps across all ablated components are more pronounced under the VAST (10%) low-resource setting, indicating that each component becomes increasingly critical when labeled data is scarce.

Key Findings

  • State-of-the-art across all three benchmarks, with statistical significance (\(p < 0.05\))
  • 30% data matches full-data baselines: with 10% of COVID-19 data, CIRF surpasses LogiMDF by 2.8 points; with 20% of SEM16 data, it surpasses FOLAR by 0.6 points
  • LLM scalability: Upgrading from GPT-3.5 to GPT-4o improves VAST from 80.9 to 82.8 and COVID WA from 81.0 to 89.4
  • FOL-based knowledge outperforms natural language knowledge: CIRF and FOLAR (both using FOL) generally outperform KAI (which uses natural language)
  • Schema count has minimal impact on performance (variation < 1 point), suggesting that reasoning can be sufficiently abstracted by a small number of schemas
  • Performance remains stable as the top-\(g\) selection size varies from 2 to 16, indicating low sensitivity to this hyperparameter

Highlights & Insights

  • Successful transfer from cognitive science to NLP: Schema theory is formalized as FOL induction + graph kernel alignment, yielding both theoretical depth and practical effectiveness
  • The four-stage USI pipeline is elegantly designed: generation → interpretation abstraction → clustering → graph construction, progressively moving from instances to abstractions
  • Choosing graph kernels over GNNs is a well-motivated design: local message passing in GNNs struggles to capture reusable high-order reasoning motifs, which explicit structural matching handles directly
  • 30% data matching full-data performance demonstrates that the induced schemas capture genuinely transferable reasoning structures rather than overfitting to the training distribution

Limitations & Future Work

  • Schema induction depends on LLM quality; scalability to noisy or very large corpora remains unverified
  • FOL representations may not capture implicit stance expressions such as rhetoric, irony, and metaphor
  • Applicability to multilingual and cross-cultural settings is unexplored
  • The computational cost of schema induction (requiring multiple LLM calls) is not quantitatively analyzed
Comparison with Related Methods

  • vs. FOLAR: Both use FOL knowledge, but FOLAR operates on instance-level FOL rules, while CIRF induces cross-target transferable schemas; CIRF surpasses FOLAR by 1.9 points on SEM16
  • vs. LogiMDF: LogiMDF also employs logical reasoning but operates at the predicate/word level without modeling relational structure; CIRF surpasses it by 3.7 points on COVID-19
  • vs. KAI: KAI augments with natural language knowledge; CIRF demonstrates that structured knowledge via FOL + schemas is more effective
  • vs. pure LLM prompting: GPT-3.5 direct prompting achieves only 65.1 on VAST, while CIRF reaches 80.9, showing that schema-guided reasoning far exceeds surface-level prompting

Rating

  • Novelty: ⭐⭐⭐⭐ Introducing cognitive schema theory into ZSSD is a pioneering cross-disciplinary integration; the USI + SEGKM framework design is original
  • Experimental Thoroughness: ⭐⭐⭐⭐⭐ Comprehensive comparison across three benchmarks, complete ablation, low-resource analysis, LLM scalability, hyperparameter sensitivity analysis, and case studies
  • Writing Quality: ⭐⭐⭐⭐ The derivation from cognitive motivation to formal methodology is clear and notation is consistent
  • Value: ⭐⭐⭐⭐ A practical method for low-resource ZSSD; the transferability of schemas offers a new paradigm for zero-shot NLP