🔗 Causal Inference

💬 ACL2025 · 10 paper notes

📌 Same area in other venues: 📷 CVPR2026 (4) · 🔬 ICLR2026 (64) · 💬 ACL2026 (7) · 🧪 ICML2026 (19) · 🤖 AAAI2026 (7) · 🧠 NeurIPS2025 (20)

🔥 Top topics: Reasoning ×3 · LLM ×2

Causal Graph based Event Reasoning using Semantic Relation Experts: This paper proposes a framework that generates causal event graphs through multi-round collaborative discussion among four types of semantic relation experts (Temporal, Discourse, Conditional, Commonsense). In a zero-shot setting, it is competitive with fine-tuned models on multiple downstream tasks, such as event prediction and event forecasting, while providing explainable causal event chains.
CausalRAG: Integrating Causal Graphs into Retrieval-Augmented Generation: This paper proposes CausalRAG, which integrates causal graphs into the retrieval process of RAG. It builds a text graph from documents and identifies causal relationships among its nodes. At query time, it retrieves context through causal path discovery and causal summary generation, significantly improving context precision (92.86%) and retrieval recall in document question answering.
CoA-Reasoning: Explorations on Counterfactual Analysis in Physical Reasoning of LVLMs: This paper proposes the CoA-Reasoning framework to systematically evaluate and enhance the causal understanding of Large Vision-Language Models (LVLMs) in physical world reasoning by constructing counterfactual scenarios, revealing significant limitations of existing models in counterfactual physical reasoning.
Counterfactual-Consistency Prompting for Relative Temporal Understanding in Large Language Models: This paper proposes Counterfactual-Consistency Prompting, a method that addresses the inconsistency in temporal reasoning of large language models (LLMs) by generating counterfactual questions and imposing collective constraints, achieving significant improvements across multiple temporal understanding datasets.
Counterfactual Explanations for Aspect-Based Sentiment Analysis: This paper proposes a method for generating counterfactual explanations for aspect-based sentiment analysis (ABSA). By finding the minimal text edits that flip the sentiment polarity of a specific aspect, it provides intuitive causal explanations for ABSA model predictions.
FitCF: A Framework for Automatic Feature Importance-guided Counterfactual Example Generation: This paper proposes the FitCF framework, which leverages BERT-based feature attribution methods (such as LIME/IG/SHAP) to extract important words to guide Large Language Models (LLMs) in generating counterfactual examples under a zero-shot setting (ZeroCF). After filtering through label-flip validation, these examples are utilized as few-shot demonstrations. This approach consistently outperforms three baseline methods (Polyjuice, BAE, FIZLE) on news classification and sentiment analysis tasks.
IRIS: An Iterative and Integrated Framework for Verifiable Causal Discovery: This paper proposes the IRIS framework. Requiring only a set of initial variable names as input, it automatically retrieves documents, extracts variable values to construct structured data, and builds causal graphs via hybrid causal discovery (the GES statistical algorithm combined with LLM-based causal verification). It also iteratively expands the variable set using a missing variable proposal component, relaxing the acyclicity and causal sufficiency assumptions of traditional methods. IRIS outperforms zero-shot, CoT, and RAG baselines in F1 score across six datasets, including Cancer, Diabetes, Obesity, ADNI, and Insurance.
Leveraging Variation Theory in Counterfactual Data Augmentation for Optimized Active Learning: This paper introduces Variation Theory into the Counterfactual Data Augmentation (CDA) framework, generating counterfactual samples using LLMs while preserving neuro-symbolic patterns, and incorporating a three-stage filtering pipeline to select high-quality data. This approach optimizes few-shot text classification in active learning, achieving significant F1 improvements across multiple datasets.
On the Reliability of Large Language Models for Causal Discovery: Using the pre-training corpora that are publicly accessible for open-source LLMs (OLMo, BLOOM), this study empirically validates the "Causal Parrot" hypothesis: an LLM's ability to identify a causal relationship is highly correlated with the frequency of that relationship in the pre-training data (Spearman \(r=0.9\)). It also shows that the presence of erroneous causal relationships and changes in context significantly affect prediction reliability.
Reasoning is All You Need for Video Generalization: A Counterfactual Benchmark with Sub-question Evaluation: This paper proposes COVER (COunterfactual VidEo Reasoning), a multi-dimensional video counterfactual reasoning benchmark. It classifies evaluation tasks into four quadrants comprising 13 categories across two dimensions (abstract-concrete and perception-cognition). By decomposing complex questions into sub-questions (necessary conditions), the benchmark reveals that sub-question accuracy is strongly correlated with counterfactual reasoning ability, and enhancing reasoning capacity is the key to improving robustness in video understanding.
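
The Spearman correlation cited in the "On the Reliability of Large Language Models for Causal Discovery" note above measures how well the ranking of one quantity matches the ranking of another. The following is a minimal sketch of that statistic in plain Python, not the authors' code. The helper names (`average_ranks`, `spearman`) and the frequency and accuracy values are made up for illustration and are not taken from the paper.

```python
from typing import Sequence


def average_ranks(values: Sequence[float]) -> list[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        # Extend the window over every item tied with the one at `start`.
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one input is constant")
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: how often five causal relations occur in a corpus,
# and how accurately a model identifies each one. Illustrative values only.
relation_frequency = [3, 10, 50, 200, 1200]
model_accuracy = [0.40, 0.55, 0.50, 0.80, 0.90]

rho = spearman(relation_frequency, model_accuracy)
print(f"Spearman correlation on the toy data: {rho:.2f}")
```

Because the statistic depends only on ranks, a relation that is 1000 times more frequent counts no more than one that is slightly more frequent; only the ordering matters. In practice a library routine such as `scipy.stats.spearmanr` would normally be used instead of a hand-written version.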