🔗 Causal Inference

🧪 ICML2025 · 17 paper notes

📌 Same area in other venues: 📷 CVPR2026 (4) · 🔬 ICLR2026 (64) · 💬 ACL2026 (7) · 🧪 ICML2026 (19) · 🤖 AAAI2026 (7) · 🧠 NeurIPS2025 (20)

Causal Abstraction Inference under Lossy Representations: This paper proposes the Projected Abstraction framework, breaking the reliance of existing causal abstraction theory on the "Abstract Invariance Condition (AIC)." This enables mathematically consistent causal inference under lossy/dimension-reduced representations and provides identifiability criteria at the graphical model level.
Causal Effect Identification in lvLiNGAM from Higher-Order Cumulants: In the Linear Non-Gaussian Acyclic Model with latent confounding (lvLiNGAM), this paper identifies causal effects using higher-order cumulants (instead of only the covariance matrix). It addresses two challenging settings: (1) a single proxy variable that may affect the treatment; and (2) the underdetermined instrumental variable (IV) problem, where there are fewer IVs than treatments. Identifiability is proved and consistent estimators are provided for both cases.
Causal Evidence for the Primordiality of Colors in Trans-Neptunian Objects: Using a model-agnostic causal discovery method (the FCI algorithm), this paper demonstrates with 98.7% confidence that the color of Trans-Neptunian Objects (TNOs) is the root cause of their orbital inclination distribution. This provides strong support for the "primordial" hypothesis of TNO colors—implying that color reflects the formation location rather than post-formation collisional evolution.
Classifier Reconstruction Through Counterfactual-Aware Wasserstein Prototypes: This paper proposes using Wasserstein barycenters to fuse original and counterfactual samples into class prototypes, enabling high-fidelity reconstruction of target binary classifiers under limited query budgets and effectively mitigating the decision boundary shift problem caused by the naive use of counterfactual samples.
Doubly Protected Estimation for Survival Outcomes Utilizing External Controls for Randomized Clinical Trials: Proposes a doubly protected estimation framework for survival outcomes. It corrects covariate shift via density ratio weighting and detects outcome drift via a DR-Learner, so that only comparable external controls are borrowed. The result is robust to external data heterogeneity while guaranteeing consistency and efficiency gains.
E-LDA: Toward Interpretable LDA Topic Models with Strong Guarantees in Logarithmic Parallel Time: Proposes E-LDA (Exemplar-LDA), which reformulates the MAP topic-word assignment problem of LDA as a monotone submodular function maximization. This yields the first practical algorithm with a \(1-1/e\) approximation guarantee; it converges in logarithmic parallel time while ensuring that each learned topic has formal keyword-based interpretability.
Estimating Causal Effects in Gaussian Linear SCMs with Finite Data: This work proposes the Centralized Gaussian Linear SCM (CGL-SCM), which significantly reduces the parameter space by standardizing exogenous variables to \(\mathcal{N}(0,1)\), and designs an EM-based estimation algorithm to accurately recover identifiable causal effects under finite observational data.
Exogenous Isomorphism for Counterfactual Identifiability: This paper proposes the concept of Exogenous Isomorphism (EI), proving that \(\sim_{\mathrm{EI}}\)-identifiability implies \(\sim_{\mathcal{L}_3}\)-identifiability (complete counterfactual layer identifiability). It provides sufficient conditions for achieving EI in two special classes of models: Bijective SCMs (BSCMs) and Triangular Monotonic SCMs (TM-SCMs), thereby unifying and generalizing existing counterfactual identifiability theories.
Internal Causal Mechanisms Robustly Predict Language Model Out-of-Distribution Behaviors: Using identified internal causal mechanisms in LLMs to predict model output correctness on out-of-distribution (OOD) inputs, this work proposes two methods—counterfactual simulation and value probing—achieving an average AUC-ROC improvement of 13.84% over existing baselines in OOD settings.
Isolated Causal Effects of Natural Language: Proposes a formal estimation framework for the "Isolated Causal Effect," which isolates the causal effect of focal language attributes from correlated non-focal language using a doubly robust estimator and omitted variable bias (OVB) sensitivity analysis.
Latent Variable Causal Discovery under Selection Bias: Extends rank constraints to selection bias scenarios for the first time, proving that under linear selection mechanisms, the rank of the biased covariance matrix still preserves information about the causal structure and selection mechanism. It proposes a generalized t-separation graphical criterion, proves identifiability in one-factor models, and validates effectiveness on both synthetic and real-world datasets (World Values Survey, Big Five Personality).
Learning Time-Aware Causal Representation for Model Generalization in Evolving Domains: This paper proposes a time-aware structural causal model (time-aware SCM) and the SYNC method. By simultaneously learning static and dynamic causal representations and modeling causal mechanism drift, it effectively eliminates spurious correlations in evolving domain generalization (EDG) tasks, achieving superior temporal generalization performance.
MPF: Aligning and Debiasing Language Models post Deployment via Multi Perspective Fusion: Proposes Multiperspective Fusion (MPF), a tuning-free, post-deployment alignment framework that guides LLMs to generate responses aligned with human baselines by decomposing baseline sentiment distributions into interpretable perspective components, thereby effectively mitigating model bias.
Position: Causal Machine Learning Requires Rigorous Synthetic Experiments for Broader Adoption: This position paper argues that synthetic experiments are indispensable for the rigorous evaluation of causal machine learning (Causal ML) methods, but that current synthetic experimental designs suffer from bias and insufficient complexity. It proposes a set of principles for improving experimental quality, with the goal of facilitating the broader adoption of Causal ML.
RATE: Causal Explainability of Reward Models with Imperfect Counterfactuals: Proposes RATE (Rewrite-based Attribute Treatment Estimator), which uses a "double rewriting" strategy to eliminate bias introduced by imperfect LLM counterfactual rewrites, enabling accurate estimation of the causal effects of high-level attributes on reward model scores.
RE-IMAGINE: Symbolic Benchmark Synthesis for Reasoning Evaluation: Inspired by Pearl's Ladder of Causation, this work proposes the RE-IMAGINE framework. By translating questions into intermediate symbolic representations (code) and executing multi-level mutations on the computation graph, the framework generates benchmark variants that cannot be solved by memorization, systematically evaluating the genuine reasoning capabilities of LLMs.
Transformer-Based Spatial-Temporal Counterfactual Outcomes Estimation: Proposes a Transformer-based spatial-temporal counterfactual outcomes estimation framework that utilizes CNNs to compute high-dimensional propensity scores and Transformers to estimate intensity functions, outperforming traditional causal inference methods on both synthetic and real-world data.
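
Several of the notes above (for example, Isolated Causal Effects of Natural Language) rely on doubly robust estimation. As a rough illustration of that general idea, here is a minimal sketch of the textbook augmented inverse-probability-weighted (AIPW) estimator of an average treatment effect. It is not taken from any of the listed papers: the function name `aipw_ate`, the simulated data, and the deliberately misspecified nuisance models are all assumptions made for this example.

```python
import numpy as np


def aipw_ate(y, t, e_hat, mu1_hat, mu0_hat):
    """Textbook AIPW (doubly robust) estimate of the average treatment effect.

    y: outcomes, t: binary treatment indicator, e_hat: estimated propensity
    P(T=1 | X), mu1_hat / mu0_hat: estimated outcome means under treatment
    and control. All arguments are 1-D arrays of equal length.
    """
    y = np.asarray(y, dtype=float)
    t = np.asarray(t, dtype=float)
    # Outcome-model prediction plus an inverse-propensity-weighted residual.
    psi1 = mu1_hat + t * (y - mu1_hat) / e_hat
    psi0 = mu0_hat + (1 - t) * (y - mu0_hat) / (1 - e_hat)
    return float(np.mean(psi1 - psi0))


# Hypothetical simulation: one confounder x, true ATE fixed at 2.0.
rng = np.random.default_rng(0)
n = 200_000
x = rng.normal(size=n)
e_true = 1 / (1 + np.exp(-x))           # true propensity score
t = rng.binomial(1, e_true)
y = 2.0 * t + x + rng.normal(size=n)

# Case 1: correct outcome model, deliberately wrong (constant) propensity.
est_outcome_ok = aipw_ate(y, t, np.full(n, 0.5), 2.0 + x, x)

# Case 2: deliberately wrong (all-zero) outcome model, correct propensity.
est_propensity_ok = aipw_ate(y, t, e_true, np.zeros(n), np.zeros(n))

# Unadjusted difference in means, which is confounded by x.
naive = y[t == 1].mean() - y[t == 0].mean()

print(est_outcome_ok, est_propensity_ok, naive)
```

In this simulation both AIPW estimates should land close to the true effect even though one nuisance model is wrong in each case, while the unadjusted difference in means is biased by the confounder. In practice the nuisance functions would be estimated from data rather than supplied by hand.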