📂 Others

🤖 AAAI2026 · 126 paper notes

A Fast Heuristic Search Approach for Energy-Optimal Profile Routing for Electric Vehicles: This paper proposes Pr-A*, a label-setting method based on multi-objective A* search for efficiently solving energy-optimal profile routing for electric vehicles (EVs) when the initial state of charge (SoC) is unknown. By using profile dominance pruning, the method avoids the complex profile merge operations required by traditional approaches, achieving performance close to standard A* with known initial SoC on large-scale road networks.
A Graph-Theoretical Perspective on Law Design for Multiagent Systems: This paper studies the law design problem in multiagent systems from a graph-theoretical perspective, reducing the minimization of useful laws and gap-free laws to the vertex cover problem on hypergraphs, proving NP-hardness, and providing approximation algorithms.
A New Strategy for Verifying Reach-Avoid Specifications in Neural Feedback Systems: This paper proposes FaBRe (Forward and Backward Reachability), a unified framework that, for the first time, develops both over- and under-approximation algorithms for backward reachable sets of ReLU neural network controllers (GSS/ICH/LEB), and integrates them with forward reachability analysis to construct a unified reach-avoid verification framework, aiming to overcome the scalability bottleneck of purely forward analysis.
A Phase Transition for Opinion Dynamics with Competing Biases: This paper models the competition between two opposing forces — external subversive bias and individual stubbornness — on binary opinion spreading over directed random graphs. It proves that the system exhibits a sharp phase transition: when the bias exceeds a critical threshold \(p_c\), the population rapidly reaches a new consensus; below the threshold, the system remains in a long-lived metastable polarized state. The critical point is determined solely by two simple statistics of the degree sequence.
A Switching Framework for Online Interval Scheduling with Predictions: For the irrevocable online interval scheduling problem, this paper proposes the SemiTrust-and-Switch framework and the SmoothMerge randomized algorithm. By switching between or blending a prediction-trusting strategy and a classical greedy algorithm, the approach achieves near-optimal performance when predictions are accurate (consistency) and degrades gracefully when predictions are erroneous (robustness and smoothness). Tightness of the framework on specific instances is also established.
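
The greedy side of such a switching scheme can be pictured with a minimal sketch. It assumes intervals arrive one at a time as half-open `(start, end)` pairs and that an accepted interval is never dropped; `overlaps` and `greedy_online` are hypothetical names, and the prediction-trusting strategy and the switching rule from the paper are not shown.

```python
def overlaps(a, b):
    """Half-open intervals [start, end) overlap iff each starts before the other ends."""
    return a[0] < b[1] and b[0] < a[1]


def greedy_online(arrivals):
    """Accept each arriving interval iff it conflicts with nothing accepted so far."""
    accepted = []
    for interval in arrivals:
        if all(not overlaps(interval, kept) for kept in accepted):
            accepted.append(interval)  # irrevocable: never removed later
    return accepted


# The long first interval blocks two short ones that arrive afterwards.
schedule = greedy_online([(0, 10), (1, 2), (3, 4), (10, 12)])
```

On this input the greedy rule keeps two intervals, while an offline schedule could fit three; a reliable prediction about future arrivals is the kind of information that could close that gap.
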
A Topological Rewriting of Tarski's Mereogeometry: This work extends the λ-MM library within the Coq theorem prover to recast Tarski's solid geometry—grounded in Leśniewski's mereology—into a fully formalized system with a complete topological structure. It proves that mereological classes correspond to regular open sets, satisfy Kuratowski's interior axioms, and exhibit the Hausdorff (T2) separation property, thereby providing a unified mereological–geometric–topological theoretical framework for qualitative spatial reasoning.
UniShape: A Unified Shape-Aware Foundation Model for Time Series Classification: This paper proposes UniShape — the first shape-aware foundation model for time series classification (TSC). It captures class-discriminative temporal patterns via a shape-aware adapter that adaptively aggregates multi-scale subsequences (shapes), and jointly learns transferable shapelet representations at both instance and shape levels through a prototype-based pretraining module. Pretrained on 1.89M samples, UniShape achieves an average accuracy of 0.8708 across 128 UCR datasets, surpassing all baselines.
Agent-SAMA: State-Aware Mobile Assistant: This paper proposes Agent-SAMA, which for the first time introduces a finite state machine (FSM) into mobile GUI agents, modeling UI screens as states and user actions as transitions. Four specialized agents collaborate to achieve state-aware task planning, execution verification, and error recovery, improving success rate by up to 12% and recovery rate by 13.8% on cross-app benchmarks.
Align When They Want, Complement When They Need! Human-Centered Ensembles for Adaptive Human-AI Collaboration: This paper reveals a fundamental trade-off between complementarity and alignment in human-AI collaboration—no single model can simultaneously optimize both objectives. It proposes an adaptive AI ensemble framework that dynamically switches between an alignment model and a complementarity model via a Rational Routing Shortcut (RRS) mechanism, achieving up to 9% improvement in team accuracy over standard AI.
AMS-IO-Bench and AMS-IO-Agent: Benchmarking and Structured Reasoning for Analog and Mixed-Signal Integrated Circuit Input/Output Design: This paper proposes AMS-IO-Agent, a domain-specific LLM-based agent that transforms natural language design intent into production-ready analog and mixed-signal IC I/O ring designs via a structured Intent Graph and a domain knowledge base. It also introduces AMS-IO-Bench, the first benchmark for AMS I/O ring automation. The agent-generated I/O ring is validated in a 28nm CMOS tape-out and demonstrated to be directly applicable to real chip fabrication.
An Epistemic Perspective on Agent Awareness: This paper is the first to treat agent awareness as a form of knowledge, distinguishing two awareness modalities — de re (concerning physical objects) and de dicto (concerning concepts/descriptions) — and proposes a sound and complete logical system grounded in 2D semantics to characterize the interaction between these two modalities and the standard "factual knowledge" modality.
Approximation Algorithm for Constrained k-Center Clustering: A Local Search Approach: This paper studies the k-center clustering problem with instance-level cannot-link (CL) and must-link (ML) constraints. It proposes a local search framework based on a dominating matching set (DMS) reduction, and, under the disjoint CL sets condition, is the first to achieve the optimal approximation ratio of 2 via local search—resolving an open problem in the field.
Area-Optimal Control Strategies for Heterogeneous Multi-Agent Pursuit: This paper studies pursuit-evasion games with heterogeneous speeds involving multiple pursuers and a single evader. The evader's safe reachable set is defined as the intersection of Apollonius circles for all pursuer–evader pairs. The capture strategy is modeled as a zero-sum game in which pursuers minimize and the evader maximizes the area of this intersection. Closed-form instantaneous optimal heading control laws are derived, and simulations verify that pursuers can systematically shrink the safe region to guarantee capture.
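
The Apollonius-circle construction behind the safe region can be sketched as follows. This is an illustration under stated assumptions, not the paper's control law: the evader is taken to be strictly slower than every pursuer, positions are 2D tuples, and `apollonius_circle` and `in_safe_region` are hypothetical helper names. For an evader at \(E\), a pursuer at \(P\), and speed ratio \(k = v_E / v_P < 1\), the circle of points \(X\) with \(|X - E| = k\,|X - P|\) has center \((E - k^2 P)/(1 - k^2)\) and radius \(k\,|E - P|/(1 - k^2)\).

```python
import math


def apollonius_circle(evader, pursuer, speed_ratio):
    """Circle of points X with |X - evader| = speed_ratio * |X - pursuer|.

    speed_ratio = evader speed / pursuer speed, assumed to lie in (0, 1).
    """
    if not 0.0 < speed_ratio < 1.0:
        raise ValueError("sketch assumes a strictly slower evader")
    k2 = speed_ratio ** 2
    ex, ey = evader
    px, py = pursuer
    center = ((ex - k2 * px) / (1.0 - k2), (ey - k2 * py) / (1.0 - k2))
    radius = speed_ratio * math.hypot(ex - px, ey - py) / (1.0 - k2)
    return center, radius


def in_safe_region(point, evader, pursuers, speed_ratios):
    """True if the point lies inside every pursuer's Apollonius circle."""
    for pursuer, ratio in zip(pursuers, speed_ratios):
        (cx, cy), r = apollonius_circle(evader, pursuer, ratio)
        if math.hypot(point[0] - cx, point[1] - cy) > r:
            return False
    return True


center, radius = apollonius_circle((0.0, 0.0), (3.0, 0.0), 0.5)
```

Adding a second pursuer on the opposite side intersects a second circle with the first, which is how the safe region shrinks as pursuers close in.
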
Automated Reproducibility Has a Problem Statement Problem: This paper proposes a formalized problem definition of reproducibility grounded in the scientific method, representing empirical AI research as a hypothesis–experiment–interpretation graph structure. An LLM is used to automatically extract this structure from 20 papers, and the extracted results are validated through review by the original authors.
Autonomous Concept Drift Threshold Determination: This paper proves that no fixed threshold can be optimal across all scenarios and that dynamic thresholds strictly dominate static ones. It proposes the DTD algorithm, which starts a three-model comparison phase whenever a drift detection signal fires and adaptively adjusts the detection threshold based on the performance of the candidate models.
Bandit Learning in Housing Markets: This paper is the first to introduce the multi-armed bandit (MAB) framework into housing markets (one-sided matching markets). It defines regret based on the core solution concept, proposes a decentralized ETC algorithm and a centralized UCB algorithm, proves a decentralized regret upper bound of \(\mathcal{O}(N\log T / \Delta_{\min}^2)\) along with a matching lower bound, and establishes order-optimality.
Bayesian Network Structural Consensus via Greedy Min-Cut Analysis: This paper proposes the MCBNC algorithm, which quantifies the structural support of each edge via min-cut analysis and embeds this scoring into the backward phase of Greedy Equivalence Search (GES) to iteratively prune redundant edges from a fused Bayesian network. The method produces sparser and more accurate consensus structures without accessing any data, making it well-suited for federated learning scenarios.
Beyond World Models: Rethinking Understanding in AI Models: Through three case studies drawn from the philosophy of science — a domino computer, a mathematical proof, and Bohr's atomic theory — this paper argues that the world model framework is insufficient to characterize human-level "understanding," demonstrating that tracking states and state transitions alone cannot capture the abstract reasoning, motivational insight, and problem-context awareness that understanding requires.
Bilevel MCTS for Amortized O(1) Node Selection in Classical Planning: This paper proposes Bilevel MCTS, which launches a depth-proportional budgeted best-first search at the leaf node selected by MCTS, reducing the amortized node-selection complexity from \(O(\log N)\) to \(O(1)\). Complemented by Tree Collapsing to reduce the number of action-selection steps, these components are integrated into the Nεbula planner, which solves 192.2/230.6 problems on IPC2018/2023 benchmarks (5min/30min), outperforming all prior SOTA planners including LAMA, DecStar, NOLAN, and SM-Type-LAMA.
Bipartite Mode Matching for Vision Training Set Search from a Hierarchical Data Server: This paper proposes a hierarchical data server combined with a Bipartite Mode Matching (BMM) framework. It organizes large-scale source data via multi-granularity hierarchical clustering and employs the Hungarian algorithm to perform one-to-one matching between semantic modes of the source and target domains, thereby retrieving a training set that minimizes the distributional gap to the target domain. The approach significantly outperforms existing training set search methods on person re-identification and object detection tasks.
Boosting Adversarial Transferability via Ensemble Non-Attention: This paper proposes NAMEA (Non-Attention Meta Ensemble Attack), which for the first time exploits the non-attention areas of ensemble models to integrate transferable information from both CNNs and ViTs, and combines meta-learning gradient optimization to achieve an average improvement of 15.0% and 9.6% over the state-of-the-art methods AdaEA and SMER, respectively, on cross-architecture adversarial transferability.
Bridging the Skills Gap: A Course Model for Modern Generative AI Education: This paper proposes a generative AI application course model for undergraduate and graduate computer science students. A mixed-methods survey demonstrates that the course is effective in bridging the generative AI skills gap between industry and academia, with students broadly rating it as valuable and impactful.
Cash Flow Underwriting with Bank Transaction Data: Advancing MSME Financial Inclusion in Malaysia: This paper proposes an end-to-end cash flow underwriting workflow based on bank transaction data and constructs the first Malaysian MSME bank statement dataset (611 loan records). It demonstrates that features derived from bank transactions improve a logistic regression model's AUROC from 0.672 to 0.850 compared to traditional application information alone, significantly enhancing credit assessment capability for MSMEs lacking credit histories.
CASL: Curvature-Augmented Self-supervised Learning for 3D Anomaly Detection: This work identifies point cloud curvature as a powerful cue for anomaly detection and proposes CASL, a curvature-augmented self-supervised learning framework. By guiding coordinate reconstruction with multi-scale curvature prompts, CASL learns generalizable 3D representations without any anomaly-detection-specific mechanisms, achieving a 5.6% O-AUROC improvement over the previous state of the art on Real3D-AD.
CAT-Net: A Cross-Attention Tone Network for Cross-Subject EEG-EMG Fusion Tone Decoding: This paper proposes CAT-Net (Cross-Attention Tone Network), which achieves Mandarin four-tone classification using only 20 EEG channels and 5 EMG channels via spatial-temporal feature extraction branches, a cross-attention fusion mechanism, and domain adversarial training. The model achieves 87.83%/88.08% accuracy under voiced/silent speech conditions and 83.27%/85.10% under cross-subject evaluation, outperforming all 8 baseline methods.
CellStream: Dynamical Optimal Transport Informed Embeddings for Reconstructing Cellular Trajectories from Snapshots Data: This paper proposes CellStream, a deep learning framework that jointly learns an autoencoder and unbalanced dynamical optimal transport (OT) to simultaneously obtain low-dimensional embeddings and continuous cellular dynamics from discrete-time single-cell snapshot data, achieving significant improvements over existing methods in temporal consistency and velocity consistency.
Center-Outward q-Dominance: A Sample-Computable Proxy for Strong Stochastic Dominance in Multi-Objective Optimisation: Building on the center-outward distribution function from optimal transport theory, this paper proposes the q-dominance relation as a computable approximation of strong first-order stochastic dominance (strong FSD). It proves that q-dominance over the full quantile range implies strong FSD, derives explicit sample-size thresholds for Type I error control, and validates practical utility in hyperparameter tuning ranking and noisy multi-objective optimisation.
Certified Branch-and-Bound MaxSAT Solving (Extended Version): This paper introduces VeriPB-based certification for Branch-and-Bound MaxSAT solvers, covering two core techniques: look-ahead bounding methods and multi-valued decision diagram (MDD) encodings. Experiments on the MaxCDCL solver demonstrate a median proof logging overhead of only 19%, filling the last remaining gap in certified MaxSAT solving paradigms.
Certified but Fooled! Breaking Certified Defences with Ghost Certificates: This paper proposes GhostCert, a salient-region-based adversarial attack that misleads classifiers while maintaining imperceptible perturbations and forging large-radius robustness certificates (ghost certificates). On ImageNet, GhostCert achieves substantially higher attack success rates and larger spoofed certification radii than Shadow Attack against state-of-the-art certified defences including DensePure.
Clinician-in-the-Loop Smart Home System to Detect Urinary Tract Infection Flare-Ups via Uncertainty-Aware Decision Support: This paper proposes a clinician-in-the-loop smart home system that extracts behavioral markers from ambient sensor data and introduces a novel Conformal Calibrated Interval (CCI) method to quantify predictive uncertainty, enabling reliable detection of urinary tract infection (UTI) flare-ups in older adults and supporting an "abstain when uncertain" decision paradigm.
CompTrack: Information Bottleneck-Guided Low-Rank Dynamic Token Compression for Point Cloud Tracking: This paper proposes CompTrack—the first 3D single object tracking framework that simultaneously addresses both spatial redundancy and information redundancy in LiDAR point clouds. A Spatial Foreground Predictor (SFP) filters background noise via information entropy, while an Information Bottleneck-guided Dynamic Token Compression (IB-DTC) module estimates effective rank via online SVD and compresses foreground tokens into compact proxy tokens. CompTrack achieves state-of-the-art performance on nuScenes and Waymo while running in real time at 90 FPS.
Controllable Financial Market Generation with Diffusion Guided Meta Agent: This paper proposes the Diffusion Guided Meta Agent (DigMA), which formalizes controllable financial market generation as a conditional generation task. A conditional diffusion model captures the dynamics of market states (time-varying distribution parameters of mid-price returns and order arrival rates), while a Meta Agent with financial economics priors generates order flow under the guidance of the controller. DigMA outperforms existing methods in both controllability and generation fidelity.
Cost-Free Neutrality for the River Method: For the Parallel Universes Tiebreaking (PUT) problem applied to the River voting method, this paper proves that the winner set can be computed in polynomial time (in contrast to the NP-completeness of Ranked Pairs), and proposes the Fused-Universe (FUN) algorithm, which simulates all possible tiebreaking orders in a single pass and provides a constructive certificate for each winner.
Data Complexity of Querying Description Logic Knowledge Bases under Cost-Based Semantics: This paper systematically investigates the data complexity of query answering over weighted description logic knowledge bases under cost-based semantics. It establishes that optimal-cost semantics is decidable within \(\Delta_2^p\), and delivers a surprising positive result: for DL-Lite\(_{\text{bool}}^{\mathcal{H}}\) ontologies with a fixed cost bound, both certain answers to instance queries and possible answers to conjunctive queries admit first-order rewritings, achieving the lowest possible data complexity (AC\(^0\)).
Deadline-Aware, Energy-Efficient Control of Domestic Immersion Hot Water Heaters: This paper proposes a deadline-aware energy-efficient control method for domestic hot water heaters. Using a Gymnasium-based simulation environment, it benchmarks a bang-bang baseline, an MCTS planner, and a PPO policy, demonstrating that PPO achieves up to 69% energy savings under identical physical conditions.
Decomposition and Preprocessing of Ternary Constraint Networks: This paper proposes a complete theoretical framework for formally decomposing arbitrary discrete constraint networks into ternary constraint networks (TCNs), and reduces the variable/constraint blowup introduced by decomposition from a median of 8x/6x to 4.8x/4.3x via seven preprocessing techniques (propagation, algebraic simplification, common subexpression elimination, etc.), providing a regularized data layout for efficient constraint solving on GPU hardware.
DECOR: Deep Embedding Clustering with Orientation Robustness: This paper proposes DECOR, a framework that achieves orientation-robust clustering of wafer map defect patterns via a rotation-invariant equivariant convolutional autoencoder (RCAE), non-parametric clustering (DeepDPM), and an ensemble anomaly detection mechanism.
DeepRWCap: Neural-Guided Random-Walk Capacitance Solver for IC Design: This paper proposes DeepRWCap, a machine-learning-guided random-walk capacitance solver that accelerates multi-dielectric capacitance extraction in IC design via a two-stage neural network architecture for transition kernel prediction, achieving an average error of 1.24% and 23% speedup across 10 industrial test cases.
Depth-Synergized Mamba Meets Memory Experts for All-Day Image Reflection Separation: This paper proposes DMDNet, which employs a depth-aware scanning strategy (DAScan) to guide Mamba toward salient structures, incorporates a depth-synergized state space model (DS-SSM) to suppress ambiguous feature propagation, and introduces a memory expert compensation module (MECM) to leverage cross-image historical knowledge, achieving all-day (daytime + nighttime) image reflection separation.
Description Logics with Two Types of Definite Descriptions: Complexity, Expressiveness, and Automated Deduction: This paper extends the description logic \(\mathcal{ALC}\) with two kinds of definite descriptions, local ones \(\{\iota C\}\) and global ones \(\iota C.D\), and proves that satisfiability is ExpTime-complete for all three resulting logics (\(\mathcal{ALC}\iota_L\), \(\mathcal{ALC}\iota_G\), and \(\mathcal{ALC}\iota\)). Furthermore, it establishes that global definite descriptions are strictly more expressive than local ones (\(\mathcal{ALC}\iota_L < \mathcal{ALC}\iota_G = \mathcal{ALC}\iota\)), and provides tableau-based decision procedures along with an experimental evaluation.
Designing Incident Reporting Systems for Harms from General-Purpose AI: Through a literature review and case studies of nine safety-critical industries (nuclear energy, aviation, healthcare, etc.), this paper proposes a seven-dimensional institutional design framework for AI incident reporting systems, providing systematic guidance for policy design of general-purpose AI incident reporting in the United States.
DeToNATION: Decoupled Torch Network-Aware Training on Interlinked Online Nodes: This paper proposes FlexDeMo — a hybrid sharding training strategy that integrates Fully Sharded Data Parallelism (FSDP) with decoupled momentum optimization. It applies FSDP sharding within nodes and synchronizes only the fast-moving momentum components across nodes, achieving loss convergence comparable to full-synchronization AdamW while substantially accelerating training.
Deviation Dynamics in Cardinal Hedonic Games: This paper establishes meta-theorems for deviation dynamics in cardinal hedonic games, showing that the computational complexity of determining whether deviation dynamics may or must converge can be derived directly from instances in which no stable outcome exists. The paper further proposes methods for finding individually rational and contractually individually stable partitions via deviation dynamics in additively separable hedonic games.
DFDT: Dynamic Fast Decision Tree for IoT Data Stream Mining on Edge Devices: This paper proposes DFDT (Dynamic Fast Decision Tree), a memory-constrained data stream mining algorithm for IoT edge devices. Through three coordinated mechanisms — activity-aware pre-pruning, dynamic grace period, and adaptive tie threshold — DFDT achieves an optimal trade-off among accuracy, memory usage, and runtime.
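
The terms "grace period" and "tie threshold" come from the conventional Hoeffding-tree split test, sketched below on the assumption that DFDT builds on that setup. The grace period is the number of instances a leaf collects between split checks, and the tie threshold forces a split when the two best candidates are nearly indistinguishable. The function names are hypothetical, and DFDT's adaptive versions of these parameters are not shown.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Epsilon such that, with probability at least 1 - delta, the true mean lies
    within epsilon of the sample mean of n observations spanning value_range."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, value_range, delta, n, tie_threshold):
    """Standard Hoeffding-tree test: split when the best attribute is clearly
    ahead, or when the bound is so tight that waiting longer is pointless."""
    eps = hoeffding_bound(value_range, delta, n)
    return (best_gain - second_gain) > eps or eps < tie_threshold
```

Because the bound shrinks as \(1/\sqrt{n}\), a fixed grace period and tie threshold trade memory and runtime against accuracy in a data-independent way, which is the trade-off the adaptive mechanisms target.
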
DiffMM: Efficient Method for Accurate Noisy and Sparse Trajectory Map Matching via One Step Diffusion: This paper proposes DiffMM, the first approach to introduce diffusion models into map matching. By combining a road-segment-aware trajectory encoder with a one-step Shortcut diffusion process, DiffMM achieves simultaneous improvements in accuracy and efficiency on sparse trajectories and complex road networks, with inference speed approximately 17× faster than the second-best method.
DS-ATGO: Dual-Stage Synergistic Learning via Forward Adaptive Threshold and Backward Gradient Optimization for Spiking Neural Networks: To address spike firing imbalance and gradient vanishing caused by membrane potential distribution shifts during SNN training, this paper proposes DS-ATGO — a dual-stage synergistic learning algorithm combining forward adaptive thresholding (AT) and backward threshold-driven gradient optimization (TGO) — achieving state-of-the-art performance on CIFAR-10/100 and ImageNet with low time-step latency.
Enhancing Control Policy Smoothness by Aligning Actions with Predictions from Preceding States: This paper proposes ASAP (Action Smoothing by Aligning Actions with Predictions from Preceding States), a reinforcement learning action smoothing method based on transition-induced similar state definitions. ASAP suppresses high-frequency action oscillations via a spatial constraint (aligning actions with predictions from the preceding state) and a temporal constraint (penalizing second-order action differences). It outperforms existing methods on Gymnasium and Isaac-Lab benchmarks.
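
The temporal constraint can be illustrated with a small sketch of a second-order difference penalty. It assumes scalar actions and a mean-squared form; the exact norm and weighting used by ASAP are not stated here, and `second_order_penalty` is a hypothetical name.

```python
def second_order_penalty(actions):
    """Mean squared second difference a[t+1] - 2*a[t] + a[t-1] of a scalar action sequence."""
    if len(actions) < 3:
        return 0.0  # no second difference is defined for fewer than three actions
    diffs = [
        actions[t + 1] - 2 * actions[t] + actions[t - 1]
        for t in range(1, len(actions) - 1)
    ]
    return sum(d * d for d in diffs) / len(diffs)


ramp = second_order_penalty([0.0, 1.0, 2.0, 3.0])       # steady ramp
chatter = second_order_penalty([0.0, 1.0, 0.0, 1.0])    # oscillating actions
```

A steady ramp incurs no penalty because its second differences vanish, whereas an oscillating sequence is penalized heavily, so the term targets jitter rather than motion itself.
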
Enhancing Noise Resilience in Face Clustering via Sparse Differential Transformer: This paper proposes a prediction-driven Top-K Jaccard similarity coefficient to improve neighbor purity, combined with a Sparse Differential Transformer (SDT) to eliminate noisy attention, achieving state-of-the-art performance on large-scale face clustering datasets such as MS-Celeb-1M.
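
As a point of reference, a plain Top-K Jaccard coefficient between two samples compares their K nearest-neighbor sets. The sketch below assumes each sample is given as one row of a similarity matrix; the function names are hypothetical, and the prediction-driven choice of K described in the paper is not shown.

```python
def top_k_neighbors(similarities, k):
    """Indices of the k largest entries in one row of a similarity matrix."""
    order = sorted(range(len(similarities)), key=lambda j: similarities[j], reverse=True)
    return set(order[:k])


def top_k_jaccard(sim_row_a, sim_row_b, k):
    """Jaccard overlap |A & B| / |A | B| of the two top-k neighbor sets."""
    na = top_k_neighbors(sim_row_a, k)
    nb = top_k_neighbors(sim_row_b, k)
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0


score = top_k_jaccard([1.0, 0.9, 0.8, 0.1], [0.9, 1.0, 0.2, 0.7], k=3)
```

Here the neighbor sets are `{0, 1, 2}` and `{0, 1, 3}`, so two of four distinct neighbors are shared.
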
Expandable and Differentiable Dual Memories with Orthogonal Regularization for Exemplar-free Continual Learning: This paper proposes EDD (Expandable and Differentiable Dual Memory), an exemplar-free continual learning method that decomposes data into reusable sub-features via differentiable shared and task-specific memories, combined with memory expansion-pruning and orthogonal regularization mechanisms. EDD surpasses 14 state-of-the-art methods on CIFAR-10/100 and Tiny-ImageNet, achieving final accuracies of 55.13%, 37.24%, and 30.11%, respectively.
Expressive Temporal Specifications for Reward Monitoring: This paper leverages quantitative linear temporal logic (LTLf[F]) to automatically synthesize quantitative reward monitors (QRMs) that generate dense, continuous-valued reward streams for reinforcement learning agents at runtime, fundamentally alleviating the sparse reward problem in long-horizon tasks under Boolean semantics.
Extreme Value Monte Carlo Tree Search for Classical Planning: This paper applies Peaks-Over-Threshold Extreme Value Theory (POT EVT) to provide a statistical foundation for Full Bellman Backup in MCTS for classical planning. It proposes the UCB1-Uniform bandit algorithm, which uses MLE under a uniform distribution (a special case of the Generalized Pareto distribution) to guide action selection, outperforming GBFS by 67.8 instances and Softmin-Type(h) by 33.2 instances under a \(10^4\) node budget on Pyperplan.
Faster Certified Symmetry Breaking Using Orders With Auxiliary Variables: By introducing auxiliary variables to encode lexicographic order in place of large-integer encodings, this work fundamentally redesigns the VeriPB proof system, achieving order-of-magnitude speedups in both proof generation and verification for certified SAT symmetry breaking, both theoretically and empirically.
Finding Diverse Solutions Parameterized by Cliquewidth: This paper extends the parameterized framework for finding diverse solutions from treewidth to the strictly more powerful graph parameter cliquewidth, proving that any monotone dynamic programming algorithm parameterized by a cliquewidth decomposition can be converted into an algorithm for the diverse version with minimal overhead. A new family of Venn diversity measures is also proposed.
Forest vs Tree: The (N, K) Trade-off in Reproducible ML Evaluation: This paper investigates the optimal trade-off between the number of samples \(N\) and the number of annotators per sample \(K\) in machine learning evaluation. Under a fixed budget \(N \times K\), by analyzing multi-annotator datasets and simulated distributions, the study finds that \(K > 10\) is generally optimal when annotator disagreement is considered, and the required total budget \(N \times K\) typically does not exceed 1000.
Forget Less by Learning from Parents Through Hierarchical Relationships: This paper proposes FLLP (Forget Less by Learning from Parents), a framework that mitigates catastrophic forgetting in custom diffusion models (CDMs) by establishing parent-child hierarchical relationships among concepts in hyperbolic space. It leverages the tree-structure modeling capability of the Lorentz manifold to preserve knowledge during new concept learning and enable continual concept integration.
Formal Abductive Latent Explanations for Prototype-Based Networks: This paper addresses the problem of misleading explanations in prototype-based networks (e.g., ProtoPNet) by proposing Abductive Latent Explanations (ALE), which construct formally guaranteed sufficient-condition explanations directly in latent space—without invoking external solvers—and scale to standard and fine-grained classification tasks across multiple datasets.
From Decision Trees to Boolean Logic: A Fast and Unified SHAP Algorithm: This paper proposes Woodelf, an algorithm that converts decision tree ensemble models into pseudo-Boolean functions in Weighted Disjunctive Normal Form (WDNF), enabling linear-time computation of both Background SHAP and Path-Dependent SHAP within a unified framework, achieving 16–31× CPU speedup and 24–333× GPU speedup on large-scale datasets.
From Sequential to Recursive: Enhancing Decision-Focused Learning with Bidirectional Feedback: This paper proposes the first Recursive Decision-Focused Learning (R-DFL) framework, which introduces a bidirectional feedback loop between the prediction module and the optimization module, breaking the unidirectional information flow of conventional sequential DFL. Two gradient propagation methods—explicit unrolling and implicit differentiation—are designed, achieving significant improvements in final decision quality on the newsvendor problem and bipartite matching problem.
Guided Perturbation Sensitivity (GPS): Detecting Adversarial Text via Embedding Stability and Word Importance: This paper proposes the Guided Perturbation Sensitivity (GPS) framework, which detects adversarial text samples by masking important words and measuring changes in embedding stability. GPS achieves 85%+ detection accuracy across 3 datasets, 3 attack types, and 2 models, and generalizes across datasets, attacks, and models without retraining.
CAE: Hierarchical Semantic Alignment for Image Clustering: By combining two complementary semantic sources — noun-level (WordNet) and caption-level (Flickr image captions) — and constructing a semantic space via optimal transport alignment followed by adaptive fusion, this work achieves training-free image clustering with a 4.2% accuracy improvement on ImageNet-1K.
Higher-Order Responsibility: This paper studies higher-order responsibility in sequential decision-making mechanisms and establishes two core theorems: (1) any mechanism with \(n\) agents is necessarily \(n\)-gap-free (i.e., a responsible agent can always be found at some order); (2) determining whether a mechanism is \(d\)-gap-free is \(\Pi_{2d+1}\)-complete.
How Hard is it to Explain Preferences Using Few Boolean Attributes?: This paper systematically investigates the computational complexity of explaining preference data using the Boolean Attribute Model (BAM). It proves that the problem is NP-complete for \(k \geq 3\) attributes and solvable in linear time for \(k \leq 2\); further, it provides a complete parameterized complexity landscape with respect to the number of voters \(n\), candidates \(m\), and attributes \(k\), and analyzes how problem hardness changes when partial information (cares/has) is known.
How Hard Is It to Rig a Tournament When Few Players Can Beat or Be Beaten by the Favorite?: This paper introduces two novel structural parameters — the in-degree \(k\) and out-degree \(\ell\) of the favorite player in the tournament digraph — for analyzing the Tournament Fixing Problem (TFP). It proves that TFP is FPT under both parameterizations, where the in-degree algorithm involves sophisticated structural analysis and the color coding technique.
How to Marginalize in Causal Structure Learning?: This paper employs tractable Probabilistic Circuits (PCs) as a replacement for traditional dynamic programming to perform marginalization in Bayesian structure learning. Through a novel two-stage training strategy—first learning full parent set scores and then progressively fine-tuning for marginal queries—the method eliminates the artificial restriction on the number of candidate parent sets, achieving improved posterior distribution estimation within the TRUST framework.
How Wide and How Deep? Mitigating Over-Squashing of GNNs via Channel Capacity Constrained Estimation: From an information-theoretic perspective, this paper models spectral GNNs as communication channels and proposes the Channel Capacity Constrained Estimation (C3E) framework, which formalizes the selection of GNN hidden dimensions and depth as a nonlinear programming problem. The framework estimates optimal architectural parameters prior to training, effectively mitigating over-squashing and consistently improving representation learning across 9 datasets.
HyperSHAP: Shapley Values and Interactions for Explaining Hyperparameter Optimization: HyperSHAP proposes a game-theoretic framework based on Shapley values and Shapley interactions to explain hyperparameter optimization (HPO). By defining four categories of explanation games—ablation, sensitivity, tunability, and optimizer bias—it provides more actionable hyperparameter importance analysis than fANOVA.
I2E: Real-Time Image-to-Event Conversion for High-Performance Spiking Neural Networks: I2E proposes an ultra-efficient image-to-event stream conversion framework that simulates microsaccadic eye movements and implements the conversion via highly parallelized convolutions, achieving over 300× speedup compared to prior methods. It enables online data augmentation during SNN training for the first time, achieves a state-of-the-art 60.50% event-based classification accuracy on I2E-ImageNet, and sets a new record of 92.5% on CIFAR10-DVS through a sim-to-real paradigm of synthetic pretraining followed by real-data fine-tuning.
Improved Differentially Private Algorithms for Rank Aggregation: This paper presents improved approximation algorithms for rank aggregation under differential privacy. It introduces the first study of differentially private footrule rank aggregation with a near-optimal algorithm (which also yields a 2-approximation for the Kemeny problem), and improves the additive error of the Kemeny PTAS by combining two-way marginal queries with an unbiasedness technique (reducing the exponent of \(m\) from 3 to 65/22).
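The link between the footrule and Kemeny objectives in the entry above rests on the classical Diaconis-Graham inequality \(K \leq F \leq 2K\), where \(K\) is the Kendall tau distance and \(F\) is the Spearman footrule distance between two rankings. The sketch below only computes the two distances and is not the paper's private algorithm; the function names and the dict-based ranking format are illustrative assumptions.

```python
from itertools import combinations

def footrule(r1, r2):
    """Spearman footrule: total displacement of each candidate's position."""
    return sum(abs(r1[c] - r2[c]) for c in r1)

def kendall_tau(r1, r2):
    """Kendall tau: number of candidate pairs ordered differently by r1 and r2."""
    return sum(
        (r1[a] - r1[b]) * (r2[a] - r2[b]) < 0
        for a, b in combinations(r1, 2)
    )

# Rankings map candidate -> position (0 = top); this format is an assumption.
r1 = {"a": 0, "b": 1, "c": 2}
r2 = {"a": 2, "b": 1, "c": 0}
K, F = kendall_tau(r1, r2), footrule(r1, r2)
print(K, F)
```

Because \(K \leq F \leq 2K\) holds for every pair of rankings, a ranking that minimizes the total footrule distance to the voters has total Kendall tau distance at most twice the Kemeny optimum.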
Intermediate N-Gramming: Deterministic and Fast N-Grams For Large N and Large Datasets: This paper proposes Intergrams, a multi-pass scanning algorithm that recursively uses shorter n-grams as prefixes to filter candidates for longer n-grams, fully exploiting the processor cache hierarchy to achieve cache-friendly memory access patterns. On TB-scale datasets, Intergrams achieves 6–33× speedup over the previously fastest hash-gramming method while recovering nearly all of the top-k n-grams.
Intrinsic Barriers and Practical Pathways for Human-AI Alignment: An Agreement-Based Complexity Analysis: This paper formalizes AI alignment as an \(\langle M,N,\varepsilon,\delta\rangle\)-agreement multi-objective optimization problem, proves information-theoretic lower bounds on alignment (encoding "all human values" is fundamentally intractable) from a communication complexity perspective, and derives explicit achievable algorithms and tight upper bounds for unbounded/bounded rational agents, revealing the theoretical basis for the global inevitability of reward hacking in large state spaces.
Judging by the Rules: Compliance-Aligned Framework for Modern Slavery Statement Monitoring: This paper proposes a training framework centered on a Compliance-Aligned Judge (CA-Judge) that trains a 3B-parameter CALLM model using rule-level alignment feedback, enabling the generation of traceable compliance judgments grounded in statutory provisions. The model surpasses GPT-4o and DeepSeek-R1 on sentence-level compliance classification of modern slavery statements.
LeanRAG: Knowledge-Graph-Based Generation with Semantic Aggregation and Hierarchical Retrieval: This paper proposes LeanRAG, a framework that employs a semantic aggregation algorithm to automatically construct explicit relations among summary nodes in a hierarchical knowledge graph, thereby breaking "semantic islands." Combined with a bottom-up retrieval strategy based on the Lowest Common Ancestor (LCA), LeanRAG efficiently navigates the hierarchical structure, achieving state-of-the-art performance on four QA benchmarks while reducing retrieval redundancy by 46%.
Learning Compact Latent Space for Representing Neural Signed Distance Functions with High-fidelity Geometry Details: This paper proposes a dual-branch architecture (generalization branch + overfitting branch) to learn a compact latent space over multiple neural SDFs. By combining a shared spatial feature grid with a novel bandwidth-based sampling strategy, the method recovers high-fidelity geometric details while maintaining compact latent codes, achieving state-of-the-art performance on Stanford Models, ShapeNet, and D-FAUST.
Learning Fair Representations with Kolmogorov-Arnold Networks: This paper proposes integrating Kolmogorov-Arnold Networks (KAN) into an adversarial debiasing framework, leveraging KAN's spline-based architecture to provide theoretical guarantees of Lipschitz continuity and smoothness. An adaptive \(\lambda\) update mechanism is introduced to dynamically balance fairness and accuracy. The approach achieves significant improvements on fairness metrics on the UCI college admissions dataset.
Learning Network Dismantling Without Handcrafted Inputs: This paper proposes MIND (Message Iteration Network Dismantler), which eliminates the dependence of GNNs on handcrafted features through a novel All-to-One attention mechanism and Message Iteration Profiles. Using only raw adjacency information, MIND achieves state-of-the-art network dismantling performance on real-world networks with up to millions of nodes, while maintaining the lowest computational complexity of \(O(|V|+|E|)\).
Life, Machine Learning, and the Search for Habitability: Predicting Biosignature Fluxes for the Habitable Worlds Observatory: To address the observation prioritization needs of NASA's Habitable Worlds Observatory (HWO), this paper proposes two architectures — a Bayesian Convolutional Neural Network (BCNN) and a novel Spectral Query-Adaptive Transformer (SQuAT) — for predicting biosignature species fluxes from planetary reflected spectra. Both achieve high predictive accuracy on an augmented dataset, with complementary strengths in uncertainty quantification and interpretability, respectively.
LILAD: Learning In-context Lyapunov-stable Adaptive Dynamics Models: This paper proposes LILAD, a framework that leverages the in-context learning (ICL) capability of GPT-2 to jointly learn a dynamics model and a Lyapunov function, achieving adaptive identification of non-stationary parametric dynamical systems while guaranteeing global exponential stability. LILAD outperforms baselines such as ICL and MAML on multiple benchmark systems.
Local Guidance for Configuration-Based Multi-Agent Pathfinding: This paper introduces the concept of Local Guidance (LG) to improve solution quality in LaCAM-based multi-agent pathfinding. By constructing local space-time paths for each agent at every configuration generation step, LG mitigates congestion and reduces solution cost by up to 50%, while still solving instances with 1,000 agents within a few seconds.
Lost in Time? A Meta-Learning Framework for Time-Shift-Tolerant Physiological Signal Transformation: This paper proposes ShiftSyncNet, a bi-level meta-learning optimization framework that trains a SyncNet to learn temporal offsets between training signal pairs and leverages the Fourier shift theorem to automatically correct label alignment, achieving waveform transformation accuracy improvements of 9.4%, 6.0%, and 12.8% across three datasets respectively.
Measuring Model Performance in the Presence of an Intervention: To address the bias in AI model evaluation under interventions, this paper proposes Nuisance Parameter Weighting (NPW), which applies causal reweighting to the treatment arm of RCT data to achieve unbiased AUROC estimation. The method achieves a 5× improvement in sample efficiency and substantially improves statistical power for model selection and hypothesis testing.
MF-Speech: Achieving Fine-Grained and Compositional Control in Speech Generation via Factor Disentanglement: This paper proposes MF-Speech, a framework that employs multi-objective optimization to disentangle speech signals into three high-purity, independent factor representations: content, timbre, and emotion. It then uses dynamic fusion and Hierarchical Style Adaptive Normalization (HSAN) to achieve fine-grained, compositional control in speech generation, significantly outperforming existing methods on multi-factor compositional speech generation tasks (WER=4.67%, SECS=0.5685).
MindCross: Fast New Subject Adaptation with Limited Data for Cross-subject Video Reconstruction from Brain Signals: This paper proposes MindCross, a cross-subject brain decoding framework that learns subject-independent information via a shared encoder and subject-specific information via \(N\) individual encoders. Combined with a fast calibration stage and a Top-K collaborative decoding module, a single unified model achieves performance comparable to per-subject models on fMRI/EEG-to-video benchmarks, with new subject adaptation requiring only minimal data and time (~1s vs. 5–17s for baselines).
Model Change for Description Logic Concepts: This paper studies the problem of modifying description logic concepts in response to new model evidence represented as pointed interpretations. It defines three operations — eviction, reception, and revision — and establishes positive and negative compatibility results for the EL and ALC description logics.
Model Counting for Dependency Quantified Boolean Formulas: This paper presents the first study of the model counting problem for Dependency Quantified Boolean Formulas (DQBF). It proves that #2-DQBF — restricted to only two existential variables — is already #EXP-complete, and implements a practical 2-DQBF model counter, sharp2DQR, based on BDD symbolic reachability. The proposed approach significantly outperforms unfolding-based baselines on instances with large dependency sets.
On the Edge of Core (Non-)Emptiness: An Automated Reasoning Approach to Approval-Based Multi-Winner Voting: This paper proposes an automated reasoning framework based on Mixed Integer Linear Programming (MILP) to investigate the major open problem of whether core stability always exists in approval-based multi-winner voting. The framework establishes new existence results, uncovers previously unknown relationships between core stability and other axioms (e.g., Lindahl pricability), and refutes an existing conjecture.
On the Information Processing of One-Dimensional Wasserstein Distances with Finite Samples: This paper analytically characterizes, via a Poisson process framework, the ability of the one-dimensional Wasserstein distance under finite samples to simultaneously encode pointwise density differences (rate difference) and support differences between probability density functions, and validates its practical utility on neural spike data and amino acid contact frequency data.
On the Variability of Concept Activation Vectors: This paper presents the first theoretical analysis of the variability of Concept Activation Vectors (CAVs) in the TCAV framework. It proves that the variance of CAVs decays at a rate of \(O(1/N)\) (where \(N\) is the number of random samples), while the variance of TCAV scores remains \(O(1)\) due to "boundary points," and can only be reduced to \(O(1/s)\) by averaging over multiple runs.
Online Linear Regression with Paid Stochastic Features: This paper studies a novel setting in online linear regression where features are corrupted by noise and the learner can pay to reduce noise intensity. It establishes that the optimal regret rate is \(\widetilde{\mathcal{O}}(\sqrt{T})\) when the noise covariance is known and \(\widetilde{\mathcal{O}}(T^{2/3})\) when unknown, and proves matching lower bounds showing that both rates are order-optimal in \(T\).
Optimal Welfare in Noncooperative Network Formation under Attack: In the noncooperative network formation game model proposed by Goyal et al. (WINE 2016), this paper proves that equilibrium networks created by selfish agents maintain asymptotically optimal social welfare \(n^2 - O(n)\) under a broad class of attackers called super-quadratic disruption (SQD) attackers, which includes the maximum disruption attacker, thereby resolving a long-standing open problem.
OR-R1: Automating Modeling and Solving of Operations Research Optimization Problems: OR-R1 proposes a data-efficient two-stage training framework (SFT + TGRPO) that achieves an average solving accuracy of 67.7% using only 1/10 of the synthetic data required by ORLM, surpassing existing SOTA methods. Additionally, test-time reinforcement learning reduces the performance gap between single-sample generation (Pass@1) and multi-sample generation (Pass@8) from 13% to 7%.
ParaMETA: Towards Learning Disentangled Paralinguistic Speaking Styles Representations: This paper proposes ParaMETA, a unified paralinguistic speaking style representation learning framework that achieves disentangled representations of speaking styles—including emotion, age, gender, and language—through META space regularization and task-specific subspace projection, while simultaneously supporting downstream multi-task classification and style-controllable speech synthesis.
Parameterized Approximation Algorithms for TSP on Non-Metric Graphs: This paper proposes improved FPT approximation algorithms for the Travelling Salesman Problem (TSP) on non-metric graphs, parameterized by \(p\) (the number of vertices involved in triangle inequality violations) and \(q\) (the size of a minimum violation set), improving the approximation ratio under parameter \(p\) from 2.5 to 1.5 and under parameter \(q\) from 11 to 3.
ParaRevSNN: A Parallel Reversible Spiking Neural Network for Efficient Training and Inference: This paper proposes ParaRevSNN, a parallel reversible spiking neural network architecture that decouples sequential computation constraints by redesigning the data dependencies between reversible blocks, achieving inter-block parallelism while preserving reversibility (memory efficiency). Training time is reduced by up to 35.2% and inference time to 18.15%.
PIPHEN: Physical Interaction Prediction with Hamiltonian Energy Networks: This paper proposes PIPHEN, a distributed physical cognition-control framework that employs a Physical Interaction Prediction Network (PIPN) for "semantic distillation" to compress high-dimensional perceptual data to less than 5% of the original data volume, while a Hamiltonian energy conservation-based HEN controller generates coordinated actions, thereby addressing the "shared brain dilemma" in multi-robot systems.
Predict and Resist: Long-Term Accident Anticipation under Sensor Noise: This paper proposes a unified framework that integrates a diffusion-based dual-level denoising module with a temporally-aware Actor-Critic reinforcement learning model to enable robust long-term traffic accident anticipation under sensor noise, achieving state-of-the-art performance on three benchmark datasets in terms of both average precision (AP) and mean time-to-accident (mTTA).
Private Frequency Estimation via Residue Number Systems: This paper proposes ModularSubsetSelection (MSS), a local differential privacy frequency estimation protocol based on the Residue Number System (RNS). MSS achieves estimation accuracy comparable to SubsetSelection and PGR while significantly reducing communication overhead (up to 50% less than SS), substantially accelerating server-side decoding (11–448× faster than PGR), and attaining the lowest data reconstruction attack success rate.
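For readers unfamiliar with the Residue Number System named in the entry above: an integer is represented by its remainders modulo pairwise coprime moduli and recovered with the Chinese Remainder Theorem. The sketch below shows only that representation; it contains no privacy mechanism and is not the MSS protocol. The function names and the moduli are illustrative assumptions.

```python
from math import prod

def to_residues(x, moduli):
    """Represent x by its remainders modulo each (pairwise coprime) modulus."""
    return [x % m for m in moduli]

def from_residues(residues, moduli):
    """Recover x in [0, prod(moduli)) via the Chinese Remainder Theorem."""
    M = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)  # modular inverse, Python 3.8+
    return total % M

moduli = [3, 5, 7]                 # example moduli, chosen for illustration
residues = to_residues(52, moduli)
print(residues, from_residues(residues, moduli))
```

Each residue is much smaller than the original value, which is the property an RNS-based protocol can exploit to shorten individual reports.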
Provably Data-Driven Projection Method for Quadratic Programming: This work extends data-driven projection matrix learning from linear programming (LP) to convex quadratic programming (QP). By proposing an "unrolled active set method" to model the computation of QP optimal values within the Goldberg–Jerrum (GJ) framework, it establishes a pseudo-dimension upper bound and generalization guarantees for projection matrix learning.
Radar-APLANC: Unsupervised Radar-based Heartbeat Sensing via Augmented Pseudo-Label and Noise Contrast: This paper proposes Radar-APLANC, the first unsupervised learning framework for radar-based heartbeat sensing. Through a noise contrastive triplet (NCT) loss and an augmented pseudo-label generator, it achieves two-stage unsupervised training without requiring expensive physiological signal annotations, attaining performance approaching supervised methods.
RcAE: Recursive Reconstruction Framework for Unsupervised Industrial Anomaly Detection: This paper proposes a Recursive Convolutional Autoencoder (RcAE) that progressively suppresses anomalies while preserving normal details through multi-step iterative reconstruction with shared parameters. Combined with a Cross-Recursive Detection module (CRD) that exploits multi-step reconstruction dynamics for robust anomaly localization, the method achieves performance comparable to state-of-the-art approaches using only 10% of the parameters required by diffusion models.
Reimagining Anomalies: What if Anomalies Were Normal?: This paper proposes the first counterfactual explanation framework for unsupervised image anomaly detection. By training a generator to modify anomalous samples into multiple disentangled counterfactuals perceived as normal by the detector, the framework answers at the semantic level: "What would an anomaly look like if it were normal?" This provides a depth of interpretability far exceeding traditional heatmap-based approaches.
Rethinking Flow and Diffusion Bridge Models for Speech Enhancement: This paper proposes a unified theoretical framework that subsumes flow matching, score-based diffusion, and Schrödinger bridge models for speech enhancement as processes that construct different Gaussian probability paths between paired data. It further reveals that each sampling step in such generative models is intrinsically equivalent to predictive speech enhancement, and leverages this insight to improve bridge model performance by adopting high-performance backbone networks, refined loss functions, and fine-tuning strategies from the predictive paradigm.
Reward Redistribution via Gaussian Process Likelihood Estimation: This paper proposes GP-LRR, a reward redistribution framework based on Gaussian process likelihood estimation. It explicitly models correlations among state-action pairs via kernel functions, and learns a step-wise reward function by maximizing the marginal likelihood of trajectory returns using a leave-one-out strategy. Theoretical analysis demonstrates that conventional MSE-based methods are a degenerate special case of GP-LRR. Experiments on MuJoCo benchmarks combined with SAC show superior sample efficiency and policy performance.
Scalable Vision-Guided Crop Yield Estimation: This paper proposes a crop yield estimation method based on Prediction-Powered Inference (PPI++), which leverages vision models trained on field photographs to supplement costly ground-truth crop cut measurements. The approach guarantees asymptotic unbiasedness while increasing effective sample size by up to 73%, enabling more accurate and cost-efficient regional yield estimation for agricultural insurance.
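The prediction-powered estimator behind the entry above can be illustrated for the simplest case, a population mean: average the model predictions on the large unlabeled set, then correct them with the average prediction error measured on the small labeled set. This is a minimal sketch under that assumption, not the paper's yield pipeline; `ppi_mean` and the toy numbers are hypothetical, and `lam` plays the role of the PPI++ tuning weight (`lam=1` is plain PPI, `lam=0` ignores the predictions).

```python
import numpy as np

def ppi_mean(y_labeled, pred_labeled, pred_unlabeled, lam=1.0):
    """Prediction-powered estimate of a mean with power-tuning weight `lam`."""
    y_labeled = np.asarray(y_labeled, dtype=float)
    pred_labeled = np.asarray(pred_labeled, dtype=float)
    pred_unlabeled = np.asarray(pred_unlabeled, dtype=float)
    # Rectifier: corrects the bias of the predictions using ground truth.
    rectifier = np.mean(y_labeled - lam * pred_labeled)
    return lam * np.mean(pred_unlabeled) + rectifier

# Toy numbers for illustration only (not data from the paper).
y = [1.0, 2.0, 3.0]              # ground-truth measurements
f_lab = [1.5, 2.5, 3.5]          # model predictions on the same fields
f_unlab = [2.5, 3.5, 4.5, 5.5]   # model predictions on unlabeled fields
print(ppi_mean(y, f_lab, f_unlab))
```

The correction term keeps the estimate unbiased even when the vision model is systematically off, while accurate predictions reduce variance relative to using the labeled measurements alone.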
Semi-Supervised High Dynamic Range Image Reconstructing via Bi-Level Uncertain Area Masking: This paper proposes a semi-supervised HDR reconstruction framework that evaluates pseudo HDR label quality via an uncertainty estimation branch, masking unreliable regions at both the patch and pixel levels. Using only 6.7% of HDR ground-truth annotations, the method achieves performance comparable to fully supervised state-of-the-art.
ShortageSim: Simulating Drug Shortages under Information Asymmetry: This paper proposes ShortageSim, the first LLM-based multi-agent simulation framework for drug shortages. It models strategic decision-making among FDA regulators, manufacturers, and buyers under information asymmetry, achieving an 84% improvement in predicting resolution lag time on historical shortage data, and provides a controlled testbed for evaluating regulatory strategies.
Shrinking the Teacher: An Adaptive Teaching Paradigm for Asymmetric EEG-Vision Alignment: This paper proposes an Adaptive Teaching Paradigm (ATS) in which a residual-free bottleneck module, ShrinkAdapter, enables the visual "teacher" to actively shrink and restructure its knowledge to match the learning capacity of the EEG "student," achieving 60.2% Top-1 accuracy on zero-shot brain-image retrieval and surpassing the previous SOTA by 9.8 percentage points.
Spike Imaging Velocimetry: Dense Motion Estimation of Fluids Using Spike Cameras: This paper proposes Spike Imaging Velocimetry (SIV), the first systematic application of spike cameras (20,000 Hz ultra-high temporal resolution) to fluid velocimetry. Three fluid-aware modules are designed: Detail-Preserving Hierarchical Transform (DPHT), Graph Encoder (GE), and Multi-Scale Velocity Refinement (MSVR). A new PSSD dataset is constructed, and SIV comprehensively outperforms existing baselines on steady-state turbulence, high-speed flow, and HDR scenarios.
STEM Faculty Perspectives on Generative AI in Higher Education: Through focus group research with 29 STEM faculty at a large public university in the United States, this study reveals how instructors integrate GenAI into teaching, the observed benefits and challenges for student learning, and the institutional support required. A key finding is that GenAI shifts faculty labor from content creation to expert review and may obscure students' underlying competency gaps.
Structural Approach to Guiding a Present-Biased Agent: This paper systematically investigates the parameterized complexity of the T-path-Editing problem within the principal-agent extension of the Kleinberg-Oren model. It presents FPT algorithms parameterized by treewidth and path-cost diversity, establishes tight hardness results, and comprehensively characterizes the tractability-intractability boundary for guiding a present-biased agent to complete critical tasks.
Structure-Aware Encodings of Argumentation Properties for Clique-width: This paper designs directed decomposition-guided (DDG) reductions from abstract argumentation problems to (Q)SAT that linearly preserve clique-width, establishing tractability upper bounds parameterized by clique-width for all standard argumentation semantics (stable, admissible, complete, preferred, semi-stable, stage) across extension existence, argument acceptance, and counting problems. Under the ETH, it further proves that the overhead of these reductions cannot be significantly improved.
SVD-NO: Learning PDE Solution Operators with SVD Integral Kernels: This paper proposes SVD-NO, a neural operator that explicitly parameterizes the singular value decomposition (SVD) of integral kernels, achieving \(O(ndL)\) linear computational complexity while maintaining high expressiveness, and attaining new state-of-the-art performance on 5 PDE benchmarks.
Symbolic Planning and Multi-Agent Path Finding in Extremely Dense Environments with Unassigned Agents: This paper introduces the Block Rearrangement Problem (BRaP) as a formal problem definition and proposes five solving algorithms based on configuration space search, PDDL symbolic planning, and MAPF. Among them, BR-LaCAM achieves a 92% success rate with millisecond-level solving speed on grids up to 80×80 under extreme density conditions.
SynWeather: Weather Observation Data Synthesis across Multiple Regions and Variables via a General Diffusion Transformer: This work introduces SynWeather, the first unified multi-region multi-variable weather observation synthesis dataset (covering 4 regions × 4 variables × 6 satellites), and proposes SynWeatherDiff, a general probabilistic generative model based on a Diffusion Transformer. By leveraging text prompts to distinguish region–variable task combinations, SynWeatherDiff outperforms both task-specific models and existing general-purpose models across multiple synthesis tasks.
Tab-PET: Graph-Based Positional Encodings for Tabular Transformers: Tab-PET estimates a graph structure from inter-feature correlations in tabular data, constructs positional encodings (PE) from graph Laplacian eigenvectors, and injects them into tabular Transformers. Both theoretical analysis and experiments demonstrate that PE reduces the effective rank of embeddings, thereby improving generalization. Consistent improvements are observed across 50 datasets for TabTransformer, SAINT, and FT-Transformer, with the Spearman correlation graph yielding the best results.
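The general recipe in the entry above (feature graph, then graph Laplacian, then eigenvectors as positional encodings) can be sketched in a few lines of NumPy. Every specific choice here is an assumption made for illustration rather than the paper's configuration: Pearson correlation instead of Spearman, a hard threshold of 0.3 to define edges, the unnormalized Laplacian, and the name `laplacian_positional_encoding`.

```python
import numpy as np

def laplacian_positional_encoding(X, k, threshold=0.3):
    """Return a (num_features, k) matrix of Laplacian-eigenvector encodings."""
    corr = np.corrcoef(X, rowvar=False)          # feature-by-feature correlation
    adj = (np.abs(corr) >= threshold).astype(float)
    np.fill_diagonal(adj, 0.0)                   # no self-loops
    lap = np.diag(adj.sum(axis=1)) - adj         # unnormalized Laplacian L = D - A
    _, eigvecs = np.linalg.eigh(lap)             # eigenvalues in ascending order
    # Skip the first eigenvector (constant when the graph is connected).
    return eigvecs[:, 1:k + 1]

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=200)   # make two features correlated
pe = laplacian_positional_encoding(X, k=2)
print(pe.shape)
```

Each row of `pe` is the encoding for one feature (column) of the table; in a tabular Transformer it would be combined with that feature's embedding, for example by addition or concatenation.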
TaylorPODA: A Taylor Expansion-Based Method to Improve Post-Hoc Attributions for Opaque Models: Under the Taylor expansion framework, three postulates—precision, federation, and zero-discrepancy—are proposed to regulate feature attribution. An adaptation property is further introduced to optimize the allocation weights of interaction effects via an AUP objective, making TaylorPODA the only post-hoc, model-agnostic attribution method that simultaneously satisfies all postulates and properties.
TDSNNs: Competitive Topographic Deep Spiking Neural Networks for Visual Cortex Modeling: This paper proposes Topographic Deep Spiking Neural Networks (TDSNNs), which introduce a Spatiotemporal Constraint (STC) loss to successfully replicate the hierarchical topographic organization of the primate visual cortex from V1 to IT in deep SNNs, achieving zero accuracy degradation on ImageNet (top-1) while substantially outperforming existing topographic ANNs in brain similarity.
The Limitations and Power of NP-Oracle-Based Functional Synthesis Techniques: This paper systematically investigates, from a theoretical perspective, the capabilities and limitations of functional synthesis methods that rely on NP oracles. It proves that naive bit-by-bit learning approaches necessarily fail in multi-output settings, that Resolution-interpolation-based methods produce exponential-size circuits, and that an NP oracle is a necessary condition for efficient synthesis. Positive results are also established, showing that NP oracles suffice to synthesize small Skolem functions in polynomial time under appropriate conditions.
The Publication Choice Problem: This paper proposes the "publication choice problem," a game-theoretic framework that models the bidirectional interaction between researchers' publication strategies and venue influence. It proves the existence and uniqueness of pure-strategy equilibria and analyzes the effects of Spotlight paper labels on the academic ecosystem.
Theoretical and Empirical Analysis of Lehmer Codes to Search Permutation Spaces with Evolutionary Algorithms: This work presents the first rigorous mathematical runtime analysis of Lehmer codes (inversion tables) for searching permutation spaces with evolutionary algorithms. It proves that Lehmer-code-based EAs achieve expected runtimes of \(O(n^2 \log n)\) or \(O(n^2)\) on most benchmark functions, matching or improving upon classical representations, and validates practical utility on LOP and QAP instances.
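A short sketch of the representation studied in the entry above: a Lehmer code stores, for each position, how many later elements are smaller than the element at that position. Several equivalent conventions exist, and the one used here (counting smaller elements to the right) is an assumption, as are the function names.

```python
def lehmer_encode(perm):
    """code[i] = number of positions j > i with perm[j] < perm[i]."""
    n = len(perm)
    return [sum(perm[j] < perm[i] for j in range(i + 1, n)) for i in range(n)]

def lehmer_decode(code):
    """Invert lehmer_encode for a permutation of 0..n-1."""
    remaining = list(range(len(code)))
    # code[i] selects the code[i]-th smallest value not yet used.
    return [remaining.pop(c) for c in code]

perm = [2, 0, 1]
code = lehmer_encode(perm)
print(code, lehmer_decode(code))
```

Any vector with `0 <= code[i] <= n - 1 - i` decodes to a valid permutation, so an evolutionary algorithm can mutate entries independently without a repair step.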
ASAG: Toward the Frontiers of Reliable Diffusion Sampling via Adversarial Sinkhorn Attention Guidance: This paper proposes ASAG (Adversarial Sinkhorn Attention Guidance), which reinterprets self-attention scores in diffusion models from the perspective of optimal transport theory. By injecting adversarial transport costs into attention layers via the Sinkhorn algorithm to deliberately reduce query-key similarity, ASAG systematically disrupts misleading attention alignment and improves both conditional and unconditional sampling quality. The method is lightweight, plug-and-play, and requires no retraining.
Towards Temporal Fusion Beyond the Field of View for Camera-based Semantic Scene Completion: This paper proposes C3DFusion, a module that explicitly aligns point features from historical and current frames in 3D space, and is the first to systematically address temporal completion of out-of-frame (out-of-view) regions in camera-based SSC. The method achieves state-of-the-art performance on SemanticKITTI and SSCBench-KITTI-360.
Tractable Weighted First-Order Model Counting with Bounded Treewidth Binary Evidence: A polynomial-time (in domain size) algorithm is proposed for computing weighted first-order model counting (WFOMC) of the \(\text{FO}^2\) and \(\text{C}^2\) fragments with bounded-treewidth binary evidence, resolving an open problem on counting stable seating arrangements on bounded-treewidth bounded-degree graphs.
Variance Computation for Weighted Model Counting with Knowledge Compilation Approach: This paper treats the weights in weighted model counting (WMC) as random variables with associated variances, and proposes a polynomial-time algorithm for computing WMC variance on structured d-DNNF representations. It further proves intractability of this problem on structured DNNF, d-DNNF, and FBDD (unless P=NP), and applies the framework to quantify parameter uncertainty in Bayesian network inference.
Verification-Guided Context Optimization for Tool Calling via Hierarchical LLMs-as-editors: This paper proposes the VGCO framework, which employs LLMs as hierarchical editors to iteratively optimize tool documentation and knowledge base context through verification-guided signals, achieving significant improvements in retrieval recall, tool selection, and parameter filling accuracy in large-scale tool calling scenarios.
Whispering Agents: An Event-Driven Covert Communication Protocol for the Internet of Agents: This paper presents the first formal definition of a "Covert Event Channel" in the Internet of Agents (IoA) and proposes the ΠCCAP protocol, which embeds secret data across the storage, timing, and behavioral dimensions of agent conversations, achieving high-capacity, high-robustness covert communication that is imperceptible to LLM-based censors.
Why Isn't Relational Learning Taking Over the World?: This position paper systematically analyzes why relational learning has failed to dominate the AI landscape, identifying core issues including unrealistic datasets, fundamentally flawed evaluation methodologies, the absence of negative examples, and theoretical difficulties with aggregation operations. It further delineates the key improvements necessary for relational learning to realize its potential.