🚗 Autonomous Driving

🧪 ICML2025 · 10 paper notes

📌 Same area in other venues: 📷 CVPR2026 (157) · 🔬 ICLR2026 (50) · 🧪 ICML2026 (8) · 🤖 AAAI2026 (56) · 🧠 NeurIPS2025 (47) · 📹 ICCV2025 (91)

🔥 Top topics: Point Cloud ×2 · Reinforcement Learning ×2 · Agents ×2

Don't be so Negative! Score-based Generative Modeling with Oracle-assisted Guidance: Proposes the Gen-neG method, which redirects the generative distribution from constraint-violating regions to the positive support region by iteratively training a Bayes-optimal classifier on synthetic data from diffusion models and using it to guide the sampling process. The key innovation lies in correctly handling the importance sampling of class prior probabilities. In traffic scene generation, the method reduces the collision and out-of-boundary rate from 29.3% to 5.6%.
DriveGPT: Scaling Autoregressive Behavior Models for Driving: Proposes DriveGPT, a 1.4B-parameter autoregressive Transformer driving behavior model trained on 120 million real driving clips (50x larger than the largest existing dataset). It systematically establishes the data/model/compute scaling laws for driving behavior modeling for the first time, demonstrates that data is the primary performance bottleneck, and outperforms the state-of-the-art on planning and WOMD prediction tasks.
Geometry-to-Image Synthesis-Driven Generative Point Cloud Registration: Proposes a new paradigm, Generative Point Cloud Registration, and designs two registration-tailored controllable 2D generative models, DepthMatch-ControlNet and LiDARMatch-ControlNet, to generate cross-view consistent RGB image pairs from purely geometric point cloud pairs. It improves existing 3D registration methods in a plug-and-play manner through geometry-color feature fusion, and is validated on 3DMatch, ScanNet, and Dur360BEV.
GoIRL: Graph-Oriented Inverse Reinforcement Learning for Multimodal Trajectory Prediction: This work integrates the Maximum Entropy Inverse Reinforcement Learning (MaxEnt IRL) framework with vectorized scene representations for the first time, proposing the GoIRL trajectory prediction framework. Utilizing a learnable Feature Adaptor, it aggregates graph features into a grid space to accommodate IRL. It then employs a hierarchical parameterized trajectory generator (Bézier curves + refinement module) along with an MCMC probability fusion mechanism for multimodal trajectory prediction. GoIRL achieves state-of-the-art (SOTA) performance on Argoverse and nuScenes, demonstrating significantly stronger generalization capabilities compared to supervised models.
Hierarchical and Collaborative LLM-Based Control for Multi-UAV Motion and Communication in Integrated Terrestrial and Non-Terrestrial Networks: Proposes a hierarchical, collaborative LLM-based control framework that coordinates LLMs at two levels (a meta-controller LLM deployed on the HAPS and edge-controller LLMs deployed on the UAVs) to jointly optimize motion planning and communication access for multiple UAVs in 3D aerial highway scenarios.
Hybrid Quantum-Classical Multi-Agent Pathfinding: Proposes the first optimal hybrid quantum-classical MAPF algorithms, QP and QCP, which convert the path selection problem of MAPF into QUBO subproblems solvable on quantum hardware. A conflict graph and a column generation framework provide theoretical optimality, and feasibility is validated on real quantum hardware.
InfoCons: Identifying Interpretable Critical Concepts in Point Clouds via Information Theory: This paper proposes the InfoCons framework, which applies the Information Bottleneck (IB) principle to interpretable point cloud models. By training an attention bottleneck network, the framework decomposes point clouds into 3D concepts of varying importance. It introduces a learnable, unbiased prior to replace the fixed prior, generating conceptually cohesive explanations while ensuring faithfulness to model predictions.
R3DM: Enabling Role Discovery and Diversity Through Dynamics Models in Multi-agent Reinforcement Learning: Proposes the R3DM framework, which balances role diversity and coordination by maximizing the mutual information between agent roles, historical trajectories, and future expected behaviors, leveraging intrinsic rewards driven by dynamics models. It improves the win rate by up to 20% in SMAC/SMACv2 environments.
SafeMap: Robust HD Map Construction from Incomplete Observations: SafeMap proposes a plug-and-play robust framework for HD map construction. By utilizing two modules, Gaussian-based Perspective View Reconstruction (G-PVR) and Distillation-based BEV Correction (D-BEVC), it accurately constructs vectorized HD maps even under incomplete observations where camera views are missing.
SPHINX: Structural Prediction using Hypergraph Inference Network: This paper proposes SPHINX, an unsupervised hypergraph inference model that frames hyperedge discovery as a sequential soft clustering problem. By employing differentiable k-subset sampling, SPHINX generates discrete, sparse hypergraph structures that can be seamlessly integrated into any hypergraph neural network. SPHINX achieves a 90% overlap rate in hypergraph reconstruction on synthetic data and outperforms existing methods in NBA trajectory prediction and 3D object classification.
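
The GoIRL note above mentions a trajectory generator parameterized by Bézier curves. As a minimal sketch of that general idea only (not the paper's implementation), the snippet below evaluates a 2D Bézier curve from a handful of control points to obtain a smooth sequence of waypoints. The function names, the cubic degree, the control point values, and the number of steps are all illustrative assumptions, not details taken from the paper.

```python
from math import comb


def bezier_point(control_points, t):
    """Evaluate a 2D Bezier curve at parameter t in [0, 1].

    Uses the Bernstein-polynomial form: each control point is weighted by
    C(n, i) * t^i * (1 - t)^(n - i), where n is the curve degree.
    """
    n = len(control_points) - 1
    x = 0.0
    y = 0.0
    for i, (px, py) in enumerate(control_points):
        weight = comb(n, i) * (t ** i) * ((1 - t) ** (n - i))
        x += weight * px
        y += weight * py
    return (x, y)


def bezier_trajectory(control_points, num_steps):
    """Sample num_steps evenly spaced waypoints along the curve."""
    if num_steps < 2:
        raise ValueError("num_steps must be at least 2")
    return [
        bezier_point(control_points, k / (num_steps - 1))
        for k in range(num_steps)
    ]


# Hypothetical control points (metres) for a gentle left-curving path.
control_points = [(0.0, 0.0), (10.0, 0.0), (20.0, 5.0), (30.0, 15.0)]
trajectory = bezier_trajectory(control_points, num_steps=7)

for x, y in trajectory:
    print(f"{x:6.2f} {y:6.2f}")
```

The appeal of this kind of parameterization is that a whole trajectory is described by a few control points rather than by every waypoint, so a model only has to predict a low-dimensional output, and the resulting path is smooth by construction. The curve always starts at the first control point and ends at the last one.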