🛰️ Remote Sensing

🧠 NeurIPS 2025 · 11 paper notes

C3PO: Cross-View Cross-Modality Correspondence by Pointmap Prediction

This paper introduces the C3 dataset, comprising 90K ground photo–floor plan pairs across 597 scenes, with 153M pixel-level correspondences and 85K camera poses. It exposes the limitations of existing correspondence models in cross-view, cross-modality settings such as matching ground photos to floor plans, and shows that training on C3 reduces the RMSE of the best-performing baseline by 34%.

ChA-MAEViT: Unifying Channel-Aware Masked Autoencoders and Multi-Channel Vision Transformers for Improved Cross-Channel Learning

This paper proposes ChA-MAEViT, which enhances cross-channel feature learning for multi-channel images (MCI) through four key components: dynamic channel-patch joint masking, memory tokens, hybrid token fusion, and a channel-aware decoder. The method outperforms the state of the art by an average of 3.0–21.5% across three satellite and microscopy datasets.

Connecting the Dots: A Machine Learning Ready Dataset for Ionospheric Forecasting Models

As a product of the 2025 NASA Frontier Development Lab (FDL) Heliolab program, this paper presents the first comprehensive ML-ready dataset for ionospheric forecasting. It unifies seven categories of heterogeneous data sources into a consistent spatiotemporal structure: Solar Dynamics Observatory (SDO) extreme ultraviolet (EUV) irradiance embeddings, solar wind parameters, interplanetary magnetic field (IMF), geomagnetic activity indices, JPL dense TEC global ionospheric maps (GIMs), Madrigal sparse TEC, solar flux indices, and orbital mechanics parameters. Building on this dataset, the authors train several spatiotemporal forecasting architectures, including LSTM, Spherical Fourier Neural Operator (SFNO), and GraphCast. These models autoregressively predict global vertical total electron content (vTEC) up to 12 hours ahead under both quiet and geomagnetically active conditions, and surpass the persistence baseline.
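To make the comparison against the persistence baseline concrete, here is a minimal sketch of a persistence forecast next to a generic autoregressive rollout. It is not the paper's code: the flat-list map representation and the names `persistence_forecast`, `autoregressive_rollout`, `step_fn`, and `rmse` are assumptions made for illustration.

```python
import math


def persistence_forecast(last_map, steps):
    """Persistence baseline: repeat the last observed map at every future step."""
    return [list(last_map) for _ in range(steps)]


def autoregressive_rollout(step_fn, last_map, steps):
    """Feed each prediction back in as the input for the next step.

    step_fn is a hypothetical one-step model: it maps a map to the next map.
    """
    state = list(last_map)
    rollout = []
    for _ in range(steps):
        state = step_fn(state)
        rollout.append(state)
    return rollout


def rmse(pred, truth):
    """Root-mean-square error between two equally sized flat maps."""
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(pred))
```

A learned model beats persistence when the per-step `rmse` of its rollout stays below that of `persistence_forecast` over the same horizon.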

EcoCast: A Spatio-Temporal Model for Continual Biodiversity and Climate Risk Forecasting

This paper proposes EcoCast, a Transformer-based spatio-temporal sequence model that integrates satellite remote sensing (Sentinel-2), climate reanalysis (ERA5), and citizen science observations (GBIF). The model predicts next-month species occurrence probabilities from 12-month environmental feature sequences. On a five-species African bird distribution prediction task, the macro-average F1 score improves from 0.31 (Random Forest) to 0.65. An EWC-based continual learning framework is also designed to accommodate data updates.
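The EWC idea mentioned above can be sketched with the standard elastic weight consolidation penalty, which discourages parameters from drifting away from values that mattered for earlier data. This is the textbook form, not necessarily EcoCast's exact formulation; the flat parameter lists, the diagonal Fisher estimate, and the names `ewc_penalty` and `total_loss` are assumptions.

```python
def ewc_penalty(params, anchor_params, fisher_diag, lam):
    """Standard EWC penalty: (lam / 2) * sum_i F_i * (theta_i - theta_star_i)^2.

    params        : current parameter values (flat list)
    anchor_params : parameter values after training on the earlier data
    fisher_diag   : diagonal Fisher information estimates (assumed precomputed)
    lam           : regularisation strength (hypothetical hyperparameter)
    """
    return 0.5 * lam * sum(
        f * (p - a) ** 2 for p, a, f in zip(params, anchor_params, fisher_diag)
    )


def total_loss(task_loss, params, anchor_params, fisher_diag, lam):
    """Loss on the new data plus the penalty for forgetting the old data."""
    return task_loss + ewc_penalty(params, anchor_params, fisher_diag, lam)
```

Parameters with large Fisher values are held close to their earlier values, while parameters with small Fisher values remain free to adapt to newly arrived observations.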

GeoLink: Empowering Remote Sensing Foundation Model with OpenStreetMap Data

GeoLink directly integrates OpenStreetMap (OSM) vector data into remote sensing foundation model pretraining. It encodes OSM data with a heterogeneous GNN and uses multi-granularity cross-modal learning objectives: region-image-level contrastive learning and object-patch-level fusion. Pretrained efficiently on 1.27 million sample pairs, GeoLink surpasses existing remote sensing foundation models across 7 classification and 4 segmentation/change detection benchmarks.
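As a rough illustration of the region-image contrastive objective, the sketch below computes a one-directional InfoNCE loss over paired embeddings. It is a generic contrastive loss, not GeoLink's actual objective; the function name `info_nce`, the temperature value, and the plain-list embeddings are assumptions.

```python
import math


def _normalize(v):
    """Scale a vector to unit length so dot products are cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def info_nce(image_embs, region_embs, temperature=0.07):
    """Mean InfoNCE loss where image i is the positive for OSM region i.

    All other regions in the batch act as negatives for image i.
    """
    images = [_normalize(v) for v in image_embs]
    regions = [_normalize(v) for v in region_embs]
    loss = 0.0
    for i, img in enumerate(images):
        logits = [_dot(img, reg) / temperature for reg in regions]
        top = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        loss += log_denominator - logits[i]
    return loss / len(images)
```

The loss is near zero when each image embedding aligns with its own region embedding and grows when the pairing is wrong.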

GreenHyperSpectra: A Multi-Source Hyperspectral Dataset for Global Vegetation Trait Prediction

GreenHyperSpectra constructs a pretraining dataset of 140,000+ multi-source hyperspectral vegetation samples spanning proximal, airborne, and satellite platforms. Label-efficient regression models trained via semi-supervised and self-supervised methods (MAE, GAN, RTM-AE) comprehensively outperform fully supervised baselines on 7 plant trait prediction tasks, with particularly pronounced advantages under label-scarce and out-of-distribution scenarios.

Mass Conservation on Rails – Rethinking Physics-Informed Learning of Ice Flow Vector Fields

This paper proposes a divergence-free neural network (dfNN) that architecturally enforces exact mass conservation (divergence identically zero) via the symplectic gradient of a stream function, and combines it with a directional guidance learning strategy. The approach significantly outperforms soft-constraint PINNs and unconstrained NNs on ice flux interpolation over Antarctica's Byrd Glacier.
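The architectural constraint can be illustrated in two dimensions: if a scalar network outputs a stream function ψ(x, y) and the velocity is defined as v = (∂ψ/∂y, −∂ψ/∂x), then ∇·v = ∂²ψ/∂x∂y − ∂²ψ/∂y∂x = 0 for any weights. The sketch below checks this numerically. The tiny tanh network, its weights in `HIDDEN`, and the sign convention are assumptions for illustration, not the paper's implementation.

```python
import math

# Hypothetical weights of a one-hidden-layer scalar network
# psi(x, y) = sum_k a_k * tanh(w_k * x + u_k * y + b_k).
HIDDEN = [  # (a_k, w_k, u_k, b_k)
    (0.8, 1.2, -0.5, 0.1),
    (-0.3, 0.4, 0.9, -0.2),
    (0.5, -0.7, 0.3, 0.0),
]


def velocity(x, y):
    """Symplectic gradient of psi: v = (dpsi/dy, -dpsi/dx), computed analytically."""
    dpsi_dx = 0.0
    dpsi_dy = 0.0
    for a, w, u, b in HIDDEN:
        sech2 = 1.0 - math.tanh(w * x + u * y + b) ** 2  # derivative of tanh
        dpsi_dx += a * w * sech2
        dpsi_dy += a * u * sech2
    return dpsi_dy, -dpsi_dx


def divergence(x, y, h=1e-4):
    """Central-difference estimate of dvx/dx + dvy/dy at (x, y)."""
    dvx_dx = (velocity(x + h, y)[0] - velocity(x - h, y)[0]) / (2 * h)
    dvy_dy = (velocity(x, y + h)[1] - velocity(x, y - h)[1]) / (2 * h)
    return dvx_dx + dvy_dy
```

Because the zero divergence follows from the construction rather than from a loss term, it holds at every point and for every weight setting, which is the contrast with soft-constraint PINNs.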

OrbitZoo: Real Orbital Systems Challenges for Reinforcement Learning

This paper presents OrbitZoo, a multi-agent RL environment built on the industrial-grade Orekit orbital mechanics library, supporting realistic orbital tasks such as collision avoidance, Hohmann transfers, and constellation coordination. It provides standardized MARL training through the PettingZoo interface, and achieves 24-meter RMSE (over a 16.6-hour propagation) for the low-error group in validation against real Starlink ephemeris data.
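For readers unfamiliar with the Hohmann transfer task, the closed-form two-body solution gives a useful reference point. The sketch below uses the textbook formulas for a transfer between two circular coplanar orbits; it does not use Orekit or PettingZoo, and the function names are assumptions. `MU_EARTH` is the standard Earth gravitational parameter.

```python
import math

MU_EARTH = 3.986004418e14  # Earth's gravitational parameter, m^3 / s^2


def hohmann_delta_v(r1, r2, mu=MU_EARTH):
    """Return the two burn magnitudes (m/s) between circular orbits of radius r1 and r2 (m)."""
    dv1 = math.sqrt(mu / r1) * (math.sqrt(2.0 * r2 / (r1 + r2)) - 1.0)
    dv2 = math.sqrt(mu / r2) * (1.0 - math.sqrt(2.0 * r1 / (r1 + r2)))
    return abs(dv1), abs(dv2)


def hohmann_transfer_time(r1, r2, mu=MU_EARTH):
    """Half the period of the transfer ellipse, in seconds."""
    semi_major_axis = 0.5 * (r1 + r2)
    return math.pi * math.sqrt(semi_major_axis ** 3 / mu)


# Example: 300 km low Earth orbit to geostationary radius.
dv1, dv2 = hohmann_delta_v(6.678e6, 4.2164e7)
total_dv = dv1 + dv2  # roughly 3.9 km/s under these idealised assumptions
```

A realistic environment adds perturbations, finite-duration burns, and noisy state estimates, so a learned policy is usually compared against this idealised figure rather than expected to match it.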

OrthoLoC: UAV 6-DoF Localization and Calibration Using Orthographic Geodata

OrthoLoC introduces the first large-scale UAV 6-DoF localization benchmark built upon orthographic geodata (DOP+DSM), comprising 16,425 real UAV images across 47 regions in Germany and the United States. It further proposes AdHoP (Adaptive Homography Preprocessing), a plug-and-play matching enhancement that improves matching performance by 95% and reduces translation error by 63% without modifying the underlying feature matcher.

RSCC: A Large-Scale Remote Sensing Change Caption Dataset for Disaster Events

This work introduces RSCC, the first large-scale disaster-aware remote sensing change captioning dataset. It comprises 62,351 pre/post-disaster image pairs with detailed change descriptions and covers 31 global disaster events, including earthquakes, floods, and wildfires. High-quality annotations are generated using the QvQ-Max visual reasoning model, and a comprehensive benchmark evaluation framework is established.

Scaling Image Geo-Localization to Continent Level

This paper proposes a hybrid approach that combines classification-learned prototypes with aerial image embeddings. It achieves over 68% recall@1 within 200 m and 59.2% within 100 m across 433,000 km² of Western Europe, making it the first system to attain such precision at continental scale.
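The metric quoted above, recall@1 within a distance threshold, can be computed as the fraction of queries whose top-ranked predicted location falls within the given radius of the true location. The sketch below is a minimal version using great-circle distance. The function names and the spherical Earth radius are assumptions, and the paper's own evaluation code may differ.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2.0 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def recall_at_1(top1_predictions, truths, radius_m):
    """Fraction of queries whose top-1 prediction lies within radius_m of the truth."""
    hits = sum(
        1
        for (plat, plon), (tlat, tlon) in zip(top1_predictions, truths)
        if haversine_m(plat, plon, tlat, tlon) <= radius_m
    )
    return hits / len(truths)
```

Tightening the radius from 200 m to 100 m can only keep or lower the score, which is consistent with the two figures reported above.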