DAPointMamba: Domain Adaptive Point Mamba for Point Cloud Completion
- Conference: AAAI 2026
- arXiv: 2511.20278
- Code: None
- Area: 3D Vision
- Keywords: Point Cloud Completion, Domain Adaptation, State Space Model, Mamba, Cross-Domain Alignment
TL;DR
This work presents the first integration of Mamba (SSM) into unsupervised domain adaptive point cloud completion (UDA PCC). The proposed DAPointMamba framework achieves high-quality cross-domain point cloud completion through three modules (Cross-Domain Patch-Level Scanning, Spatial SSM Alignment, and Channel SSM Alignment) while maintaining linear complexity and a global receptive field.
Background & Motivation
Point cloud completion (PCC) is a fundamental task in 3D vision with broad applications in autonomous driving, robotics, and virtual reality. However, existing supervised methods suffer significant performance degradation under cross-domain deployment due to distribution shifts caused by heterogeneous sensors and scenes.
Existing UDA PCC methods face two primary bottlenecks:
- CNN architectures are constrained by local receptive fields: Convolution-based backbones cannot model global geometric structures, limiting the learning of domain-invariant features.
- Transformer architectures suffer from quadratic complexity: Although DAPoinTr introduces global modeling, the quadratic complexity of attention mechanisms leads to poor computational efficiency, particularly for long patch sequences.
Mamba/SSM models naturally offer global receptive fields and linear complexity, yet directly applying SSM to UDA PCC encounters the following challenges:
- Spatial topology disruption: Directly serializing sparse, unstructured 3D point clouds into 1D sequences destroys spatial topology and local geometric features.
- Absence of domain-invariant feature design: Existing Point Mamba architectures lack specialized designs for domain transfer.
Method
Overall Architecture
DAPointMamba consists of three core components that form a hierarchical cross-domain alignment system, progressing from local to global:
- Cross-Domain Patch-Level Scanning (CDPS): Ensures spatial correspondence during serialization.
- Cross-Domain Spatial SSM Alignment (CDSA): Addresses fine-grained spatial inconsistency.
- Cross-Domain Channel SSM Alignment (CDCA): Addresses global semantic inconsistency.
Key Designs
CDPS: Cross-Domain Patch-Level Scanning
The core idea of CDPS is to ensure spatial alignment of patches between the source and target domains via shared coordinate normalization and Z-order serialization.
Specific steps:

1. Compute the shared per-axis minimum coordinate across both domains: \(C_{min} = \min(\min(X_s, dim=1), \min(X_t, dim=1))\)
2. Shift by the shared minimum, scale, and discretize into a shared grid space: \(G_s = \lfloor (X_s - C_{min}) \cdot scale \rfloor\), \(G_t = \lfloor (X_t - C_{min}) \cdot scale \rfloor\)
3. Map the 3D grid coordinates to 1D sequences using the same Z-order curve encoding for both domains.
4. After sorting by Z-order values, divide the sequence into \(G\) patches, each containing \(K\) points.
Through unified normalization and Z-order serialization, the \(g\)-th patch corresponds to the same spatial region in both domains, enabling precise patch-level alignment.
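The four steps above can be sketched in plain Python. This is a minimal illustration, not the authors' implementation: the function names (`morton3d`, `shared_min`, `serialize`), the toy point sets, the bit width, and the use of stdlib lists instead of batched tensors are all assumptions made here for readability.

```python
def morton3d(ix, iy, iz, bits=10):
    """Interleave the bits of three non-negative grid indices into one Z-order key."""
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (3 * b)
        key |= ((iy >> b) & 1) << (3 * b + 1)
        key |= ((iz >> b) & 1) << (3 * b + 2)
    return key


def shared_min(cloud_s, cloud_t):
    """Step 1: per-axis minimum taken over BOTH domains (C_min)."""
    both = list(cloud_s) + list(cloud_t)
    return tuple(min(p[a] for p in both) for a in range(3))


def serialize(cloud, c_min, scale, patch_size):
    """Steps 2-4: discretize on the shared grid, sort by Z-order key, cut into patches."""
    def z_key(p):
        # int() equals floor here because p[a] - c_min[a] is never negative.
        grid = [int((p[a] - c_min[a]) * scale) for a in range(3)]
        return morton3d(*grid)

    ordered = sorted(cloud, key=z_key)
    return [ordered[i:i + patch_size] for i in range(0, len(ordered), patch_size)]


# Toy clouds (hypothetical data): two points near the origin, two near the far corner.
source = [(0.0, 0.0, 0.0), (0.9, 0.9, 0.9), (0.1, 0.1, 0.0), (0.8, 0.9, 0.8)]
target = [(0.05, 0.0, 0.1), (0.95, 0.85, 0.9), (0.0, 0.1, 0.05), (0.9, 0.8, 0.95)]

c_min = shared_min(source, target)
patches_s = serialize(source, c_min, scale=4, patch_size=2)
patches_t = serialize(target, c_min, scale=4, patch_size=2)
```

In this toy case, patch 0 of both domains holds the points near the origin and patch 1 holds the points near the far corner, because both clouds were discretized against the same `c_min` and sorted by the same key.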
CDSA: Cross-Domain Spatial SSM Alignment
CDSA enforces local spatial alignment via similarity-based feature modulation:
- Apply depthwise separable 1D convolution to patch-level features: \(\mathcal{D}_s = DWConv(P_s^G)\), \(\mathcal{D}_t = DWConv(P_t^G)\)
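- Compute cosine similarity as a spatial similarity weight: \(\mathcal{W}_{spatial} = \cos(\mathcal{D}_s, \mathcal{D}_t)\)
- Modulate the patch features of both domains using the similarity weight: \(\tilde{X}_s = P_s^G \odot \mathcal{W}_{spatial}\), \(\tilde{X}_t = P_t^G \odot \mathcal{W}_{spatial}\)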
- Encourage cross-domain local feature consistency via MSE loss: \(\mathcal{L}_{sp} = \frac{1}{BDG}\sum(\tilde{X}_s - \tilde{X}_t)^2\)
Design intuition: High-similarity regions are preserved while features in low-similarity regions are suppressed, guiding the model to focus on spatially consistent structures across domains.
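A rough single-sample sketch of this modulation follows. It makes several assumptions that the summary does not confirm: cosine similarity is taken per patch over the channel axis, only the depthwise stage of the convolution is shown (the pointwise stage is omitted), the kernels are fixed rather than learned, and the names `depthwise_conv1d`, `cosine`, and `cdsa` are hypothetical.

```python
import math


def depthwise_conv1d(feats, kernels):
    """Depthwise 1D convolution along the patch axis with zero padding.

    feats: G x D nested list; kernels: D odd-length kernels, one per channel.
    """
    G, D = len(feats), len(feats[0])
    out = [[0.0] * D for _ in range(G)]
    for d in range(D):
        half = len(kernels[d]) // 2
        for g in range(G):
            acc = 0.0
            for j, w in enumerate(kernels[d]):
                src = g + j - half
                if 0 <= src < G:
                    acc += w * feats[src][d]
            out[g][d] = acc
    return out


def cosine(u, v, eps=1e-8):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / max(norm, eps)


def cdsa(p_s, p_t, kernels):
    """Return (W_spatial, modulated source, modulated target, MSE loss)."""
    d_s = depthwise_conv1d(p_s, kernels)
    d_t = depthwise_conv1d(p_t, kernels)
    w = [cosine(a, b) for a, b in zip(d_s, d_t)]          # one weight per patch
    x_s = [[w_g * v for v in row] for w_g, row in zip(w, p_s)]
    x_t = [[w_g * v for v in row] for w_g, row in zip(w, p_t)]
    n = len(p_s) * len(p_s[0])                            # G * D for one sample
    loss = sum((a - b) ** 2 for rs, rt in zip(x_s, x_t) for a, b in zip(rs, rt)) / n
    return w, x_s, x_t, loss


# Two patches, two channels; identity kernels keep the example easy to follow.
identity = [[0.0, 1.0, 0.0], [0.0, 1.0, 0.0]]
p_s = [[1.0, 0.0], [0.0, 1.0]]
p_t = [[1.0, 0.0], [1.0, 0.0]]
w, x_s, x_t, loss = cdsa(p_s, p_t, identity)
```

Here the first patch agrees across domains (weight 1) and is kept, while the second patch is orthogonal across domains (weight 0) and is suppressed in both, which matches the design intuition stated above.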
CDCA: Cross-Domain Channel SSM Alignment
CDCA targets global semantic inconsistency through channel mixing and adaptive modulation:
- Global feature computation: Average over patches to obtain \(g_s, g_t \in \mathbb{R}^{B \times D}\)
- Alignment strength estimation: \(\alpha = Sigmoid(MLP([g_s, g_t])) \in \mathbb{R}^{B \times 1}\)
- Channel cross-mixing: Divide feature channels into \(S\) segments and interleave source and target domain segments: \(X_{s,mix} = [X_s^{(1)}, X_t^{(2)}, X_s^{(3)}, X_t^{(4)}, \cdots]\)
- Adaptive similarity modulation: Compute cosine similarity of mixed representations and combine with \(\alpha\) to generate adaptive weights.
- Channel alignment loss: \(\mathcal{L}_{ch} = \frac{1}{BDG}\sum(\tilde{F}_s - \tilde{F}_t)^2\)
Design highlight: Cross-mixing of information dissolves domain boundaries, enabling mutual awareness between the global semantic features of the source and target domains.
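The first three CDCA steps can be illustrated with a small sketch. Everything here is an assumption for illustration: a single linear layer stands in for the MLP, the weights are hand-set rather than learned, the channel count is assumed divisible by \(S\), and the function names are hypothetical. The summary only gives \(X_{s,mix}\); calling `cross_mix` with the arguments swapped to obtain a target-side mix is likewise an assumption.

```python
import math


def global_feature(feats):
    """Average G x D patch features over the patch axis, giving a length-D vector."""
    G = len(feats)
    return [sum(row[d] for row in feats) / G for d in range(len(feats[0]))]


def alignment_strength(g_s, g_t, weights, bias=0.0):
    """alpha = Sigmoid(linear([g_s, g_t])); the linear layer is a stand-in for the MLP."""
    z = sum(w * v for w, v in zip(weights, list(g_s) + list(g_t))) + bias
    return 1.0 / (1.0 + math.exp(-z))


def cross_mix(x_a, x_b, num_segments):
    """Interleave channel segments: segments 1, 3, ... from x_a and 2, 4, ... from x_b."""
    seg = len(x_a) // num_segments  # assumes the channel count is divisible by S
    mixed = []
    for s in range(num_segments):
        src = x_a if s % 2 == 0 else x_b
        mixed.extend(src[s * seg:(s + 1) * seg])
    return mixed


g_s = global_feature([[1.0, 2.0], [3.0, 4.0]])            # [2.0, 3.0]
g_t = global_feature([[0.0, 0.0], [2.0, 2.0]])            # [1.0, 1.0]
alpha = alignment_strength(g_s, g_t, weights=[0.0] * 4)   # 0.5 with zero weights

x_s = [1, 2, 3, 4]
x_t = [10, 20, 30, 40]
x_s_mix = cross_mix(x_s, x_t, num_segments=4)             # [1, 20, 3, 40]
```

With \(S = 4\) and four channels, every other channel of the mixed vector comes from the other domain, which is the interleaving pattern written out in the formula above.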
Loss & Training
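Total loss function (the reconstruction loss combined with the two alignment losses \(\mathcal{L}_{sp}\) and \(\mathcal{L}_{ch}\), weighted by \(\lambda\) and \(\beta\)):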
- \(Loss_{(CD)}\): Chamfer Distance reconstruction loss
- \(\lambda = 0.1\), \(\beta = 0.1\)
- Initial learning rate \(1 \times 10^{-3}\), weight decay \(5 \times 10^{-2}\), batch size 32
- Backbone: refinement module of PointMamba
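For orientation, a small sketch of a Chamfer Distance and of one plausible way to combine the three losses is given below. Both parts are assumptions: Chamfer Distance has several common variants (squared or unsquared distances, sum or mean), and the summary does not state the exact total-loss formula or which weight belongs to which alignment term. The pairing of \(\lambda\) with \(\mathcal{L}_{sp}\) and \(\beta\) with \(\mathcal{L}_{ch}\) is a guess; since both are 0.1, the numerical result does not depend on it.

```python
def chamfer_distance(pred, gt):
    """Symmetric Chamfer Distance using mean squared nearest-neighbour distances."""
    def sq_dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))

    pred_to_gt = sum(min(sq_dist(p, q) for q in gt) for p in pred) / len(pred)
    gt_to_pred = sum(min(sq_dist(q, p) for p in pred) for q in gt) / len(gt)
    return pred_to_gt + gt_to_pred


def total_loss(cd, l_sp, l_ch, lam=0.1, beta=0.1):
    """ASSUMED form: cd + lam * l_sp + beta * l_ch (not confirmed by the summary)."""
    return cd + lam * l_sp + beta * l_ch


cd_same = chamfer_distance([(0.0, 0.0, 0.0)], [(0.0, 0.0, 0.0)])   # identical clouds
cd_shift = chamfer_distance([(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)])  # unit offset
combined = total_loss(cd_shift, l_sp=2.0, l_ch=3.0)
```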
Key Experimental Results
Main Results
3D-FUTURE Dataset (CD↓, ×10⁴):
| Method | Avg | Cabinet | Chair | Lamp | Sofa | Table |
|---|---|---|---|---|---|---|
| DAPoinTr | 22.35 | 18.46 | 17.60 | 27.91 | 23.08 | 24.71 |
| DAPointMamba | 20.40 | 19.35 | 16.21 | 22.81 | 22.38 | 21.25 |
ModelNet Dataset (CD↓, ×10⁴):
| Method | Avg | Plane | Car | Chair | Lamp | Sofa | Table |
|---|---|---|---|---|---|---|---|
| DAPoinTr | 13.79 | 2.38 | 8.04 | 13.83 | 33.26 | 12.72 | 12.51 |
| DAPointMamba | 13.11 | 2.30 | 7.58 | 13.15 | 32.04 | 12.48 | 11.08 |
Real-World Scans (UCD↓/UHD↓, ×10⁴/×10²):
| Method | ScanNet-Chair | KITTI-Car |
|---|---|---|
| DAPoinTr | 1.1/2.7 | 0.45/1.8 |
| DAPointMamba | 0.95/2.8 | 0.40/2.1 |
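Real scans have no ground-truth complete shape, so the two metrics in this table are computed in one direction only, from the partial input to the completed output. The sketch below shows one common formulation and is an assumption about the exact definitions used (squared distances for UCD, unsquared for UHD); the function names are hypothetical.

```python
import math


def _nearest_sq(p, cloud):
    """Squared distance from point p to its nearest neighbour in cloud."""
    return min(sum((a - b) ** 2 for a, b in zip(p, q)) for q in cloud)


def ucd(partial, completed):
    """Unidirectional Chamfer Distance: mean squared nearest-neighbour distance."""
    return sum(_nearest_sq(p, completed) for p in partial) / len(partial)


def uhd(partial, completed):
    """Unidirectional Hausdorff Distance: worst-case nearest-neighbour distance."""
    return max(math.sqrt(_nearest_sq(p, completed)) for p in partial)


partial = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
completed = [(0.0, 0.0, 0.0), (1.0, 0.0, 1.0), (5.0, 5.0, 5.0)]

ucd_value = ucd(partial, completed)  # (0 + 1) / 2 = 0.5
uhd_value = uhd(partial, completed)  # max(0, 1) = 1.0
```

Because UCD averages over all input points while UHD reports only the single worst one, a method can improve the former and still lose on the latter.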
Ablation Study
Incremental module contribution (3D-FUTURE Avg CD↓):
| Baseline | +CDPS | +CDSA | +CDCA |
|---|---|---|---|
| 23.38 | 21.73 | 21.17 | 20.40 |
Computational efficiency comparison:
| Model | Params(M) | FLOPs(G) | Time(ms) |
|---|---|---|---|
| DAPoinTr | 36.904 | 24.912 | 23.774 |
| DAPointMamba | 9.571 | 5.192 | 3.820 |
Key Findings
- Compared to DAPoinTr, DAPointMamba reduces parameter count by 74%, FLOPs by 79%, and inference latency by 84%.
- CDPS contributes the largest individual gain (−1.65 CD); CDCA yields the most notable improvements on high-variance categories (lamp, table).
- On real-world scans, DAPointMamba achieves lower UCD than DAPoinTr on both ScanNet-Chair and KITTI-Car; however, its UHD (maximum point error) is slightly worse, as the method optimizes overall shape rather than extreme points.
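The percentages and per-module gains quoted in these findings follow directly from the efficiency and ablation tables. A short check, using only the numbers reported above (the helper name `reduction_pct` is introduced here for illustration):

```python
def reduction_pct(baseline, ours):
    """Relative reduction of `ours` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - ours) / baseline


# (DAPoinTr, DAPointMamba) pairs from the efficiency table.
efficiency = {
    "Params(M)": (36.904, 9.571),
    "FLOPs(G)": (24.912, 5.192),
    "Time(ms)": (23.774, 3.820),
}
reductions = {k: round(reduction_pct(b, o)) for k, (b, o) in efficiency.items()}
# {'Params(M)': 74, 'FLOPs(G)': 79, 'Time(ms)': 84}

# Average CD on 3D-FUTURE after adding each module in turn.
ablation = [("Baseline", 23.38), ("+CDPS", 21.73), ("+CDSA", 21.17), ("+CDCA", 20.40)]
gains = {
    name: round(prev - cur, 2)
    for (_, prev), (name, cur) in zip(ablation, ablation[1:])
}
# {'+CDPS': 1.65, '+CDSA': 0.56, '+CDCA': 0.77}
```

CDPS accounts for 1.65 of the total 2.98 reduction in average CD, with CDCA second at 0.77 and CDSA third at 0.56.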
Highlights & Insights
- First exploration of SSM adaptation for UDA PCC, filling a research gap for Mamba in domain-adaptive point cloud tasks.
- The three-tier alignment architecture is elegantly structured: CDPS (patch spatial correspondence) → CDSA (fine-grained spatial alignment) → CDCA (global semantic alignment), progressing systematically from local to global.
- Shared Z-order curve normalization is a concise yet effective means of cross-domain spatial alignment.
- Balance between linear complexity and high accuracy: DAPointMamba surpasses the Transformer-based baseline DAPoinTr in performance (average CD on 3D-FUTURE reduced by 1.95, from 22.35 to 20.40) while substantially lowering computational cost.
Limitations & Future Work
- Performance on the UHD metric (maximum point error) remains moderate, potentially requiring specialized handling of boundary points.
- Evaluation is limited to synthetic→real and synthetic→synthetic transfer; real→real scenarios are not assessed.
- The channel mixing strategy (alternating even/odd segments) is relatively fixed; adaptive mixing ratios warrant further exploration.
- The Cabinet category underperforms relative to DAPoinTr, indicating that adaptation for certain geometric shapes still has room for improvement.
Related Work & Insights
- DAPoinTr (SOTA baseline): A pioneering Transformer-based approach for UDA PCC, but quadratic complexity remains a bottleneck.
- PointMamba: Validates the effectiveness of SSM for point cloud analysis; DAPointMamba extends this by introducing domain adaptation capability.
- Z-order curve: A classical spatial indexing method, here cleverly repurposed for unified serialization to achieve cross-domain spatial alignment.
- Inspiration: The adaptability of SSM/Mamba to other 3D cross-domain tasks (e.g., detection, segmentation) merits further investigation.
Rating
- Novelty: ⭐⭐⭐⭐ (First integration of Mamba into UDA PCC; three modules exhibit notable originality)
- Experimental Thoroughness: ⭐⭐⭐⭐ (Multiple benchmarks + real-world data + efficiency comparison + visualization)
- Writing Quality: ⭐⭐⭐⭐ (Clear logic, rich figures and tables)
- Value: ⭐⭐⭐⭐ (Opens a new direction for Mamba in domain-adaptive 3D tasks)