DAPointMamba: Domain Adaptive Point Mamba for Point Cloud Completion¶

Conference: AAAI 2026
arXiv: 2511.20278
Code: None
Area: 3D Vision
Keywords: Point Cloud Completion, Domain Adaptation, State Space Model, Mamba, Cross-Domain Alignment

TL;DR¶

This work presents the first integration of Mamba, a state space model (SSM), into unsupervised domain adaptive point cloud completion (UDA PCC). The proposed DAPointMamba framework achieves high-quality cross-domain point cloud completion through three modules: Cross-Domain Patch-Level Scanning (CDPS), Cross-Domain Spatial SSM Alignment (CDSA), and Cross-Domain Channel SSM Alignment (CDCA). It does so while maintaining linear complexity and a global receptive field.

Background & Motivation¶

Point cloud completion (PCC) is a fundamental task in 3D vision with broad applications in autonomous driving, robotics, and virtual reality. However, existing supervised methods suffer significant performance degradation under cross-domain deployment due to distribution shifts caused by heterogeneous sensors and scenes.

Existing UDA PCC methods face two primary bottlenecks:

CNN architectures are constrained by local receptive fields: Convolution-based backbones cannot model global geometric structures, limiting the learning of domain-invariant features.

Transformer architectures suffer from quadratic complexity: Although DAPoinTr introduces global modeling, the quadratic complexity of attention mechanisms leads to poor computational efficiency, particularly for long patch sequences.

Mamba/SSM models naturally offer global receptive fields and linear complexity, yet directly applying SSM to UDA PCC encounters the following challenges:

Spatial topology disruption: Directly serializing sparse, unstructured 3D point clouds into 1D sequences destroys spatial topology and local geometric features.
Absence of domain-invariant feature design: Existing Point Mamba architectures lack specialized designs for domain transfer.

Method¶

Overall Architecture¶

DAPointMamba consists of three core components that form a hierarchical cross-domain alignment system, progressing from local to global:

Cross-Domain Patch-Level Scanning (CDPS): Ensures spatial correspondence during serialization.
Cross-Domain Spatial SSM Alignment (CDSA): Addresses fine-grained spatial inconsistency.
Cross-Domain Channel SSM Alignment (CDCA): Addresses global semantic inconsistency.

Key Designs¶

CDPS: Cross-Domain Patch-Level Scanning¶

The core idea of CDPS is to ensure spatial alignment of patches between the source and target domains via shared coordinate normalization and Z-order serialization.

Specific steps:

1. Compute the shared minimum coordinate across both domains: \(C_{min} = \min(\min(X_s, dim=1), \min(X_t, dim=1))\)
2. Normalize and discretize into a shared grid space: \(G_s = \lfloor (X_s - C_{min}) \cdot scale \rfloor\), \(G_t = \lfloor (X_t - C_{min}) \cdot scale \rfloor\)
3. Map 3D grid coordinates to 1D sequences using a consistent Z-order curve encoding.
4. After sorting by Z-order values, divide the sequence into \(G\) patches, each containing \(K\) points.

Through unified normalization and Z-order serialization, the \(g\)-th patch corresponds to the same spatial region in both domains, enabling precise patch-level alignment.
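The serialization steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`morton3d`, `shared_min`, `serialize`, `split_patches`) are hypothetical, the normalization is read as subtracting the shared minimum before scaling, and each cloud is a plain list of `(x, y, z)` tuples rather than a batched tensor.

```python
import math

def morton3d(ix, iy, iz, bits=10):
    """Interleave the low `bits` bits of three non-negative grid indices."""
    code = 0
    for b in range(bits):
        code |= ((ix >> b) & 1) << (3 * b)
        code |= ((iy >> b) & 1) << (3 * b + 1)
        code |= ((iz >> b) & 1) << (3 * b + 2)
    return code

def shared_min(src, tgt):
    """Per-axis minimum over the union of both clouds (C_min)."""
    both = list(src) + list(tgt)
    return tuple(min(p[d] for p in both) for d in range(3))

def serialize(points, c_min, scale, bits=10):
    """Sort points by the Z-order code of their cell in the shared grid."""
    def key(p):
        # Assumed reading of the normalization: floor((x - C_min) * scale).
        cell = [int(math.floor((p[d] - c_min[d]) * scale)) for d in range(3)]
        return morton3d(cell[0], cell[1], cell[2], bits)
    return sorted(points, key=key)

def split_patches(ordered, num_patches):
    """Cut the ordered sequence into G patches of K = N // G points each."""
    k = len(ordered) // num_patches
    return [ordered[g * k:(g + 1) * k] for g in range(num_patches)]
```

Because both domains use the same `c_min`, `scale`, and bit interleaving, a given grid cell receives the same Z-order code in either cloud, which is what makes patch indices comparable across domains. Grid indices at or above `2**bits` would lose their high bits in this sketch, so `scale` and `bits` need to be chosen together.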

CDSA: Cross-Domain Spatial SSM Alignment¶

CDSA enforces local spatial alignment via similarity-based feature modulation:

Apply depthwise separable 1D convolution to patch-level features: \(\mathcal{D}_s = DWConv(P_s^G)\), \(\mathcal{D}_t = DWConv(P_t^G)\)
Compute cosine similarity as a spatial similarity weight: \(\mathcal{W}_{spatial} = \cos(\mathcal{D}_s, \mathcal{D}_t)\)
Modulate patch features using the similarity weight: \(\tilde{X}_s = P_s^G \odot \mathcal{W}_{spatial}\), with \(\tilde{X}_t\) obtained analogously from \(P_t^G\)
Encourage cross-domain local feature consistency via MSE loss: \(\mathcal{L}_{sp} = \frac{1}{BDG}\sum(\tilde{X}_s - \tilde{X}_t)^2\)

Design intuition: High-similarity regions are preserved while features in low-similarity regions are suppressed, guiding the model to focus on spatially consistent structures across domains.
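A small sketch may help make the modulation and loss concrete. It is written under several assumptions that are not confirmed by the summary: the depthwise convolution is skipped (the raw patch features stand in for its output), similarity is computed per patch along the feature dimension, the target features are modulated with the same weight as the source, and a single sample is used so the normalizer is \(D \cdot G\). The function names are hypothetical.

```python
import math

def cosine(u, v, eps=1e-8):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / max(norm, eps)

def spatial_alignment(p_s, p_t):
    """p_s, p_t: lists of G patch features, each a list of D floats.

    Returns the per-patch weights, the modulated features of both
    domains, and the mean squared difference between them.
    """
    # Assumption: DWConv is omitted, so similarity uses the raw features.
    weights = [cosine(a, b) for a, b in zip(p_s, p_t)]
    x_s = [[w * f for f in patch] for w, patch in zip(weights, p_s)]
    x_t = [[w * f for f in patch] for w, patch in zip(weights, p_t)]
    count = sum(len(patch) for patch in p_s)
    loss = sum((a - b) ** 2
               for ps, pt in zip(x_s, x_t)
               for a, b in zip(ps, pt)) / count
    return weights, x_s, x_t, loss
```

In this sketch a patch pair with orthogonal features gets weight 0 and therefore contributes nothing to the loss, which mirrors the stated intuition that low-similarity regions are suppressed.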

CDCA: Cross-Domain Channel SSM Alignment¶

CDCA targets global semantic inconsistency through channel mixing and adaptive modulation:

Global feature computation: Average over patches to obtain \(g_s, g_t \in \mathbb{R}^{B \times D}\)
Alignment strength estimation: \(\alpha = Sigmoid(MLP([g_s, g_t])) \in \mathbb{R}^{B \times 1}\)
Channel cross-mixing: Divide feature channels into \(S\) segments and interleave source and target domain segments:
- \(X_{s,mix} = [X_s^{(1)}, X_t^{(2)}, X_s^{(3)}, X_t^{(4)}, \cdots]\)
Adaptive similarity modulation: Compute cosine similarity of mixed representations and combine with \(\alpha\) to generate adaptive weights.
Channel alignment loss: \(\mathcal{L}_{ch} = \frac{1}{BDG}\sum(\tilde{F}_s - \tilde{F}_t)^2\)

Design highlight: Cross-mixing of information dissolves domain boundaries, enabling mutual awareness between the global semantic features of the source and target domains.
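The channel cross-mixing step can be illustrated on a single feature vector. This is a hedged sketch: the function name `cross_mix` is hypothetical, the target-side mixture is assumed to be the complement of the source-side one, and the MLP-based strength \(\alpha\) and the subsequent modulation are left out.

```python
def cross_mix(x_s, x_t, num_segments):
    """Interleave channel segments of source and target features.

    x_s, x_t: lists of D floats; D must be divisible by num_segments.
    Returns (s_mix, t_mix), where s_mix starts with a source segment
    and t_mix (assumed complement) starts with a target segment.
    """
    d = len(x_s)
    if len(x_t) != d or d % num_segments != 0:
        raise ValueError("feature sizes must match and divide evenly")
    seg = d // num_segments
    s_mix, t_mix = [], []
    for i in range(num_segments):
        lo, hi = i * seg, (i + 1) * seg
        # Even-indexed segments keep their own domain, odd ones swap.
        first, second = (x_s, x_t) if i % 2 == 0 else (x_t, x_s)
        s_mix.extend(first[lo:hi])
        t_mix.extend(second[lo:hi])
    return s_mix, t_mix
```

For example, with `x_s = [1, 2, 3, 4]`, `x_t = [10, 20, 30, 40]`, and four segments, the source-side mixture is `[1, 20, 3, 40]`.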

Loss & Training¶

Total loss function:

\[\mathcal{L}_{total} = \mathcal{L}_{CD} + \lambda \mathcal{L}_{sp} + \beta \mathcal{L}_{ch}\]

\(\mathcal{L}_{CD}\): Chamfer Distance reconstruction loss
\(\lambda = 0.1\), \(\beta = 0.1\)
Initial learning rate \(1 \times 10^{-3}\), weight decay \(5 \times 10^{-2}\), batch size 32
Backbone: refinement module of PointMamba
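To show how the three terms combine, here is a minimal sketch of a Chamfer Distance and the weighted total. The Chamfer variant used (squared distances, mean over each cloud, sum of both directions) is an assumption, since conventions differ between papers and the summary does not say which one is used; the function names are hypothetical.

```python
def sq_dist(p, q):
    """Squared Euclidean distance between two 3D points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def chamfer_distance(pred, gt):
    """Symmetric Chamfer Distance between two point lists.

    Assumed variant: mean of squared nearest-neighbour distances
    in each direction, summed over the two directions.
    """
    def one_way(a, b):
        return sum(min(sq_dist(p, q) for q in b) for p in a) / len(a)
    return one_way(pred, gt) + one_way(gt, pred)

def total_loss(l_cd, l_sp, l_ch, lam=0.1, beta=0.1):
    """Reconstruction loss plus weighted spatial and channel alignment terms."""
    return l_cd + lam * l_sp + beta * l_ch
```

With the reported weights of 0.1 each, the alignment terms act as regularizers on top of the reconstruction objective rather than dominating it.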

Key Experimental Results¶

Main Results¶

3D-FUTURE Dataset (CD↓, ×10⁴):

Method	Avg	Cabinet	Chair	Lamp	Sofa	Table
DAPoinTr	22.35	18.46	17.60	27.91	23.08	24.71
DAPointMamba	20.40	19.35	16.21	22.81	22.38	21.25

ModelNet Dataset (CD↓, ×10⁴):

Method	Avg	Plane	Car	Chair	Lamp	Sofa	Table
DAPoinTr	13.79	2.38	8.04	13.83	33.26	12.72	12.51
DAPointMamba	13.11	2.30	7.58	13.15	32.04	12.48	11.08

Real-World Scans (UCD↓/UHD↓, ×10⁴/×10²):

Method	ScanNet-Chair	KITTI-Car
DAPoinTr	1.1/2.7	0.45/1.8
DAPointMamba	0.95/2.8	0.40/2.1
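Real scans have no ground-truth complete shape, so the two metrics in this table are measured from the partial input to the completed output only. The sketch below follows the usual definitions of Unidirectional Chamfer Distance (UCD) and Unidirectional Hausdorff Distance (UHD); whether distances are squared and how results are scaled are assumptions, and the function names are hypothetical.

```python
import math

def _nearest_sq(p, cloud):
    """Squared distance from point p to its nearest neighbour in cloud."""
    return min(sum((a - b) ** 2 for a, b in zip(p, q)) for q in cloud)

def ucd(partial, completed):
    """Mean squared nearest-neighbour distance, partial -> completed."""
    return sum(_nearest_sq(p, completed) for p in partial) / len(partial)

def uhd(partial, completed):
    """Largest nearest-neighbour distance, partial -> completed."""
    return max(math.sqrt(_nearest_sq(p, completed)) for p in partial)
```

UCD averages over all input points, whereas UHD is determined by the single worst-covered input point, so the two can move in opposite directions for the same prediction.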

Ablation Study¶

Incremental module contribution (3D-FUTURE Avg CD↓):

Baseline	+CDPS	+CDSA	+CDCA
23.38	21.73	21.17	20.40

Computational efficiency comparison:

Model	Params(M)	FLOPs(G)	Time(ms)
DAPoinTr	36.904	24.912	23.774
DAPointMamba	9.571	5.192	3.820

Key Findings¶

Compared to DAPoinTr, DAPointMamba reduces parameter count by 74%, FLOPs by 79%, and inference latency by 84%.
CDPS contributes the largest individual gain (−1.65 CD); CDCA yields the most notable improvements on high-variance categories (lamp, table).
On real-world scan data, DAPointMamba achieves lower UCD than DAPoinTr on both ScanNet-Chair and KITTI-Car; however, its UHD (maximum point error) is slightly worse on both, as the method optimizes overall shape rather than extreme points.
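The percentages and gains quoted in these findings can be re-derived from the efficiency and ablation tables. The snippet below is only an arithmetic check on the reported numbers; the helper name `reduction_pct` is hypothetical.

```python
def reduction_pct(baseline, ours):
    """Relative reduction of `ours` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - ours) / baseline

# (DAPoinTr, DAPointMamba) values copied from the efficiency table.
efficiency = {
    "Params(M)": (36.904, 9.571),
    "FLOPs(G)": (24.912, 5.192),
    "Time(ms)": (23.774, 3.820),
}
reductions = {name: reduction_pct(b, o) for name, (b, o) in efficiency.items()}

# Average CD on 3D-FUTURE after adding each module in turn.
ablation = [("Baseline", 23.38), ("+CDPS", 21.73), ("+CDSA", 21.17), ("+CDCA", 20.40)]
gains = {
    ablation[i][0]: ablation[i - 1][1] - ablation[i][1]
    for i in range(1, len(ablation))
}

for name, value in reductions.items():
    print(f"{name}: {value:.1f}% lower")
for name, value in gains.items():
    print(f"{name}: CD improves by {value:.2f}")
```

The reductions round to 74%, 79%, and 84%, and the per-module gains come out as 1.65, 0.56, and 0.77, so CDPS is the largest single step in the ablation.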

Highlights & Insights¶

First exploration of SSM adaptation for UDA PCC, filling a research gap for Mamba in domain-adaptive point cloud tasks.
The three-tier alignment architecture is elegantly structured: CDPS (patch spatial correspondence) → CDSA (fine-grained spatial alignment) → CDCA (global semantic alignment), progressing systematically from local to global.
Shared Z-order curve normalization is a concise yet effective means of cross-domain spatial alignment.
Balance between linear complexity and high accuracy: DAPointMamba surpasses the Transformer-based baseline DAPoinTr in performance (average CD on 3D-FUTURE reduced by 1.95, from 22.35 to 20.40) while substantially lowering computational cost.

Limitations & Future Work¶

Performance on the UHD metric (maximum point error) remains moderate, potentially requiring specialized handling of boundary points.
Evaluation is limited to synthetic→real and synthetic→synthetic transfer; real→real scenarios are not assessed.
The channel mixing strategy (alternating even/odd segments) is relatively fixed; adaptive mixing ratios warrant further exploration.
The Cabinet category underperforms relative to DAPoinTr, indicating that adaptation for certain geometric shapes still has room for improvement.

Related Work & Insights¶

DAPoinTr (SOTA baseline): A pioneering Transformer-based approach for UDA PCC, but quadratic complexity remains a bottleneck.
PointMamba: Validates the effectiveness of SSM for point cloud analysis; DAPointMamba extends this by introducing domain adaptation capability.
Z-order curve: A classical spatial indexing method, here cleverly repurposed for unified serialization to achieve cross-domain spatial alignment.
Inspiration: The adaptability of SSM/Mamba to other 3D cross-domain tasks (e.g., detection, segmentation) merits further investigation.

Rating¶

Novelty: ⭐⭐⭐⭐ (First integration of Mamba into UDA PCC; three modules exhibit notable originality)
Experimental Thoroughness: ⭐⭐⭐⭐ (Multiple benchmarks + real-world data + efficiency comparison + visualization)
Writing Quality: ⭐⭐⭐⭐ (Clear logic, rich figures and tables)
Value: ⭐⭐⭐⭐ (Opens a new direction for Mamba in domain-adaptive 3D tasks)