Radar-APLANC: Unsupervised Radar-based Heartbeat Sensing via Augmented Pseudo-Label and Noise Contrast¶
Conference: AAAI 2026 arXiv: 2511.08071 Code: https://github.com/RadarHRSensing/Radar-APLANC Area: Other Keywords: radar heartbeat sensing, unsupervised learning, pseudo-label, noise contrastive learning, FMCW radar
TL;DR¶
This paper proposes Radar-APLANC, the first unsupervised learning framework for radar-based heartbeat sensing. Using a noise contrastive triplet (NCT) loss and an augmented pseudo-label generator, it trains in two unsupervised stages without expensive physiological signal annotations, approaching the performance of supervised methods.
Background & Motivation¶
Value and Challenges of Radar-based Heartbeat Sensing¶
FMCW radar enables contactless heartbeat sensing by detecting sub-millimeter (0.1–0.5 mm) chest wall displacements, offering unique advantages in privacy preservation, environmental robustness, and continuous monitoring. However:
Noise sensitivity of traditional methods: Phase extraction and unwrapping-based approaches suffer severe performance degradation under motion artifacts, multipath interference, and low SNR conditions, with phase wrapping ambiguity and noise sensitivity being fundamental limitations.
Annotation bottleneck for supervised methods: Deep learning methods (e.g., Equipleth RF, VitaNet, CardiacMamba) offer better noise robustness but require large-scale, high-quality physiological signal annotations (e.g., synchronized PPG signals), whose collection cost is prohibitive and limits training data scalability.
Non-transferability of video-domain methods: Unsupervised physiological monitoring methods from the video domain (e.g., contrastive learning paradigms) exist, but radar heartbeat signals have lower SNR, a different representation (chest wall motion vs. facial color changes), and conventional positive/negative sample construction strategies fail under strong noise.
Core Insight¶
Radar range matrices inherently contain a contrastive structure between "heartbeat range bins" and "noise range bins"—this intrinsic signal-noise separation can be exploited to construct positive and negative samples without external annotations.
Method¶
Overall Architecture¶
Radar-APLANC is a two-stage unsupervised framework:
- Stage 1: Pseudo-labels are generated using conventional radar methods; the model is pre-trained with the noise contrastive triplet (NCT) loss.
- Stage 2: An augmented pseudo-label generator improves pseudo-label quality via quality assessment and adaptive noise-aware label selection, enabling further fine-tuning.
Preliminaries¶
Range matrix acquisition: An FMCW radar transmits a linearly frequency-modulated signal \(s(t)\); the received reflected signal \(u(t)\) is IQ-demodulated to obtain the intermediate frequency signal \(m(t)\). FFT is applied to each chirp's IF signal to yield the range profile \(M_n[f]\), and \(N\) chirps are concatenated to form the range matrix \(M \in \mathbb{R}^{N \times D}\).
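As a rough illustration of how the range matrix is formed, the following numpy sketch applies one range FFT per chirp and stacks the results; the fast-time sampling rate, chirp length, and the single 50 kHz beat tone are made-up numbers, not values from the paper.

```python
import numpy as np

def range_matrix(x: np.ndarray) -> np.ndarray:
    """One range FFT per chirp; rows are chirps (slow time),
    columns are fast-time IF samples within a chirp."""
    return np.fft.rfft(x, axis=1)

# a single static target: a 50 kHz beat tone repeated over N chirps
fs_fast, S, N = 1.28e6, 256, 8            # fast-time rate, samples/chirp, chirps
t = np.arange(S) / fs_fast
x = np.tile(np.cos(2 * np.pi * 50e3 * t), (N, 1))
M = range_matrix(x)                        # shape (N, S//2 + 1)
print(np.argmax(np.abs(M[0])))             # bin 10 = 50 kHz / (fs_fast / S)
```

A static reflector therefore appears as a fixed column (range bin) of the matrix, while its tiny chest-wall motion shows up as phase variation along the rows.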
Basic heartbeat sensing: (1) Select the range bin \(d^*\) with the maximum power ratio (human body location); (2) compute the phase signal of that bin; (3) apply phase unwrapping; (4) apply 0.8–3.0 Hz bandpass filtering to obtain the heartbeat signal \(\Phi(\cdot) \in \mathbb{R}^N\).
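Steps (1)-(4) above can be sketched in numpy as follows; this is a minimal toy, not the paper's implementation: the text selects the bin with the maximum power ratio, which is approximated here by mean power, and the bandpass filter is replaced by a crude FFT mask.

```python
import numpy as np

def bandpass(sig: np.ndarray, fs: float, lo: float = 0.8, hi: float = 3.0) -> np.ndarray:
    """Crude FFT-mask bandpass (stands in for the paper's 0.8-3.0 Hz filter)."""
    F = np.fft.rfft(sig)
    f = np.fft.rfftfreq(len(sig), 1 / fs)
    F[(f < lo) | (f > hi)] = 0.0
    return np.fft.irfft(F, n=len(sig))

def extract_heartbeat(M: np.ndarray, fs: float) -> np.ndarray:
    """Steps (1)-(4): bin selection, phase, unwrapping, heartbeat-band filtering."""
    power = np.mean(np.abs(M) ** 2, axis=0)     # mean power per range bin
    d_star = int(np.argmax(power))              # (1) assumed body location
    phase = np.unwrap(np.angle(M[:, d_star]))   # (2)+(3) unwrapped phase
    return bandpass(phase, fs)                  # (4) keep ~48-180 bpm band

# synthetic range matrix: 1.2 Hz phase modulation on bin 5, noise elsewhere
rng = np.random.default_rng(0)
fs, N, D = 50.0, 1000, 16
t = np.arange(N) / fs
M = 0.1 * (rng.standard_normal((N, D)) + 1j * rng.standard_normal((N, D)))
M[:, 5] += 5.0 * np.exp(1j * 0.5 * np.sin(2 * np.pi * 1.2 * t))
hb = extract_heartbeat(M, fs)
est_hz = np.fft.rfftfreq(N, 1 / fs)[np.argmax(np.abs(np.fft.rfft(hb)))]
print(est_hz)  # ≈ 1.2 Hz, i.e. 72 bpm
```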
Key Designs¶
1. Noise Contrastive Triplet Loss (NCT Loss) — Stage 1¶
Mechanism: The natural contrast between heartbeat bins and noise bins in radar range matrices is exploited to construct self-supervised learning signals.
- Pseudo-label generation: Conventional radar methods extract heartbeat signals from the central range bin \(d^*\); random temporal sampling and power spectral density (PSD) transformation yield the pseudo-label set \(S_{PL}\).
- Positive sample construction: Heartbeat matrices \(M(\cdot, d^* \pm \Delta d)\) from bins within a window around the central bin are fed into a heartbeat extractor, producing predicted heartbeat signals \(p(\cdot)\); PSD transformation yields positive sample set \(S_P\).
- Negative sample construction: Windows from randomly selected non-central bins \(d'\) are treated as noise matrices and fed into a noise extractor to produce noise signals \(q(\cdot)\); PSD transformation yields negative sample set \(S_N\).
NCT Loss:

\[\mathcal{L}_{NCT} = \underbrace{\frac{1}{K^2}\sum_{i,j}\|S_{PL}[i] - S_P[j]\|^2}_{\text{positive term: pull heartbeat toward pseudo-label}} - \underbrace{\frac{1}{K^2}\sum_{i,j}\|S_P[i] - S_N[j]\|^2}_{\text{negative term: push heartbeat away from noise}}\]
Design Motivation: Any traditional signal processing method can be regarded as a pseudo-label generator; despite being noisy, its output still carries heartbeat information. Range bins at non-body positions predominantly contain background noise, naturally serving as negative samples. PSD transformation enables more stable frequency-domain comparison.
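The NCT loss can be written down directly from its definition; this numpy sketch assumes each of the three sets is K PSD vectors stacked as a (K, F) array, with names of our choosing rather than the authors' code.

```python
import numpy as np

def nct_loss(S_PL: np.ndarray, S_P: np.ndarray, S_N: np.ndarray) -> float:
    """Noise contrastive triplet loss over PSD sets of shape (K, F):
    pull predicted-heartbeat PSDs toward pseudo-label PSDs (positive term),
    push them away from noise PSDs (negative term), averaged over K^2 pairs."""
    K = S_PL.shape[0]
    pos = np.sum((S_PL[:, None, :] - S_P[None, :, :]) ** 2) / K**2
    neg = np.sum((S_P[:, None, :] - S_N[None, :, :]) ** 2) / K**2
    return pos - neg

# toy check: predictions matching the pseudo-labels and far from
# the noise PSDs yield a negative (good) loss value
print(nct_loss(np.ones((2, 4)), np.ones((2, 4)), np.zeros((2, 4))))  # -4.0
```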
2. Augmented Pseudo-Label Generator — Stage 2¶
Mechanism: The Stage 1 pre-trained model is used to assess and select higher-quality pseudo-labels.
The module consists of two sub-components:
Quality assessment module:
- Conventional heartbeat signals \(\{\Phi_1, \ldots, \Phi_{2\Delta d+1}\}\) are extracted from the \(2\Delta d + 1\) range bins within the heartbeat window.
- Noise distance \(X_i = D(\Phi_i, q)\): distance from the pre-trained noise signal; larger values indicate better signal quality.
- Heartbeat distance \(Y_i = D(\Phi_i, p)\): distance from the pre-trained heartbeat signal; smaller values indicate better signal quality.
- The distance metric \(D(\cdot, \cdot)\) is the mean absolute error between the estimated heart rates of the two signals.
Decision module (adaptive noise-aware label selection):
- Ideal case: \(\arg\max_i X_i = \arg\min_i Y_i\) (the maximum noise distance coincides with the minimum heartbeat distance); the corresponding signal is selected directly.
- Conflict case: \(\arg\max_i X_i \neq \arg\min_i Y_i\); the noise distance of \(\Phi_{\arg\min_i Y_i}\) (the signal with minimum heartbeat distance) is compared against the noise distance of the pre-trained heartbeat signal, \(D(p, q)\):
  - If greater, select \(\Phi_{\arg\min_i Y_i}\) (the conventional method is preferred).
  - Otherwise, select the pre-trained heartbeat signal \(p\) (the deep learning method is preferred).
Loss & Training¶
- Both Stage 1 and Stage 2 use the NCT Loss; the only difference is the source of pseudo-labels.
- Both stages train the heartbeat extractor and the noise extractor.
- Optimizer: AdamW, learning rate 1e-4, trained for 200 epochs.
- Evaluation uses 10-second windows.
Key Experimental Results¶
Main Results¶
Two datasets are used: Equipleth (public, 91 subjects, 550 recordings) and RHB (self-collected, 80 subjects, 240 recordings).
| Method | Type | Equipleth MAE↓ | Equipleth r↑ | RHB MAE↓ | RHB r↑ |
|---|---|---|---|---|---|
| FFT-based RF | Traditional | 13.51 | 0.24 | 12.25 | 0.26 |
| Equipleth RF | Supervised | 2.18 | 0.89 | 3.19 | 0.82 |
| VitaNet | Supervised | 3.14 | 0.77 | 5.28 | 0.66 |
| mmFormer | Supervised | 6.50 | 0.52 | 8.89 | 0.28 |
| Radar-APLANC | Unsupervised | 3.95 | 0.64 | 3.92 | 0.77 |
Cross-dataset evaluation (generalization):
| Method | RHB→Equipleth MAE | Equipleth→RHB MAE |
|---|---|---|
| Equipleth RF | 4.53 (+107.8%) | 2.68 |
| VitaNet | 7.43 (+136.6%) | 2.38 |
| Radar-APLANC | 4.10 (+3.8%) | 3.52 |
Ablation Study¶
| Stage 1 Config | Stage 2 Config | MAE↓ | RMSE↓ | r↑ |
|---|---|---|---|---|
| Noise matrix only | — | 34.48 | 38.34 | 0.01 |
| Pseudo-label only | — | 8.94 | 15.88 | 0.30 |
| Noise + pseudo-label | — | 4.40 | 9.89 | 0.63 |
| Noise + pseudo-label | Augmented pseudo-label only | 7.42 | 13.61 | 0.38 |
| Noise + pseudo-label | Augmented pseudo-label + noise | 3.95 | 9.72 | 0.64 |
Augmented pseudo-label generator ablation:
| Pre-trained heartbeat | Conventional heartbeat | Noise signal | MAE↓ |
|---|---|---|---|
| ✓ | — | — | 4.56 |
| ✓ | ✓ | — | 8.75 |
| — | ✓ | ✓ | 14.48 |
| ✓ | ✓ | ✓ | 3.95 |
Key Findings¶
- Noise matrix is critical: Using pseudo-labels alone yields MAE = 8.94; adding noise contrast reduces it to 4.40, a reduction of more than half. This validates the effectiveness of leveraging the intrinsic noise structure of radar range matrices.
- Two stages are complementary: Stage 1 reduces MAE from 8.94 to 4.40; Stage 2 further reduces it to 3.95. Augmented pseudo-labels require all three signal types to function optimally.
- Cross-dataset stability: The unsupervised method exhibits an MAE variation of only 0.4 bpm across datasets, whereas supervised methods vary by over 100%.
- Skin-tone fairness: Performance disparity between light and dark skin tones (fairness metric) is substantially smaller for radar methods than for RGB methods; the unsupervised radar method achieves fairness on par with supervised radar methods.
Highlights & Insights¶
- First unsupervised radar heartbeat sensing framework: Fills an important gap by simultaneously addressing the annotation bottleneck and preserving radar's privacy and robustness advantages.
- Paradigm shift: noise as a resource: Conventionally regarded as something to be eliminated, noise is repurposed here as a source of negative samples for contrastive learning, turning a liability into an asset.
- Two-stage progressive refinement: Pre-training with coarse pseudo-labels followed by fine-tuning with refined pseudo-labels constitutes a generalizable bootstrapping strategy.
- Practical skin-tone fairness: Against the backdrop of growing attention to AI fairness, the unsupervised radar approach demonstrates more equitable sensing performance than RGB-based methods.
- New dataset contribution: The RHB dataset (80 subjects) will be open-sourced, facilitating community research.
Limitations & Future Work¶
- An MAE gap of approximately 1.8 bpm remains compared to the best supervised method; pseudo-label quality is the bottleneck.
- Evaluation is limited to seated scenarios at 0.5–1 m distance; motion interference and longer-range settings are untested.
- Only heart rate estimation is validated; extension to more complex physiological indicators such as respiratory rate or heart rate variability has not been explored.
- The decision rules in Stage 2's augmented pseudo-label generator are heuristic-based and could be replaced by a more end-to-end approach.
- No cross-modal comparison experiments with unsupervised video-domain methods are conducted.
Related Work & Insights¶
- Unsupervised rPPG from video (Gideon 2021, Sun 2022) provides the conceptual basis for applying contrastive learning to physiological signals, but their spatial-temporal positive/negative sample construction is not applicable to radar data.
- Equipleth (Vilesov 2022) provides a radar-video multimodal dataset and supervised baselines.
- Self-supervised radar methods (Song 2022, Zhang 2025) explore self-supervised learning for radar but still require annotated fine-tuning and thus do not constitute truly unsupervised approaches.
Rating¶
- Novelty: ⭐⭐⭐⭐⭐ (first unsupervised radar heartbeat framework; noise contrastive idea is original)
- Experimental Thoroughness: ⭐⭐⭐⭐ (two datasets, cross-dataset evaluation, comprehensive fairness analysis)
- Writing Quality: ⭐⭐⭐⭐ (clear structure, well-motivated)
- Value: ⭐⭐⭐⭐⭐ (addresses a practical pain point; new dataset and code are open-sourced)