Radar-APLANC: Unsupervised Radar-based Heartbeat Sensing via Augmented Pseudo-Label and Noise Contrast¶
Conference: AAAI 2026 arXiv: 2511.08071 Code: https://github.com/RadarHRSensing/Radar-APLANC Area: Other Keywords: radar heartbeat sensing, unsupervised learning, pseudo-label, noise contrastive learning, FMCW radar
TL;DR¶
This paper proposes Radar-APLANC, the first unsupervised learning framework for radar-based heartbeat sensing. Using a noise contrastive triplet (NCT) loss and an augmented pseudo-label generator, it trains in two unsupervised stages without expensive physiological signal annotations, approaching the performance of supervised methods.
Background & Motivation¶
Value and Challenges of Radar-based Heartbeat Sensing¶
FMCW radar enables contactless heartbeat sensing by detecting sub-millimeter (0.1–0.5 mm) chest wall displacements, offering unique advantages in privacy preservation, environmental robustness, and continuous monitoring. However:
Noise sensitivity of traditional methods: Phase extraction and unwrapping-based approaches suffer severe performance degradation under motion artifacts, multipath interference, and low SNR conditions, with phase wrapping ambiguity and noise sensitivity being fundamental limitations.
Annotation bottleneck for supervised methods: Deep learning methods (e.g., Equipleth RF, VitaNet, CardiacMamba) offer better noise robustness but require large-scale, high-quality physiological signal annotations (e.g., synchronized PPG signals), whose collection cost is prohibitive and limits training data scalability.
Non-transferability of video-domain methods: Unsupervised physiological monitoring methods from the video domain (e.g., contrastive learning paradigms) exist, but radar heartbeat signals have lower SNR, a different representation (chest wall motion vs. facial color changes), and conventional positive/negative sample construction strategies fail under strong noise.
Core Insight¶
Radar range matrices inherently contain a contrastive structure between "heartbeat range bins" and "noise range bins"—this intrinsic signal-noise separation can be exploited to construct positive and negative samples without external annotations.
Method¶
Overall Architecture¶
Radar-APLANC is a two-stage unsupervised framework:
- Stage 1: Pseudo-labels are generated using conventional radar methods; the model is pre-trained with the noise contrastive triplet (NCT) loss.
- Stage 2: An augmented pseudo-label generator improves pseudo-label quality via quality assessment and adaptive noise-aware label selection, enabling further fine-tuning.
Preliminaries¶
Range matrix acquisition: An FMCW radar transmits a linearly frequency-modulated signal \(s(t)\); the received reflected signal \(u(t)\) is IQ-demodulated to obtain the intermediate frequency signal \(m(t)\). FFT is applied to each chirp's IF signal to yield the range profile \(M_n[f]\), and \(N\) chirps are concatenated to form the range matrix \(M \in \mathbb{R}^{N \times D}\).
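As a rough illustration of how the range matrix is formed, the following numpy sketch applies one range FFT per chirp and stacks the results; the fast-time sampling rate, chirp length, and the single 50 kHz beat tone are made-up numbers, not values from the paper.

```python
import numpy as np

def range_matrix(x: np.ndarray) -> np.ndarray:
    """One range FFT per chirp; rows are chirps (slow time),
    columns are fast-time IF samples within a chirp."""
    return np.fft.rfft(x, axis=1)

# a single static target: a 50 kHz beat tone repeated over N chirps
fs_fast, S, N = 1.28e6, 256, 8            # fast-time rate, samples/chirp, chirps
t = np.arange(S) / fs_fast
x = np.tile(np.cos(2 * np.pi * 50e3 * t), (N, 1))
M = range_matrix(x)                        # shape (N, S//2 + 1)
print(np.argmax(np.abs(M[0])))             # bin 10 = 50 kHz / (fs_fast / S)
```

A static reflector therefore appears as a fixed column (range bin) of the matrix, while its tiny chest-wall motion shows up as phase variation along the rows.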
Basic heartbeat sensing: (1) Select the range bin \(d^*\) with the maximum power ratio (human body location); (2) compute the phase signal of that bin; (3) apply phase unwrapping; (4) apply 0.8–3.0 Hz bandpass filtering to obtain the heartbeat signal \(\Phi(\cdot) \in \mathbb{R}^N\).
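Steps (1)-(4) above can be sketched in numpy as follows; this is a minimal toy, not the paper's implementation: the text selects the bin with the maximum power ratio, which is approximated here by mean power, and the bandpass filter is replaced by a crude FFT mask.

```python
import numpy as np

def bandpass(sig: np.ndarray, fs: float, lo: float = 0.8, hi: float = 3.0) -> np.ndarray:
    """Crude FFT-mask bandpass (stands in for the paper's 0.8-3.0 Hz filter)."""
    F = np.fft.rfft(sig)
    f = np.fft.rfftfreq(len(sig), 1 / fs)
    F[(f < lo) | (f > hi)] = 0.0
    return np.fft.irfft(F, n=len(sig))

def extract_heartbeat(M: np.ndarray, fs: float) -> np.ndarray:
    """Steps (1)-(4): bin selection, phase, unwrapping, heartbeat-band filtering."""
    power = np.mean(np.abs(M) ** 2, axis=0)     # mean power per range bin
    d_star = int(np.argmax(power))              # (1) assumed body location
    phase = np.unwrap(np.angle(M[:, d_star]))   # (2)+(3) unwrapped phase
    return bandpass(phase, fs)                  # (4) keep ~48-180 bpm band

# synthetic range matrix: 1.2 Hz phase modulation on bin 5, noise elsewhere
rng = np.random.default_rng(0)
fs, N, D = 50.0, 1000, 16
t = np.arange(N) / fs
M = 0.1 * (rng.standard_normal((N, D)) + 1j * rng.standard_normal((N, D)))
M[:, 5] += 5.0 * np.exp(1j * 0.5 * np.sin(2 * np.pi * 1.2 * t))
hb = extract_heartbeat(M, fs)
est_hz = np.fft.rfftfreq(N, 1 / fs)[np.argmax(np.abs(np.fft.rfft(hb)))]
print(est_hz)  # ≈ 1.2 Hz, i.e. 72 bpm
```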
Key Designs¶
1. Noise Contrastive Triplet Loss (NCT Loss) — Stage 1¶
Mechanism: The natural contrast between heartbeat bins and noise bins in radar range matrices is exploited to construct self-supervised learning signals.
- Pseudo-label generation: Conventional radar methods extract heartbeat signals from the central range bin \(d^*\); random temporal sampling and power spectral density (PSD) transformation yield the pseudo-label set \(S_{PL}\).
- Positive sample construction: Heartbeat matrices \(M(\cdot, d^* \pm \Delta d)\) from bins within a window around the central bin are fed into a heartbeat extractor, producing predicted heartbeat signals \(p(\cdot)\); PSD transformation yields positive sample set \(S_P\).
- Negative sample construction: Windows from randomly selected non-central bins \(d'\) are treated as noise matrices and fed into a noise extractor to produce noise signals \(q(\cdot)\); PSD transformation yields negative sample set \(S_N\).
NCT Loss:

\[\mathcal{L}_{NCT} = \underbrace{\frac{1}{K^2}\sum_{i,j}\|S_{PL}[i] - S_P[j]\|^2}_{\text{positive term: pull heartbeat toward pseudo-label}} - \underbrace{\frac{1}{K^2}\sum_{i,j}\|S_P[i] - S_N[j]\|^2}_{\text{negative term: push heartbeat away from noise}}\]
Design Motivation: Any traditional signal processing method can be regarded as a pseudo-label generator; despite being noisy, its output still carries heartbeat information. Range bins at non-body positions predominantly contain background noise, naturally serving as negative samples. PSD transformation enables more stable frequency-domain comparison.
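The NCT loss can be written down directly from its definition; this numpy sketch assumes each of the three sets is K PSD vectors stacked as a (K, F) array, with names of our choosing rather than the authors' code.

```python
import numpy as np

def nct_loss(S_PL: np.ndarray, S_P: np.ndarray, S_N: np.ndarray) -> float:
    """Noise contrastive triplet loss over PSD sets of shape (K, F):
    pull predicted-heartbeat PSDs toward pseudo-label PSDs (positive term),
    push them away from noise PSDs (negative term), averaged over K^2 pairs."""
    K = S_PL.shape[0]
    pos = np.sum((S_PL[:, None, :] - S_P[None, :, :]) ** 2) / K**2
    neg = np.sum((S_P[:, None, :] - S_N[None, :, :]) ** 2) / K**2
    return pos - neg

# toy check: predictions matching the pseudo-labels and far from
# the noise PSDs yield a negative (good) loss value
print(nct_loss(np.ones((2, 4)), np.ones((2, 4)), np.zeros((2, 4))))  # -4.0
```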
2. Augmented Pseudo-Label Generator — Stage 2¶
Mechanism: The Stage 1 pre-trained model is used to assess and select higher-quality pseudo-labels.
The module consists of two sub-components:
Quality assessment module:
- Conventional heartbeat signals \(\{\Phi_1, \ldots, \Phi_{2\Delta d+1}\}\) are extracted from the \(2\Delta d + 1\) range bins within the heartbeat window.
- Noise distance \(X_i = D(\Phi_i, q)\): distance from the pre-trained noise signal; larger values indicate better signal quality.
- Heartbeat distance \(Y_i = D(\Phi_i, p)\): distance from the pre-trained heartbeat signal; smaller values indicate better signal quality.
- The distance metric \(D(\cdot, \cdot)\) is the mean absolute error between the estimated heart rates of the two signals.
Decision module (adaptive noise-aware label selection):
- Ideal case: \(\arg\max_i X_i = \arg\min_i Y_i\) (the maximum noise distance coincides with the minimum heartbeat distance); the corresponding signal is selected directly.
- Conflict case: \(\arg\max_i X_i \neq \arg\min_i Y_i\); the noise distance of \(\Phi_{\arg\min_i Y_i}\) (the signal with minimum heartbeat distance) is compared against the noise distance of the pre-trained heartbeat signal, \(D(p, q)\):
  - If greater, select \(\Phi_{\arg\min_i Y_i}\) (the conventional method is preferred).
  - Otherwise, select the pre-trained heartbeat signal \(p\) (the deep learning method is preferred).
Loss & Training¶
- Both Stage 1 and Stage 2 use the NCT Loss; the only difference is the source of pseudo-labels.
- Both stages train the heartbeat extractor and the noise extractor.
- Optimizer: AdamW, learning rate 1e-4, trained for 200 epochs.
- Evaluation uses 10-second windows.
Key Experimental Results¶
Main Results¶
Two datasets are used: Equipleth (public, 91 subjects, 550 recordings) and RHB (self-collected, 80 subjects, 240 recordings).
| Method | Type | Equipleth MAE↓ | Equipleth r↑ | RHB MAE↓ | RHB r↑ |
|---|---|---|---|---|---|
| FFT-based RF | Traditional | 13.51 | 0.24 | 12.25 | 0.26 |
| Equipleth RF | Supervised | 2.18 | 0.89 | 3.19 | 0.82 |
| VitaNet | Supervised | 3.14 | 0.77 | 5.28 | 0.66 |
| mmFormer | Supervised | 6.50 | 0.52 | 8.89 | 0.28 |
| Radar-APLANC | Unsupervised | 3.95 | 0.64 | 3.92 | 0.77 |
Cross-dataset evaluation (generalization):
| Method | RHB→Equipleth MAE | Equipleth→RHB MAE |
|---|---|---|
| Equipleth RF | 4.53 (+107.8%) | 2.68 |
| VitaNet | 7.43 (+136.6%) | 2.38 |
| Radar-APLANC | 4.10 (+3.8%) | 3.52 |
Ablation Study¶
| Stage 1 Config | Stage 2 Config | MAE↓ | RMSE↓ | r↑ |
|---|---|---|---|---|
| Noise matrix only | — | 34.48 | 38.34 | 0.01 |
| Pseudo-label only | — | 8.94 | 15.88 | 0.30 |
| Noise + pseudo-label | — | 4.40 | 9.89 | 0.63 |
| Noise + pseudo-label | Augmented pseudo-label only | 7.42 | 13.61 | 0.38 |
| Noise + pseudo-label | Augmented pseudo-label + noise | 3.95 | 9.72 | 0.64 |
Augmented pseudo-label generator ablation:
| Pre-trained heartbeat | Conventional heartbeat | Noise signal | MAE↓ |
|---|---|---|---|
| ✓ | — | — | 4.56 |
| ✓ | ✓ | — | 8.75 |
| — | ✓ | ✓ | 14.48 |
| ✓ | ✓ | ✓ | 3.95 |
Key Findings¶
- Noise matrix is critical: Using pseudo-labels alone yields MAE = 8.94; adding noise contrast reduces it to 4.40, a reduction of more than half. This validates the effectiveness of leveraging the intrinsic noise structure of radar range matrices.
- Two stages are complementary: Stage 1 reduces MAE from 8.94 to 4.40; Stage 2 further reduces it to 3.95. Augmented pseudo-labels require all three signal types to function optimally.
- Cross-dataset stability: The unsupervised method exhibits an MAE variation of only 0.4 bpm across datasets, whereas supervised methods vary by over 100%.
- Skin-tone fairness: Performance disparity between light and dark skin tones (fairness metric) is substantially smaller for radar methods than for RGB methods; the unsupervised radar method achieves fairness on par with supervised radar methods.
Highlights & Insights¶
- First unsupervised radar heartbeat sensing framework: Fills an important gap by simultaneously addressing the annotation bottleneck and preserving radar's privacy and robustness advantages.
- Paradigm shift: noise as a resource: Conventionally regarded as something to be eliminated, noise is repurposed here as a source of negative samples for contrastive learning, turning a liability into an asset.
- Two-stage progressive refinement: Pre-training with coarse pseudo-labels followed by fine-tuning with refined pseudo-labels constitutes a generalizable bootstrapping strategy.
- Practical skin-tone fairness: Against the backdrop of growing attention to AI fairness, the unsupervised radar approach demonstrates more equitable sensing performance than RGB-based methods.
- New dataset contribution: The RHB dataset (80 subjects) will be open-sourced, facilitating community research.
Limitations & Future Work¶
- An MAE gap of approximately 1.8 bpm remains compared to the best supervised method; pseudo-label quality is the bottleneck.
- Evaluation is limited to seated scenarios at 0.5–1 m distance; motion interference and longer-range settings are untested.
- Only heart rate estimation is validated; extension to more complex physiological indicators such as respiratory rate or heart rate variability has not been explored.
- The decision rules in Stage 2's augmented pseudo-label generator are heuristic-based and could be replaced by a more end-to-end approach.
- No cross-modal comparison experiments with unsupervised video-domain methods are conducted.
Related Work & Insights¶
- Unsupervised rPPG from video (Gideon 2021, Sun 2022) provides the conceptual basis for applying contrastive learning to physiological signals, but their spatial-temporal positive/negative sample construction is not applicable to radar data.
- Equipleth (Vilesov 2022) provides a radar-video multimodal dataset and supervised baselines.
- Self-supervised radar methods (Song 2022, Zhang 2025) explore self-supervised learning for radar but still require annotated fine-tuning and thus do not constitute truly unsupervised approaches.
Rating¶
- Novelty: ⭐⭐⭐⭐⭐ (first unsupervised radar heartbeat framework; noise contrastive idea is original)
- Experimental Thoroughness: ⭐⭐⭐⭐ (two datasets, cross-dataset evaluation, comprehensive fairness analysis)
- Writing Quality: ⭐⭐⭐⭐ (clear structure, well-motivated)
- Value: ⭐⭐⭐⭐⭐ (addresses a practical pain point; new dataset and code are open-sourced)