🔄 Self-Supervised Learning

💬 ACL2025 · 7 paper notes

📌 Same area in other venues: 📷 CVPR2026 (92) · 🔬 ICLR2026 (81) · 💬 ACL2026 (1) · 🧪 ICML2026 (28) · 🤖 AAAI2026 (16) · 🧠 NeurIPS2025 (35)

🔥 Top topics: Self-Supervised Learning ×4

AnalyticKWS: Towards Exemplar-Free Analytic Class Incremental Learning for Small-footprint Keyword Spotting: This paper proposes AnalyticKWS, an exemplar-free incremental learning method for keyword spotting. It freezes the feature extractor and updates the classifier analytically via recursive least squares, outperforming all rehearsal-based methods on the GSC and SC-100 datasets with extremely low training time and memory overhead.
Improving Low-Resource Morphological Inflection via Self-Supervised Objectives: This paper systematically evaluates 13 self-supervised auxiliary objectives (autoencoding, CMLM, T5-style, etc.) on extremely low-resource morphological inflection. It finds that autoencoding works best when unlabeled data is extremely scarce, whereas character-level MLM works better as more unlabeled data becomes available. Mask sampling based on morpheme boundaries is identified as the most promising direction.
Contrastive Learning on LLM Back Generation Treebank for Cross-domain Constituency Parsing: This paper proposes an LLM Back Generation method that takes incomplete cross-domain constituency trees as input, prompting the LLM to complete the missing words to generate a treebank. It also designs a span-level contrastive learning pre-training strategy to achieve state-of-the-art performance in cross-domain constituency parsing.
Magnet: Augmenting Generative Decoders with Representation Learning and Infilling Capabilities: This paper proposes Magnet, a method that adapts decoder-only LLMs to serve simultaneously as text encoders and infilling models, using a hybrid attention mechanism (bidirectional + causal) and three self-supervised objectives (masked prediction, contrastive learning, and missing span generation). It outperforms specialized methods such as LLM2Vec on token-level and sentence-level representation learning tasks while avoiding the severe text repetition caused by bidirectionality.
QAEncoder: Towards Aligned Representation Learning in Question Answering Systems: This paper proposes QAEncoder, a training-free method that estimates the expected embedding of the queries corresponding to a document and uses it as a proxy for the document representation, combined with a document fingerprint to maintain discriminability. This improves bge-large from 58.5 to 61.8 NDCG@10 on BEIR with zero additional storage or latency overhead.
SHuBERT: Self-Supervised Sign Language Representation Learning via Multi-Stream Cluster Prediction: This paper proposes SHuBERT (Sign Hidden-Unit BERT), which transfers the masked cluster prediction paradigm of the self-supervised speech model HuBERT to sign language video. The hand, face, and body pose streams are clustered separately, and the model simultaneously predicts the cluster labels of masked frames in each stream. Pre-trained on approximately 984 hours of ASL video, it achieves state-of-the-art results on public benchmarks across translation, isolated recognition, and fingerspelling detection tasks.
WhiSPA: Semantically and Psychologically Aligned Whisper with Self-Supervised Contrastive and Student-Teacher Learning: This paper proposes WhiSPA, which aligns the latent space of the Whisper audio encoder with SBERT semantic representations and psychological dimensions (emotion, personality) through contrastive learning. This eliminates the dependency on an additional text LM in speech processing and reduces error by 73-84% on psychological evaluation tasks.
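
As an illustration of the recursive least squares classifier update mentioned in the AnalyticKWS note, here is a minimal pure-Python sketch. It is a generic textbook RLS update for a ridge-regularised linear classifier on top of frozen features, not the authors' implementation; the function names (`rls_init`, `rls_update`, `predict`), the regularisation value, and the fixed number of classes are all assumptions made for this example.

```python
def rls_init(dim, num_classes, reg=0.1):
    """Zero weights and P = I / reg, the inverse of the regularised feature autocorrelation."""
    W = [[0.0] * num_classes for _ in range(dim)]
    P = [[(1.0 / reg if i == j else 0.0) for j in range(dim)] for i in range(dim)]
    return W, P


def rls_update(W, P, x, y):
    """Fold one (feature vector x, one-hot label y) pair into W and P in place.

    No earlier samples are stored or revisited: everything the closed-form
    ridge solution needs from past data is summarised in P.
    """
    dim, num_classes = len(W), len(W[0])
    # P is symmetric, so P @ x and x^T @ P contain the same values.
    Px = [sum(P[i][j] * x[j] for j in range(dim)) for i in range(dim)]
    denom = 1.0 + sum(x[i] * Px[i] for i in range(dim))
    gain = [v / denom for v in Px]
    # Prediction error of the current classifier on this sample.
    pred = [sum(x[i] * W[i][c] for i in range(dim)) for c in range(num_classes)]
    err = [y[c] - pred[c] for c in range(num_classes)]
    # Weight update: W <- W + gain * err^T
    for i in range(dim):
        for c in range(num_classes):
            W[i][c] += gain[i] * err[c]
    # Sherman-Morrison update: P <- P - gain * (x^T P)
    for i in range(dim):
        for j in range(dim):
            P[i][j] -= gain[i] * Px[j]
    return W, P


def predict(W, x):
    """Return the index of the class with the highest linear score."""
    scores = [sum(x[i] * W[i][c] for i in range(len(W))) for c in range(len(W[0]))]
    return scores.index(max(scores))


# Toy usage: two 2-D "frozen features", one per keyword class.
W, P = rls_init(dim=2, num_classes=2, reg=0.1)
rls_update(W, P, [1.0, 0.0], [1.0, 0.0])
rls_update(W, P, [0.0, 1.0], [0.0, 1.0])
print(predict(W, [1.0, 0.0]), predict(W, [0.0, 1.0]))
```

Because each update only touches `W` and `P`, the memory cost is independent of how many samples have been seen, which is the property that makes this family of updates attractive for exemplar-free incremental learning. Handling newly arriving classes would additionally require widening `W` with new columns, which this sketch omits.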