🎁 Recommender Systems

🔬 ICLR2026 · 10 paper notes

C2AL: Cohort-Contrastive Auxiliary Learning for Large-scale Recommendation Systems: This paper proposes C2AL (Cohort-Contrastive Auxiliary Learning), which uses a data-driven procedure to identify user cohort pairs with maximal distributional divergence and constructs contrastive auxiliary binary classification tasks on them to regularize the shared encoder. This transforms FM attention weights from sparse to dense, mitigating representation bias for minority cohorts in large-scale recommendation systems. The approach is validated on 6 Meta production models with billions of data points.
CollectiveKV: Decoupling and Sharing Collaborative Information in Sequential Recommendation: Observing that KV caches in sequential recommendation are highly similar across users (a collaborative signal), this paper proposes CollectiveKV, which decomposes KV into a low-dimensional user-specific component and a high-dimensional shared component retrieved from a global KV pool, achieving a compression ratio of 0.8% with no performance degradation.
From Evaluation to Defense: Advancing Safety in Video Large Language Models: This work constructs VideoSafetyEval (11.4k video-query pairs covering 19 risk categories), revealing that the video modality degrades safety performance by 34.2%, and proposes VideoSafety-R1, a three-stage framework (Alarm Token + SFT + Safety-guided GRPO) that improves defense success rate by 71.1% on VSE-HH.
GoalRank: Group-Relative Optimization for a Large Ranking Model: This paper theoretically proves that for any Multi-Generator-Evaluator (Multi-G-E) ranking system, there exists a larger generator-only model that approximates the optimal policy with smaller error and satisfies scaling laws. Building on this result, the paper proposes GoalRank, a framework that uses a reward model to construct a group-relative reference policy for training a large generator-only ranking model, achieving significant improvements over SOTA in online A/B testing.
In Agents We Trust, but Who Do Agents Trust? Latent Source Preferences Steer LLM Generations: Through large-scale controlled experiments on 12 LLMs from 6 providers across three domains (news, academia, and e-commerce), this paper shows that LLMs exhibit systematic latent source preferences: when content is semantically identical, merely swapping source labels significantly alters model selection behavior, and this preference cannot be eliminated through prompt engineering.
ProPerSim: Developing Proactive and Personalized AI Assistants through User-Assistant Simulation: This paper proposes ProPerSim, a simulation framework that models daily behaviors of 32 user personas grounded in the Big Five personality model within the Smallville household environment. The AI assistant makes proactive recommendation decisions every 2.5 minutes and learns user preferences via DPO, improving user satisfaction from 2.2/4 to 3.3/4 over a 14-day simulation. The paper presents this as the first empirical validation of jointly achieving proactivity and personalization.
RAE: A Neural Network Dimensionality Reduction Method for Nearest Neighbors Preservation in Vector Search: This paper proposes RAE (Regularized Auto-Encoder), a dimensionality reduction method based on a linear autoencoder with Frobenius norm regularization. The authors theoretically prove that the regularization coefficient \(\lambda\) constrains the condition number \(\kappa(W)\) of the encoder matrix via the Rayleigh quotient property, thereby bounding the norm distortion rate and preserving k-NN structure. RAE consistently outperforms PCA, UMAP, MDS, and ISOMAP on four datasets, achieving at least 12% higher k-NN preservation accuracy than PCA under cosine distance, with training requiring only 8 seconds and inference at millisecond latency.
Rejuvenating Cross-Entropy Loss in Knowledge Distillation for Recommender Systems: This paper theoretically demonstrates that, in knowledge distillation (KD) for recommender systems, CE loss maximizes a lower bound of NDCG only when a closure assumption is satisfied: the candidate subset must contain the student's top-ranked items. However, the actual KD objective is to distill the ranking of the teacher's top items, and these two requirements conflict, which explains why vanilla CE performs poorly. Accordingly, the paper proposes RCE-KD: the teacher's top-K items are split into two groups based on whether they appear in the student's top-K, handled respectively by exact CE and sampling-approximated closure CE, with an adaptive fusion weight that evolves dynamically throughout training.
Search Arena: Analyzing Search-Augmented LLMs: This paper presents Search Arena, the first large-scale human preference dataset for search-augmented LLMs (24,069 conversations and 12,652 preference votes across 71 languages). Key findings include: user preference is positively influenced by citation quantity even when citations do not support the claims; community-driven platforms are preferred over Wikipedia; search augmentation does not degrade general chat performance, whereas general-purpose LLMs degrade significantly in search scenarios.
Token-Efficient Item Representation via Images for LLM Recommender Systems: This paper proposes I-LLMRec, which leverages item images in place of verbose textual descriptions to represent item semantics in recommender systems. Through a Recommendation-oriented Image-Semantic Alignment (RISA) module and a Recommendation-oriented Embedding Retrieval Inference (RERI) module, the method represents each item with a single token while preserving rich semantics, achieving approximately 2.93× inference speedup and surpassing text-description-based methods in recommendation performance.
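
The grouping step described in the RCE-KD entry above (splitting the teacher's top-K items by whether they also appear in the student's top-K) can be sketched in a few lines. This is only an illustrative sketch under stated assumptions, not the paper's implementation: the function name `split_teacher_top_k`, the use of NumPy, and the toy score vectors are all hypothetical, and the two CE losses and the adaptive fusion weight are deliberately not shown.

```python
import numpy as np


def split_teacher_top_k(teacher_scores, student_scores, k):
    """Partition the teacher's top-k item ids by membership in the student's top-k.

    Hypothetical helper for illustration only; the paper's actual procedure
    may differ in details such as tie-breaking or candidate filtering.
    """
    # Item ids sorted by descending score (stable sort keeps ties deterministic).
    teacher_top = np.argsort(-teacher_scores, kind="stable")[:k]
    student_top = set(np.argsort(-student_scores, kind="stable")[:k].tolist())

    # Teacher top-k items that the student also ranks in its top-k.
    shared = [int(i) for i in teacher_top if int(i) in student_top]
    # Teacher top-k items that fall outside the student's top-k.
    teacher_only = [int(i) for i in teacher_top if int(i) not in student_top]
    return shared, teacher_only


# Toy scores over five items (made-up values, not from the paper).
teacher = np.array([0.9, 0.8, 0.1, 0.7, 0.2])
student = np.array([0.6, 0.1, 0.9, 0.8, 0.2])

shared, teacher_only = split_teacher_top_k(teacher, student, k=3)
print(shared)        # items in both top-3 lists
print(teacher_only)  # items only the teacher ranks in its top-3
```

In this toy case the teacher's top-3 is items 0, 1, and 3, and the student's top-3 is items 2, 3, and 0, so items 0 and 3 land in the first group and item 1 in the second. Each group would then be passed to its own loss term as the entry describes.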