
🔍 Information Retrieval & RAG

🎞️ ECCV2024 · 3 paper notes


Multi-Label Cluster Discrimination for Visual Representation Learning

This work proposes MLCD (Multi-Label Cluster Discrimination), which assigns multiple cluster pseudo-labels to each image and trains with a disambiguated multi-label classification loss. With ViT models pre-trained on LAION-400M, MLCD outperforms OpenCLIP, FLIP, and UNICOM on linear probe, zero-shot classification, and retrieval tasks.
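
As a rough illustration of giving one image several cluster pseudo-labels, the sketch below marks the k cluster centers most similar to an image embedding as positive classes. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the use of cosine similarity, the toy centroids, and the value of `k` are all hypothetical, and the disambiguated loss itself is not reproduced here.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def multi_label_targets(embedding, centroids, k=2):
    """Return a multi-hot vector marking the k centroids most similar to the embedding.

    Hypothetical helper: choosing the top-k nearest cluster centers is an
    assumption made for illustration only.
    """
    sims = [cosine(embedding, c) for c in centroids]
    # Indices of the k most similar centroids, highest similarity first.
    top = sorted(range(len(centroids)), key=lambda i: sims[i], reverse=True)[:k]
    targets = [0] * len(centroids)
    for i in top:
        targets[i] = 1
    return targets


# Toy 2-D cluster centers (assumed values, for illustration only).
centroids = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0], [-1.0, 0.0]]
print(multi_label_targets([1.0, 0.2], centroids, k=2))
```

With `k=1` this reduces to ordinary single-label cluster discrimination; larger `k` lets an image that sits between clusters count as a positive for each of them.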

OneRestore: A Universal Restoration Framework for Composite Degradation

OneRestore is a Transformer-based universal image restoration framework. Using a scene-descriptor-guided cross-attention mechanism and a composite degradation restoration loss, a single model adaptively handles low-light, haze, rain, snow, and arbitrary combinations of them, and supports controllable restoration in both text and visual modes.

Towards Open-Ended Visual Recognition with Large Language Model

This paper proposes the OmniScient Model (OSM), a generative mask classifier built from a frozen CLIP-ViT, a trainable MaskQ-Former, and a frozen LLM (Vicuna-7B). It shifts visual recognition from "selecting categories from a predefined vocabulary" to "directly generating category names," removing the dependency on a predefined vocabulary during both training and testing. It outperforms DaTaSeg by +4.3 PQ on COCO panoptic segmentation.