ARK: Answer-Centric Retriever Tuning via KG-augmented Curriculum Learning
Conference: ACL 2026 · arXiv: 2511.16326 · Code: GitHub · Area: Graph Learning
Keywords: Answer-Centric Retrieval, Knowledge Graph Augmentation, Curriculum Learning, Contrastive Learning, Long-Context RAG
TL;DR
ARK filters positive samples with a three-dimensional answer sufficiency score (forward, backward, and retriever alignment) and mines progressively harder negatives from LLM-constructed knowledge-graph subgraphs for curriculum contrastive learning, gaining +14.5% F1 on average across 10 datasets.
Background & Motivation
Key Challenge: the gap between the retriever's training objective (query-document similarity) and RAG's ultimate goal (generating correct answers).
Core Idea: use augmented queries generated from KG subgraphs to mine progressively harder negatives for curriculum contrastive learning, teaching the retriever to distinguish "sufficient" evidence from "seemingly relevant but insufficient" evidence.
Method
Key Designs
- Three-Dimensional Answer Sufficiency Scoring: forward alignment \(S_f\) scores whether a chunk suffices to generate the answer; backward alignment \(S_b\) scores whether the question can be reconstructed from the answer plus the chunk; retriever alignment \(S_v\) is the cosine similarity under the original retriever's parameters (scoring sketch below).
- KG-Driven Hard Negative Mining: a large subgraph yields a broad augmented query \(Q_L^{aug}\) that generates easier negatives, while a small subgraph yields a focused \(Q_S^{aug}\) that generates harder ones, since more focused subgraphs produce queries closer to the correct answer's "semantic neighborhood" (mining sketch below).
- Curriculum Contrastive Learning: a three-stage curriculum progresses from in-batch random negatives to hard negatives mined from \(Q_L^{aug}\), then from \(Q_S^{aug}\) (loss sketch below).
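
To make the positive-filtering step concrete, here is a minimal sketch of the three-dimensional scoring. It assumes \(S_f\) and \(S_b\) arrive as precomputed LLM-judge scores in [0, 1] and that the three signals are combined by a simple weighted average with a threshold; the equal weights, the 0.7 threshold, and the function names are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def sufficiency_score(chunk_emb: np.ndarray, query_emb: np.ndarray,
                      s_f: float, s_b: float,
                      weights=(1 / 3, 1 / 3, 1 / 3)) -> float:
    """Combine forward (S_f), backward (S_b), and retriever (S_v) alignment.

    s_f, s_b: LLM-judged scores (forward: chunk -> answer;
    backward: answer + chunk -> question). S_v is the cosine similarity
    under the original, frozen retriever. The equal-weight average is
    an assumption for illustration.
    """
    s_v = float(chunk_emb @ query_emb
                / (np.linalg.norm(chunk_emb) * np.linalg.norm(query_emb)))
    w_f, w_b, w_v = weights
    return w_f * s_f + w_b * s_b + w_v * s_v

def filter_positives(scored_chunks: list[dict], threshold: float = 0.7) -> list[dict]:
    """Keep only chunks whose combined score clears a (hypothetical) threshold."""
    return [c for c in scored_chunks if c["score"] >= threshold]
```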
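The mining step can be sketched under loose assumptions: the LLM-built KG is a flat list of (head, relation, tail) triples, `llm_generate` and `retrieve` are caller-supplied hooks, and the hop radii (2 vs. 1) standing in for "large" vs. "small" subgraphs are hypothetical; the paper's actual subgraph construction is not reproduced here.

```python
from typing import Callable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (head, relation, tail)

def subgraph_around(triples: List[Triple], seeds: Set[str], hops: int) -> List[Triple]:
    """Collect triples within `hops` of the answer entities (naive BFS expansion)."""
    frontier, kept = set(seeds), set()
    for _ in range(hops):
        new = {(h, r, t) for h, r, t in triples if h in frontier or t in frontier}
        kept |= new
        frontier |= {e for h, _, t in new for e in (h, t)}
    return sorted(kept)

def mine_hard_negatives(triples: List[Triple], answer_entities: Set[str],
                        llm_generate: Callable[[str], str],
                        retrieve: Callable[[str, int], List[str]],
                        gold_chunks: Set[str], k: int = 5):
    """Large subgraph -> Q_L^aug (easier negatives); small subgraph -> Q_S^aug (harder)."""
    prompt = "Write a question answerable from these facts: {}"
    q_large = llm_generate(prompt.format(subgraph_around(triples, answer_entities, hops=2)))
    q_small = llm_generate(prompt.format(subgraph_around(triples, answer_entities, hops=1)))
    # Chunks retrieved for the augmented queries, minus the gold evidence,
    # serve as negatives of graded difficulty.
    easier = [c for c in retrieve(q_large, k) if c not in gold_chunks]
    harder = [c for c in retrieve(q_small, k) if c not in gold_chunks]
    return easier, harder
```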
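Finally, a sketch of the curriculum stage of training. The loss here is standard InfoNCE; only the negative pool changes per stage (in-batch, then negatives mined from \(Q_L^{aug}\), then from \(Q_S^{aug}\)). The temperature and the stage schedule are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def info_nce(query: torch.Tensor, positive: torch.Tensor,
             negatives: torch.Tensor, temperature: float = 0.05) -> torch.Tensor:
    """InfoNCE over one sufficiency-filtered positive and a pool of negatives.

    query, positive: (B, d); negatives: (N, d), shared across the batch.
    """
    q = F.normalize(query, dim=-1)
    p = F.normalize(positive, dim=-1)
    n = F.normalize(negatives, dim=-1)
    pos_logit = (q * p).sum(dim=-1, keepdim=True) / temperature  # (B, 1)
    neg_logits = q @ n.T / temperature                           # (B, N)
    logits = torch.cat([pos_logit, neg_logits], dim=1)
    # The positive sits at column 0 for every example.
    labels = torch.zeros(q.size(0), dtype=torch.long, device=q.device)
    return F.cross_entropy(logits, labels)

def negative_pool(stage: int, in_batch, from_q_large, from_q_small):
    """Three-stage schedule: random in-batch -> Q_L^aug -> Q_S^aug negatives."""
    return {1: in_batch, 2: from_q_large, 3: from_q_small}[stage]
```

Note the design consequence: the loss and the retriever stay unchanged across stages; the curriculum's difficulty progression lives entirely in the data pipeline, which is what makes the method plug-and-play.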
Key Experimental Results
- Average F1 gain of +14.5% across 10 datasets
- SOTA on 8/10 datasets (UltraDomain + LongBench)
Highlights & Insights
- Redefines the KG's role in RAG, from "retrieval index" to "training signal generator", drastically reducing the cost of using KGs
- Plug-and-play: no changes to the retriever architecture
Rating
- Novelty: ⭐⭐⭐⭐
- Experimental Thoroughness: ⭐⭐⭐⭐⭐
- Writing Quality: ⭐⭐⭐⭐
- Value: ⭐⭐⭐⭐⭐