📐 Optimization & Theory

💬 ACL2026 · 1 paper note

CLewR: Curriculum Learning with Restarts for Machine Translation Preference Learning

The paper proposes CLewR (Curriculum Learning with Restarts), a strategy for preference optimization training that sorts the data from easy to hard and restarts the curriculum at each epoch. The authors report that this mitigates catastrophic forgetting and consistently improves machine translation performance across multiple model families (Gemma2, Qwen2.5, Llama3.1) and preference optimization algorithms (DPO, CPO, ARPO).
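
The ordering idea in this summary can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function name `curriculum_with_restarts`, the `difficulty` scoring callable, and the toy scores are all hypothetical, and the summary does not say how CLewR actually measures difficulty. The sketch only shows one reading of "restart at each epoch": every epoch replays the same easy-to-hard order from the beginning.

```python
from typing import Callable, Iterator, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def curriculum_with_restarts(
    examples: Sequence[T],
    difficulty: Callable[[T], float],  # hypothetical scoring function (assumption)
    num_epochs: int,
    batch_size: int,
) -> Iterator[Tuple[int, List[T]]]:
    """Yield (epoch, batch) pairs, easy to hard, restarting every epoch."""
    # Sort once: lower difficulty score means an easier example.
    ordered = sorted(examples, key=difficulty)
    for epoch in range(num_epochs):
        # Restart: each epoch begins again from the easiest examples.
        for start in range(0, len(ordered), batch_size):
            yield epoch, ordered[start:start + batch_size]


# Toy preference examples as (id, difficulty score); the scores are made up.
toy_data = [("c", 0.9), ("a", 0.1), ("b", 0.5)]
schedule = list(
    curriculum_with_restarts(toy_data, lambda ex: ex[1], num_epochs=2, batch_size=2)
)
```

In a real training loop, each yielded batch would be passed to a DPO, CPO, or ARPO update step instead of being collected into a list.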