💬 ACL2026 · 6 paper notes

CreditDecoding: Accelerating Parallel Decoding in Diffusion Large Language Models with Trace Credit

CreditDecoding is a training-free parallel decoding acceleration method that accumulates token-level historical evidence (trace credit) to boost correct but low-confidence tokens, achieving up to a 5.48x speedup with a +0.48 accuracy gain on LLaDA-8B-Instruct.
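The core idea can be sketched as follows. This is a minimal, hypothetical illustration (the function name, decay/fusion weights, and threshold are assumptions, not the paper's actual formulation): at each denoising step, per-token probabilities are accumulated into a decayed running credit, and the fused score decides which positions to commit.

```python
import numpy as np

def credit_decoding_step(logits, credit, decay=0.9, weight=0.5, threshold=0.9):
    """One hypothetical trace-credit update for parallel diffusion decoding.

    logits: (seq_len, vocab) current-step logits at masked positions.
    credit: (seq_len, vocab) accumulated historical evidence (trace credit).
    Returns the updated credit, the argmax tokens, and a commit mask.
    """
    # Softmax over the vocabulary for the current step.
    probs = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)

    # Accumulate decayed historical evidence (the "trace credit").
    credit = decay * credit + probs

    # Fuse current confidence with normalized credit so a correct but
    # momentarily low-confidence token can still clear the commit bar.
    fused = (1 - weight) * probs + weight * credit / credit.sum(-1, keepdims=True)
    tokens = fused.argmax(axis=-1)
    commit = fused.max(axis=-1) >= threshold
    return credit, tokens, commit
```

Because the update is purely a post-hoc reweighting of the sampler's confidences, no retraining of the diffusion LLM is required.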

Diffusion-CAM: Faithful Visual Explanations for dMLLMs

Diffusion-CAM is the first interpretability method for diffusion-based multimodal LLMs (dMLLMs), extracting structurally valid intermediate representations from denoising trajectories with four post-processing modules (adaptive kernel denoising, distribution-aware confidence gating, contextual background decay, single-instance causal debiasing), significantly outperforming autoregressive CAM baselines on COCO Caption and GranDf.

Learning to Extract Rational Evidence via Reinforcement Learning for Retrieval-Augmented Generation

EviOmni learns to extract rational evidence from retrieved documents through a "reason-then-extract" paradigm: it integrates evidence reasoning and extraction into a unified trajectory, uses knowledge token masking to prevent information leakage, and optimizes via GRPO with verifiable rewards, achieving accuracy that surpasses full-text retrieval at ~38x compression across 5 benchmarks.
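One way to picture knowledge token masking: hide answer strings inside the retrieved document so the extraction trajectory must reason about evidence rather than copy the answer. This is a hypothetical sketch (the function name and word-level masking scheme are assumptions, not the paper's exact mechanism):

```python
import re

def mask_knowledge_tokens(document, answer, mask="[MASK]"):
    """Hypothetical knowledge token masking: replace occurrences of
    answer tokens in a retrieved document so the reasoning step cannot
    leak the answer string directly into the extracted evidence."""
    masked = document
    for tok in answer.split():
        # Case-insensitive replacement of each answer token.
        masked = re.sub(re.escape(tok), mask, masked, flags=re.IGNORECASE)
    return masked

print(mask_knowledge_tokens("Paris is the capital of France.", "Paris"))
# → "[MASK] is the capital of France."
```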

Lost in Diffusion: Uncovering Hallucination Patterns and Failure Modes in Diffusion Large Language Models

This work presents the first systematic comparison of hallucination patterns between diffusion LLMs (dLLMs) and their autoregressive (AR) counterparts, revealing that current dLLMs exhibit a higher hallucination tendency and identifying three diffusion-specific failure modes: premature termination, incomplete denoising, and context intrusion.

Purging the Gray Zone: Latent-Geometric Denoising for Precise Knowledge Boundary Awareness

GeoDe trains linear probes in LLM latent space to construct truth hyperplanes, using sample-to-hyperplane geometric distance as confidence signals to filter high-quality abstention fine-tuning data, effectively eliminating "gray zone" noise near decision boundaries and significantly improving model truthfulness and reliability.
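The geometric filtering step can be sketched in a few lines. This is a minimal illustration under stated assumptions (the probe here is a NumPy-only logistic regression; function names and the margin value are hypothetical): fit a linear probe on hidden states, use signed sample-to-hyperplane distance as confidence, and drop "gray zone" samples near the boundary.

```python
import numpy as np

def fit_truth_hyperplane(X, y, lr=0.1, steps=500):
    """Fit a linear probe (logistic regression via full-batch gradient
    descent) on hidden states X with truthfulness labels y in {0, 1}."""
    w = np.zeros(X.shape[1]); b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid predictions
        g = p - y                               # logistic-loss gradient
        w -= lr * X.T @ g / len(y)
        b -= lr * g.mean()
    return w, b

def geometric_confidence(X, w, b):
    """Signed sample-to-hyperplane distance used as a confidence signal."""
    return (X @ w + b) / np.linalg.norm(w)

def filter_gray_zone(X, w, b, margin=1.0):
    """Keep only samples whose |distance| exceeds the margin, discarding
    ambiguous 'gray zone' samples near the decision boundary."""
    return np.abs(geometric_confidence(X, w, b)) >= margin
```

Filtering on distance rather than raw probe probability makes the threshold scale-invariant with respect to the probe weights.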

Understanding and Mitigating Spurious Signal Amplification in Test-Time Reinforcement Learning for Math Reasoning

This paper systematically analyzes the sources and amplification mechanisms of spurious signals in test-time RL (TTRL) — mid-frequency answers constitute the ambiguous zone that is the primary noise source, while GRPO's within-group normalization amplifies these spurious signals — and proposes DDRL with balanced sampling, fixed advantage values, and consensus offline refinement, achieving a 15.3% relative improvement on Qwen2.5-Math-1.5B.
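The amplification mechanism is easy to see numerically. The sketch below contrasts GRPO's within-group normalization (which divides by the group's reward standard deviation, so a lone spurious pseudo-label match in a low-variance group gets a large advantage) with a fixed-advantage scheme in the spirit of DDRL; the function names and the ±1 values are illustrative assumptions, not the paper's exact design.

```python
import numpy as np

def grpo_advantages(rewards):
    """GRPO-style within-group normalization: advantages are the
    group-standardized rewards, so small noisy gaps are divided by a
    small std and thereby amplified."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + 1e-8)

def fixed_advantages(rewards, pos=1.0, neg=-1.0):
    """Fixed advantage values: a bounded +pos / neg signal that does
    not depend on the group's reward variance."""
    r = np.asarray(rewards, dtype=float)
    return np.where(r > 0, pos, neg)

# A group where a single spurious pseudo-label match yields reward 1:
noisy = [0, 0, 0, 1]
print(grpo_advantages(noisy))   # the lone reward gets advantage ≈ 1.73
print(fixed_advantages(noisy))  # bounded at +1 / -1 regardless of variance
```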