📚 Pretraining

🤖 AAAI2026 · 6 paper notes

ELSPR: Evaluator LLM Training Data Self-Purification on Non-Transitive Preferences

ELSPR models pairwise preferences of LLM evaluators as tournament graphs, identifies non-transitive preferences via strongly connected components (SCCs), proposes a normalized directed-graph structural-entropy metric, and filters the problematic training data through graph reconstruction. Filtering reduces non-transitivity by 13.8% and structural entropy by 0.088, while the discarded data reaches only 34.4% human agreement versus 52.6% for the retained data.
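The cycle-detection step can be illustrated with a small sketch. The graph construction and filtering rule below are illustrative assumptions rather than ELSPR's exact procedure, and the structural-entropy metric is omitted; any SCC with more than one node necessarily contains a preference cycle (e.g. A > B, B > C, C > A).

```python
# Hedged sketch: flagging non-transitive evaluator preferences via SCCs.
import networkx as nx

def find_nontransitive_groups(preferences):
    """preferences: iterable of (winner, loser) pairs from an LLM evaluator."""
    g = nx.DiGraph()
    g.add_edges_from(preferences)  # edge u -> v means "u preferred over v"
    # An SCC with more than one node contains at least one preference cycle,
    # i.e. a non-transitive subset of comparisons.
    return [scc for scc in nx.strongly_connected_components(g) if len(scc) > 1]

prefs = [("A", "B"), ("B", "C"), ("C", "A"), ("A", "D"), ("D", "E")]
print(find_nontransitive_groups(prefs))  # one group containing A, B, C
```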

Learning Procedural-aware Video Representations through State-Grounded Hierarchy Unfolding

This paper proposes a Task-Step-State (TSS) three-level semantic framework that introduces "state" as a visual grounding layer within the conventional task-step hierarchy, and designs a progressive pretraining strategy following a U-shaped path (Task→Step→State→Step→Task) to unfold the TSS hierarchy stage by stage. The approach achieves state-of-the-art performance across task recognition, step recognition, and step forecasting on the COIN and CrossTask datasets.
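The U-shaped schedule itself is simple to express. The sketch below only shows the stage ordering; how each stage is trained (data pairing, alignment losses) is delegated to a user-supplied `train_stage` callable, which is an assumption rather than the paper's recipe.

```python
# Hedged sketch of the U-shaped curriculum: Task -> Step -> State -> Step -> Task.
def u_shaped_schedule(levels=("task", "step", "state")):
    down = list(levels)        # unfold: coarse to fine
    up = list(levels[-2::-1])  # fold back: fine to coarse
    return down + up           # ['task', 'step', 'state', 'step', 'task']

def pretrain(train_stage, epochs_per_stage=1):
    for stage in u_shaped_schedule():
        for epoch in range(epochs_per_stage):
            train_stage(stage, epoch)  # e.g. align video features with text at this level

if __name__ == "__main__":
    pretrain(lambda stage, epoch: print(f"stage={stage} epoch={epoch}"))
```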

Learning Time in Static Classifiers

This paper proposes the Support-Exemplar-Query (SEQ) learning framework, which injects temporal reasoning capabilities into standard feed-forward classifiers through loss function design rather than architectural modification. By aligning predicted sequences with class-level temporal prototypes via soft DTW, the method achieves consistent improvements on both fine-grained image classification and video anomaly detection.
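The alignment idea can be sketched with a minimal soft-DTW cost between a predicted sequence and a class prototype. The distance, smoothing parameter, and prototype construction below are assumptions for illustration, not SEQ's actual loss.

```python
# Hedged sketch: soft-DTW cost between a predicted sequence and a class-level
# temporal prototype (numpy, following the standard soft-min recursion).
import numpy as np

def soft_min(values, gamma):
    # Smooth minimum: -gamma * log(sum(exp(-v / gamma))), computed stably.
    values = np.asarray(values) / -gamma
    m = values.max()
    return -gamma * (m + np.log(np.exp(values - m).sum()))

def soft_dtw(pred, proto, gamma=0.1):
    """pred: (n, d) predicted sequence, proto: (m, d) class prototype."""
    n, m = len(pred), len(proto)
    dist = ((pred[:, None, :] - proto[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    R = np.full((n + 1, m + 1), np.inf)
    R[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            R[i, j] = dist[i - 1, j - 1] + soft_min(
                [R[i - 1, j], R[i, j - 1], R[i - 1, j - 1]], gamma
            )
    return R[n, m]

pred = np.random.rand(8, 4)   # e.g. per-frame features or logits
proto = np.random.rand(5, 4)  # temporal prototype for one class
print(soft_dtw(pred, proto))
```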

No-Regret Strategy Solving in Imperfect-Information Games via Pre-Trained Embedding

This paper proposes the Embedding CFR algorithm, which maps information sets in imperfect-information games to a continuous low-dimensional embedding space (rather than discrete clusters), achieving faster exploitability convergence and higher-quality strategy solving under the same space budget.
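A minimal sketch of the two ingredients the summary mentions: representing an information set by a continuous embedding, and converting (predicted) cumulative regrets into a strategy via regret matching. The embedding projection and regret predictor below are random placeholders, not the paper's pretrained components.

```python
# Hedged sketch: embedding-based information-set representation + regret matching.
import numpy as np

def regret_matching(regrets):
    """Standard regret matching: positive regrets normalized to a distribution."""
    positive = np.maximum(regrets, 0.0)
    total = positive.sum()
    if total > 0:
        return positive / total
    return np.full_like(regrets, 1.0 / len(regrets))  # uniform if no positive regret

def embed_infoset(features, projection):
    """Map raw information-set features to a low-dimensional embedding (linear here)."""
    return projection @ features

rng = np.random.default_rng(0)
features = rng.normal(size=64)            # raw information-set features (placeholder)
projection = rng.normal(size=(8, 64))     # pretrained embedding map (placeholder)
regret_weights = rng.normal(size=(3, 8))  # regret predictor over 3 actions (placeholder)

emb = embed_infoset(features, projection)
strategy = regret_matching(regret_weights @ emb)
print(strategy, strategy.sum())           # action probabilities summing to 1
```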

PrefixGPT: Prefix Adder Optimization by a Generative Pre-trained Transformer

PrefixGPT frames prefix adder optimization as a sequence generation problem. A customized GPT model is pretrained to learn design rules, then fine-tuned via RL to generate optimized designs, achieving state-of-the-art area-delay product (ADP) with robustness to initialization.
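For context on the objective, a common simplification scores a prefix-adder design with node count as area and logic depth as delay, so ADP = area × delay. The encoding and reward used by PrefixGPT are not reproduced here; the proxies and the tiny 4-bit design below are illustrative assumptions.

```python
# Hedged sketch: scoring a prefix-adder design with simple area/delay proxies.
def adp(nodes):
    """nodes: dict mapping node id -> (left_parent, right_parent), or None for inputs."""
    depth = {}
    def node_depth(n):
        if n not in depth:
            parents = nodes[n]
            depth[n] = 0 if parents is None else 1 + max(node_depth(p) for p in parents)
        return depth[n]
    area = sum(1 for parents in nodes.values() if parents is not None)  # prefix-node count
    delay = max(node_depth(n) for n in nodes)                           # logic depth
    return area * delay

# Tiny 4-bit example: inputs b0..b3, prefix nodes combining bit spans.
design = {
    "b0": None, "b1": None, "b2": None, "b3": None,
    "p10": ("b1", "b0"), "p21": ("b2", "b1"), "p32": ("b3", "b2"),
    "p20": ("p21", "b0"), "p30": ("p32", "p10"),
}
print(adp(design))  # 5 prefix nodes, depth 2 -> ADP = 10
```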

Uncovering Pretraining Code in LLMs: A Syntax-Aware Attribution Approach

This paper proposes SynPrune, the first syntax-aware membership inference attack (MIA) method for code. By identifying 47 Python syntactic conventions and pruning syntactically determined tokens (retaining only tokens that reflect authorial style) when computing MIA scores, SynPrune improves AUROC by 15.4% on average, enabling effective detection of pretraining code in code LLMs.
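The pruning idea can be approximated with Python's own tokenizer. The sketch below does not reproduce SynPrune's 47 conventions or its exact score: it simply keeps NAME/STRING/NUMBER tokens (a crude stand-in for style-bearing tokens), drops keywords and punctuation, and averages model log-probabilities over what remains; `log_prob_fn` is a placeholder interface to the target model.

```python
# Hedged sketch: prune syntactically determined tokens before computing an MIA score.
import io
import keyword
import tokenize

def style_tokens(source):
    kept = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in (tokenize.NAME, tokenize.STRING, tokenize.NUMBER):
            # Keywords are fixed by the grammar, so they carry no authorial style.
            if not (tok.type == tokenize.NAME and keyword.iskeyword(tok.string)):
                kept.append(tok.string)
    return kept

def pruned_mia_score(source, log_prob_fn):
    """log_prob_fn(token) -> model log-likelihood of that token (placeholder)."""
    tokens = style_tokens(source)
    return sum(log_prob_fn(t) for t in tokens) / max(len(tokens), 1)

code = "def area(radius):\n    return 3.14159 * radius ** 2\n"
print(style_tokens(code))  # ['area', 'radius', '3.14159', 'radius', '2']
```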