Drawing on narratology, this work defines three levels of narrator unreliability (intra-narrational, inter-narrational, and inter-textual), constructs the expert-annotated dataset TUNa, and systematically evaluates how well LLMs classify unreliable narrators.
Problem Definition: First-person narratives (reviews, social media posts, cover letters, etc.) are common in daily life, and judging whether their narrator is reliable is critical for communicating information safely. An unreliable narrator is one who unintentionally misleads readers, as distinct from one who deceives intentionally.
Limitations of Prior Work: No prior work had used automated methods to analyze narrator unreliability, and no annotated datasets were available.
Narratological Foundation: The authors draw on Hansen's (2007) narratological taxonomy, which divides unreliability into three levels ordered from concrete to abstract:
(1) Intra-narrational: The narrator exhibits "verbal tics," such as hedging language, selective memory, admission of bias, and other textual cues.
(2) Inter-narrational: A second voice contradicts the narrator (e.g., refutations in others' dialogue), or the narrator exhibits consistent unreliability over past and present.
(3) Inter-textual: The narrator conforms to classic unreliable character archetypes—the naïf, the madman, the pícaro, or the clown.
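One way to picture the three-level scheme is as a small label structure attached to each narrative. The sketch below is illustrative only: the names `UnreliabilityLevel` and `AnnotatedNarrative`, the one-boolean-per-level encoding, and the example sentence are all assumptions, not the dataset's actual schema or contents.

```python
from dataclasses import dataclass, field
from enum import Enum


class UnreliabilityLevel(Enum):
    """The three levels of the taxonomy, ordered from concrete to abstract."""
    INTRA_NARRATIONAL = "intra-narrational"  # textual cues such as hedging or admitted bias
    INTER_NARRATIONAL = "inter-narrational"  # a second voice contradicts the narrator
    INTER_TEXTUAL = "inter-textual"          # narrator matches an unreliable archetype


@dataclass
class AnnotatedNarrative:
    """Hypothetical record: one narrative plus one flag per level (assumed encoding)."""
    text: str
    domain: str
    labels: dict = field(default_factory=dict)  # maps UnreliabilityLevel -> bool

    def flagged_levels(self):
        """Return the levels at which the narrator was marked unreliable."""
        return [level for level in UnreliabilityLevel if self.labels.get(level, False)]


# Made-up example sentence containing a hedging cue.
sample = AnnotatedNarrative(
    text="I think I locked the door, though honestly I can't remember.",
    domain="blog",
    labels={UnreliabilityLevel.INTRA_NARRATIONAL: True},
)
print([level.value for level in sample.flagged_levels()])
```

Keeping the levels as separate flags lets each level be evaluated on its own, which matches how the findings below report difficulty per level.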
Key Challenge: Unreliability cues are often subtle and context-dependent, scattered throughout the text, and sometimes require deep reasoning regarding the narrator's emotional and psychological state.
TUNa Dataset Construction: TUNa collects first-person narrative texts from four domains: blogs (PersonaBank), Reddit posts (r/AITA), hotel reviews (Deceptive Opinion), and literary works (Project Gutenberg). Each sample is annotated by at least two experts majoring in English literature, with inter-annotator agreement of Cohen's κ = 0.71 to 0.75 (substantial agreement). Disagreements are resolved through discussion.
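For readers unfamiliar with the agreement statistic, Cohen's kappa compares observed agreement p_o with the agreement p_e expected by chance: κ = (p_o − p_e) / (1 − p_e). The following is a minimal standard-library sketch for two annotators. The function name and the toy labels are assumptions for illustration and do not reproduce the authors' annotation data.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labeled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("Both annotators must label the same non-empty set of items.")
    n = len(labels_a)

    # Observed agreement: fraction of items given the same label by both annotators.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n

    # Chance agreement: product of each annotator's marginal label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)

    if expected == 1.0:
        # Both annotators used one identical label throughout; agreement is trivial.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Toy labels (1 = unreliable, 0 = reliable); the annotators disagree on one item.
annotator_1 = [1, 1, 0, 0]
annotator_2 = [1, 0, 0, 0]
print(cohens_kappa(annotator_1, annotator_2))
```

In the toy case p_o = 0.75 and p_e = 0.5, so κ = 0.5.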
Curriculum Learning: Training samples are ranked by "degree of ambiguity": LLMs first count the number of feature instances for each label type, and samples with fewer candidate labels are considered easier. Parameter-efficient fine-tuning with LoRA is then performed on the easy subset first, followed by the difficult subset, so that the model's capability improves gradually.
Transfer from Literature to Reality: Models are trained on samples from the literary domain (Fiction) and tested out-of-domain on real-world domains such as blogs, Reddit, and reviews, to verify whether unreliability knowledge learned from literature transfers.
Training uses the standard classification cross-entropy loss with LoRA adapters (8-bit quantization) for parameter-efficient fine-tuning, and runs for 3 epochs.
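The ambiguity-based ordering can be sketched as follows. This is a hedged illustration, not the authors' implementation: it assumes each sample already carries per-label feature counts produced by an LLM, treats any label with a nonzero count as a "candidate", and uses a hypothetical threshold `max_easy_candidates` to separate the easy and difficult subsets. The label keys and sample records are invented.

```python
def candidate_label_count(feature_counts):
    """Number of labels with at least one detected feature instance."""
    return sum(1 for count in feature_counts.values() if count > 0)


def build_curriculum(samples, max_easy_candidates=1):
    """Rank samples from least to most ambiguous and split them into two stages.

    `max_easy_candidates` is an assumed cut-off: samples with at most this many
    candidate labels go into the easy stage, the rest into the hard stage.
    """
    ranked = sorted(samples, key=lambda s: candidate_label_count(s["feature_counts"]))
    easy = [s for s in ranked if candidate_label_count(s["feature_counts"]) <= max_easy_candidates]
    hard = [s for s in ranked if candidate_label_count(s["feature_counts"]) > max_easy_candidates]
    return easy, hard


# Invented feature counts, standing in for what an LLM might report per label.
samples = [
    {"id": "a", "feature_counts": {"label_1": 2, "label_2": 1, "label_3": 1}},
    {"id": "b", "feature_counts": {"label_1": 3, "label_2": 0, "label_3": 0}},
    {"id": "c", "feature_counts": {"label_1": 0, "label_2": 1, "label_3": 2}},
]

easy, hard = build_curriculum(samples)
# Fine-tuning would then run on `easy` first and on `hard` second.
print([s["id"] for s in easy], [s["id"] for s in hard])
```

Because the ranking depends only on the counts and not on a model's loss, the ordering can be computed once before any training starts.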
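The transfer setup amounts to a split by domain: one domain supplies all training data and every other domain becomes its own held-out test set. A minimal sketch, assuming each sample is a dictionary with a `domain` field (the field name, domain strings, and records are hypothetical):

```python
from collections import defaultdict


def split_by_domain(samples, train_domain="fiction"):
    """Use one domain for training and group the remaining domains as test sets."""
    train = [s for s in samples if s["domain"] == train_domain]
    test_sets = defaultdict(list)
    for s in samples:
        if s["domain"] != train_domain:
            test_sets[s["domain"]].append(s)
    return train, dict(test_sets)


# Invented records; only the domain field matters for the split.
corpus = [
    {"id": 1, "domain": "fiction"},
    {"id": 2, "domain": "blog"},
    {"id": 3, "domain": "reddit"},
    {"id": 4, "domain": "fiction"},
    {"id": 5, "domain": "review"},
]

train, test_sets = split_by_domain(corpus)
print(len(train), sorted(test_sets))
```

Reporting results per test domain, rather than pooled, shows which real-world domains benefit most from the literary training data.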
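For reference, the classification cross-entropy for one example is the negative log of the softmax probability assigned to the correct class. The sketch below computes it from raw logits with the standard library only. It illustrates the loss itself and not the authors' training code; the function names are assumptions, and LoRA and quantization are deliberately left out.

```python
import math


def cross_entropy(logits, target_index):
    """Negative log-softmax probability of the target class for one example."""
    # Subtract the max logit before exponentiating for numerical stability.
    max_logit = max(logits)
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[target_index]


def mean_cross_entropy(batch_logits, batch_targets):
    """Average the per-example loss over a batch."""
    losses = [cross_entropy(l, t) for l, t in zip(batch_logits, batch_targets)]
    return sum(losses) / len(losses)


# With two equal logits the model is maximally uncertain, so the loss is ln(2).
print(cross_entropy([0.0, 0.0], 0))
```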
Increasing Task Difficulty: Intra-narrational is the easiest, while Inter-textual is the hardest (requiring more abstract reasoning).
Effectiveness of Curriculum Learning: Curriculum learning (CL) significantly outperforms standard fine-tuning on smaller models, whereas few-shot prompting of large models (70B) already performs comparably to CL.
Feasibility of Cross-Domain Transfer: Knowledge learned from Fiction can be reasonably transferred to real-world domains such as Blogs, Subreddits, and Reviews.
Gender Bias: Male narrators are correctly predicted at a higher rate than female narrators.
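A disparity of this kind is typically measured by computing accuracy separately for each group. The sketch below shows one way to do that. The field names (`group`, `label`, `prediction`) and the toy records are assumptions and carry no information about the paper's actual numbers.

```python
from collections import defaultdict


def accuracy_by_group(records, group_key="group"):
    """Fraction of correct predictions within each group."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        group = record[group_key]
        total[group] += 1
        correct[group] += int(record["prediction"] == record["label"])
    return {group: correct[group] / total[group] for group in total}


# Toy records with neutral group names; not taken from the dataset.
records = [
    {"group": "group_a", "label": 1, "prediction": 1},
    {"group": "group_a", "label": 0, "prediction": 0},
    {"group": "group_b", "label": 1, "prediction": 0},
    {"group": "group_b", "label": 0, "prediction": 0},
]

print(accuracy_by_group(records))
```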
Impact of Narrative Style: Dialogue-heavy styles aid intra-narrational prediction, while descriptive styles aid inter-narrational and inter-textual prediction.
Combines literary theory (narratology) with NLP to define a new and meaningful task: the automated identification of unreliable narrators.
The TUNa dataset covers multiple domains (literature/blogs/Reddit/reviews) with high-quality expert annotations (approx. 5 minutes of annotation time per sample).
Cleverly designed curriculum learning strategy: defines sample difficulty based on the "number of ambiguous candidate labels" rather than traditional loss magnitude.
Discovers that knowledge of unreliability learned from fictional works can transfer to real-world texts, opening a new direction of "learning real-world understanding from fiction."
The task is related to, but fundamentally different from, misinformation detection and deception detection: it focuses on unintentional unreliability rather than intentional deception.
The work is a natural extension of narrative AI directions such as character understanding and protagonist emotion analysis.
The curriculum learning strategy of "defining difficulty based on ambiguity" can be generalized to other NLP tasks containing ambiguous labels.