Controlling Politeness in Multi-Turn Dialogues Through Pre-Phrase Augmentation

Conference: ACL 2025
Area: Text Generation
Keywords: Politeness Control, Multi-Turn Dialogue, Pre-Phrase Augmentation, Style Control, Dialogue Generation

TL;DR

This paper proposes Pre-Phrase Augmentation, a method that automatically prepends politeness-regulating phrases during dialogue generation. It aims at fine-grained politeness control in multi-turn dialogues while preserving the coherence and informational integrity of the dialogue content.

Background & Motivation

Background: The politeness of dialogue systems is a crucial factor affecting user experience, especially in scenarios such as customer service, medical consultation, and education. Current large language models often lack precise control over the level of politeness in their generated dialogues—either being too blunt and direct or overly courteous, making it difficult to adjust flexibly according to situational needs.

Limitations of Prior Work: Existing style control methods fall into three groups: (1) conditional generation, which steers output with control tokens or conditioning vectors but struggles to keep a consistent politeness level across turns; (2) post-processing rewriting, which generates a response first and then rewrites it into the target style, and is prone to information loss and incoherence; (3) RLHF or fine-tuning, which is effective but expensive and coarse-grained (a choice between "polite" and "impolite" only, with no adjustable degree). The core problem is that in multi-turn scenarios politeness must be adjusted dynamically as the dialogue progresses (e.g., becoming more polite when the user's emotional state changes), and existing methods lack this capability.

Key Challenge: Politeness control must simultaneously satisfy two objectives: (1) accurately conveying the original informational content; (2) conforming to the target politeness level. These two objectives can conflict in certain situations—overly polite expressions might dilute key information, while excessively direct expressions may be perceived as impolite.

Goal: To design a lightweight politeness control method capable of achieving continuously adjustable politeness control in multi-turn dialogues without compromising the quality and information content of the generated responses.

Key Insight: The authors observe that when humans adjust their level of politeness, they typically achieve this by adding prefix phrases (e.g., "Excuse me", "If it's convenient", "Could you please") rather than altering the core content. Inspired by this, the authors propose controlling politeness by learning and inserting appropriate prefix phrases.

Core Idea: Train a prefix phrase generator that produces appropriate polite prefixes based on the target politeness level and dialogue context. This decouples politeness control from content generation, achieving politeness-controllable dialogue generation without affecting the informational content.

Method

Overall Architecture

The system is divided into two decoupled modules: (1) a content generator, responsible for generating information-complete response content based on the dialogue context; (2) a prefix phrase generator, which produces appropriate prefix phrases and tone adjustments based on the target politeness level and the current response content. The final response is a concatenation of the prefix phrase and the content, potentially including minor tonal adjustments to certain expressions within the content.
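The two-module split described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the callable signatures, and the canned stub outputs are all assumptions, and the optional in-content tone adjustment mentioned in the text is left out.

```python
from typing import Callable, List

# Hypothetical interfaces: the source does not specify either module's API.
ContentGenerator = Callable[[List[str]], str]            # history -> content
PrefixGenerator = Callable[[List[str], str, int], str]   # history, content, level -> prefix


def respond(history: List[str], target_level: int,
            generate_content: ContentGenerator,
            generate_prefix: PrefixGenerator) -> str:
    """Compose a reply as prefix + content; the content generator never sees the level."""
    content = generate_content(history)
    prefix = generate_prefix(history, content, target_level)
    return f"{prefix} {content}".strip()


# Stand-in modules for illustration only (not trained models).
def stub_content(history: List[str]) -> str:
    return "Your order ships on Friday."


def stub_prefix(history: List[str], content: str, level: int) -> str:
    canned = {1: "", 3: "Sure.", 5: "Thank you for your patience."}
    return canned.get(level, "")


if __name__ == "__main__":
    for level in (1, 3, 5):
        print(level, respond(["Where is my order?"], level, stub_content, stub_prefix))
```

Because the content string is produced before the politeness level is consulted, changing the level changes only the prefix, which is the decoupling property the architecture relies on.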

Key Designs

Politeness Quantization and Prefix Phrase Database:
- Function: Transforming politeness from a subjective concept into an operational numerical scale.
- Mechanism: Defining 5 politeness levels (from highly direct to highly polite) and obtaining politeness level labels through crowd-sourced annotation of large-scale dialogue data. Simultaneously, a prefix phrase database organized by politeness levels is constructed, containing categories of prefix phrases such as greetings, softeners, apologies, and requests, with each phrase annotated with its corresponding politeness level and applicable scenarios.
- Design Motivation: Operationalizing the vague concept of "politeness" is a prerequisite for control. The five-level grading gives a stepwise control knob, and the prefix database provides training signals for the generator.
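One possible shape for such a level-annotated prefix database is sketched below. The record fields follow the description above (phrase, category, level, applicable scenarios), but the class name, the `lookup` helper, the scenario names, the phrase "Sorry for the trouble," and the level assigned to each phrase are invented for illustration and do not come from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class PrefixEntry:
    phrase: str
    category: str                # e.g. "greeting", "softener", "apology", "request"
    level: int                   # 1 = highly direct ... 5 = highly polite
    scenarios: Tuple[str, ...]   # applicable scenarios


# Illustrative entries; levels and scenarios here are invented for the example.
PREFIX_DB: List[PrefixEntry] = [
    PrefixEntry("Excuse me,", "softener", 3, ("general",)),
    PrefixEntry("Could you please", "request", 4, ("customer_service", "general")),
    PrefixEntry("If it's convenient,", "softener", 5, ("customer_service",)),
    PrefixEntry("Sorry for the trouble,", "apology", 5, ("customer_service",)),
]


def lookup(level: int, category: Optional[str] = None) -> List[str]:
    """Return phrases annotated with the given level (and category, if set)."""
    return [e.phrase for e in PREFIX_DB
            if e.level == level and (category is None or e.category == category)]
```

A lookup like this corresponds to the fixed-template style of selection; in the method described here the database serves as training data for a generator rather than as a runtime lookup table.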
Context-Aware Prefix Generator:
- Function: Generating appropriate prefix phrases based on the dialogue context and the target politeness level.
- Mechanism: A small Transformer model (such as T5-small) takes the dialogue history, the current response content, and a target politeness level token as input, and outputs the prefix phrase sequence. The model is trained on the prefix phrase database to learn which type of prefix fits a given context. At generation time, the politeness level of the prefix is controlled jointly by the temperature and the level token.
- Design Motivation: Prefixes cannot be mechanically selected from a database; they must be dynamically generated based on dialogue content and emotional states. Context-awareness ensures semantic coherence between the prefix and the dialogue content.
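To make the generator's input concrete, here is a minimal sketch of how the three inputs might be serialized into a single text sequence for a text-to-text model. The `<level_k>` token spelling, the `[SEP]` turn separator, and the `history:` / `response:` field markers are assumptions; the source only states that the input combines dialogue history, response content, and a level token.

```python
from typing import List


def build_generator_input(history: List[str], content: str, target_level: int) -> str:
    """Serialize level token + dialogue history + response content into one string."""
    if not 1 <= target_level <= 5:
        raise ValueError("target_level must be between 1 and 5")
    turns = " [SEP] ".join(history)  # hypothetical turn separator
    return f"<level_{target_level}> history: {turns} response: {content}"


if __name__ == "__main__":
    print(build_generator_input(
        ["Hi", "Hello, how can I help?"],
        "Your refund was issued.",
        4,
    ))
```

In a real setup the level tokens would likely need to be added to the tokenizer vocabulary so that each level maps to a single token, but that detail is not described in the source.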
Multi-Turn Consistency Constraint:
- Function: Ensuring a smooth transition of politeness levels across multi-turn dialogues.
- Mechanism: A politeness level tracker records the politeness level of previous turns and computes the change between adjacent turns. In the absence of an external signal (e.g., a change in user emotion), a regularization constraint limits the change between adjacent turns to at most one level. When a change in user emotion is detected (e.g., dissatisfaction or urgency), the target politeness level is raised automatically.
- Design Motivation: Abrupt changes in tone can feel unnatural to users. Consistency constraints simulate the behavior of human customer service representatives maintaining a stable tone throughout a conversation.
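The adjacent-turn rule can be illustrated with a small helper. Note two assumptions: the source describes a regularization constraint, whereas this sketch swaps in a hard clamp applied at inference time, and the size of the escalation on an emotion change is not specified in the source, so "at least one level above the previous turn" is a guess. The function and parameter names are hypothetical.

```python
def next_politeness_level(previous_level: int, requested_level: int,
                          emotion_shift: bool = False, max_step: int = 1,
                          min_level: int = 1, max_level: int = 5) -> int:
    """Pick this turn's level given the previous turn's level and a requested target."""
    requested = max(min_level, min(max_level, requested_level))
    if emotion_shift:
        # Assumed escalation rule: lift the step limit and go at least one level up.
        return min(max_level, max(requested, previous_level + 1))
    # No external signal: move toward the request by at most max_step levels.
    low = max(min_level, previous_level - max_step)
    high = min(max_level, previous_level + max_step)
    return max(low, min(high, requested))


if __name__ == "__main__":
    print(next_politeness_level(3, 5))                      # limited to one step up
    print(next_politeness_level(2, 2, emotion_shift=True))  # escalated on emotion change
```

A soft version of the same idea would add a penalty on the level difference between adjacent turns to the training loss instead of clamping the value.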

Loss & Training

The prefix generator is trained with three objectives: (1) a prefix-content pairing loss \(L_{gen}\), computed on human-annotated paired data; (2) a politeness level matching loss \(L_{polite}\), the gap between the target level and the level that a politeness classifier assigns to the generated prefix; (3) a semantic coherence loss \(L_{coherence}\), based on the semantic similarity between the prefix and the response content. The overall loss is \(L = L_{gen} + \gamma_1 L_{polite} + \gamma_2 L_{coherence}\), where \(\gamma_1\) and \(\gamma_2\) weight the two auxiliary terms.
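The weighted sum can be written out in a few lines. The exact form of each term is not given in the source, so the choices below are assumptions made for illustration: the politeness term is taken as the absolute level difference, the coherence term as one minus cosine similarity between two embedding vectors, and the default weights are placeholders.

```python
import math
from typing import Sequence


def politeness_gap(predicted_level: float, target_level: float) -> float:
    """Assumed form of L_polite: absolute distance between classifier level and target."""
    return abs(predicted_level - target_level)


def coherence_loss(prefix_vec: Sequence[float], content_vec: Sequence[float]) -> float:
    """Assumed form of L_coherence: 1 - cosine similarity of the two embeddings."""
    dot = sum(a * b for a, b in zip(prefix_vec, content_vec))
    norm = math.sqrt(sum(a * a for a in prefix_vec)) * math.sqrt(sum(b * b for b in content_vec))
    if norm == 0.0:
        return 1.0  # treat a zero vector as maximally incoherent
    return 1.0 - dot / norm


def total_loss(l_gen: float, l_polite: float, l_coherence: float,
               gamma1: float = 1.0, gamma2: float = 1.0) -> float:
    """L = L_gen + gamma1 * L_polite + gamma2 * L_coherence."""
    return l_gen + gamma1 * l_polite + gamma2 * l_coherence


if __name__ == "__main__":
    l_pol = politeness_gap(predicted_level=3, target_level=4)
    l_coh = coherence_loss([1.0, 0.0], [1.0, 0.0])
    print(total_loss(2.0, l_pol, l_coh, gamma1=0.5, gamma2=0.2))
```

In actual training the classifier-based term would have to be made differentiable (or optimized with a surrogate), which the source does not detail.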

Key Experimental Results

Main Results

Dataset	Metric	Ours	Control Token	Style Transfer	Direct Prompting
DailyDialog	Politeness Match Rate↑	87.3%	72.1%	78.5%	69.4%
DailyDialog	BLEU↑	18.6	17.2	14.8	18.1
ConvAI2	Politeness Match Rate↑	85.1%	68.9%	75.3%	66.7%
ConvAI2	Information Retention Rate↑	94.2%	91.5%	82.7%	93.8%
Customer Service Dialogue	User Satisfaction↑	4.21/5	3.72/5	3.85/5	3.64/5

Ablation Study

Configuration	Politeness Match Rate	BLEU	Information Retention Rate	Description
Full model	87.3%	18.6	94.2%	Full method
w/o Context-Awareness	79.6%	17.4	92.1%	Random prefix selection; match rate drops 7.7 points
w/o Multi-Turn Consistency	83.1%	18.4	93.8%	Without consistency constraints
w/o Prefix Decoupling	80.4%	16.3	87.5%	End-to-end style control
Fixed Prefix Templates	76.8%	18.3	93.9%	Using fixed templates instead of generation

Key Findings

The prefix decoupling design contributes most to information integrity: removing it lowers the information retention rate from 94.2% to 87.5% and BLEU from 18.6 to 16.3, which suggests that decoupling politeness control from content generation helps preserve information.
Context-awareness is what separates the method from simple templates: the full model beats fixed prefix templates by 10.5 percentage points in politeness match rate (87.3% vs. 76.8%) and random prefix selection by 7.7 points (87.3% vs. 79.6%).
In real customer service scenarios, dynamic politeness adjustment achieved the highest user satisfaction (4.21/5).
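The differences quoted in these findings can be recomputed directly from the ablation table. The short script below only restates the table's numbers; the dictionary layout and the `drop` helper are conveniences introduced for the check.

```python
# Values copied from the ablation table (match rate %, BLEU, retention %).
FULL = {"match": 87.3, "bleu": 18.6, "retention": 94.2}
ABLATIONS = {
    "w/o Context-Awareness":       {"match": 79.6, "bleu": 17.4, "retention": 92.1},
    "w/o Multi-Turn Consistency":  {"match": 83.1, "bleu": 18.4, "retention": 93.8},
    "w/o Prefix Decoupling":       {"match": 80.4, "bleu": 16.3, "retention": 87.5},
    "Fixed Prefix Templates":      {"match": 76.8, "bleu": 18.3, "retention": 93.9},
}


def drop(metric: str, config: str) -> float:
    """Full-model value minus ablated value, rounded to one decimal place."""
    return round(FULL[metric] - ABLATIONS[config][metric], 1)


if __name__ == "__main__":
    for name in ABLATIONS:
        print(f"{name}: match -{drop('match', name)}, "
              f"BLEU -{drop('bleu', name)}, retention -{drop('retention', name)}")
```

Run as written, this shows that fixed templates cost the most in match rate (10.5 points), while removing prefix decoupling costs the most in information retention (6.7 points) and BLEU (2.3).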

Highlights & Insights

The idea of pre-phrase augmentation draws on human linguistic politeness strategies, which makes it a pragmatically grounded design that is more natural than end-to-end style transfer.
The decoupled design is a key innovation: it separates "how to say it" (polite prefixes) from "what to say" (content), so that style control does not interfere with the information conveyed.
The method can be generalized to other dialogue style control tasks, such as formality control and emotional expression intensity adjustment.

Limitations & Future Work

The pre-phrase method is well suited to prepending politeness markers but has limited capability in scenarios that require whole-sentence style rewriting.
Perception of politeness varies across cultural backgrounds, and the current grading system may lack cross-cultural universality.
The granularity of 5 levels might still be coarse in certain scenarios, making finer-grained continuous control worth exploring.
The impact of non-textual factors (such as emojis and punctuation usage) on politeness perception has not been considered yet.

Related Work & Insights

vs CTRL (Keskar et al.): CTRL uses control codes for conditional generation, while ours uses prefix augmentation; the advantage lies in not affecting the generation of core content.
vs Politeness Transfer (Madaan et al.): Style transfer methods rewrite the entire sentence, causing significant information loss; ours only adds a prefix, giving an information retention rate 11.5 percentage points higher (94.2% vs. 82.7% for the Style Transfer baseline on ConvAI2).
vs RL-based style control: RL-based methods are computationally expensive and unstable; our lightweight approach only requires training a small prefix generator.
vs Prompt-based style control: LLM prompting is flexible but fails to precisely control continuous changes in politeness levels, whereas our graded design offers greater controllability.

Rating

Novelty: ⭐⭐⭐⭐ The prefix augmentation approach is simple and aligns with linguistic intuition.
Experimental Thoroughness: ⭐⭐⭐⭐ Both automated metrics and human evaluation are conducted, including real-world scenario validation.
Writing Quality: ⭐⭐⭐⭐ The motivation is clear, and the method description is easy to understand.
Value: ⭐⭐⭐⭐ Holds practical value for style control in dialogue systems.