Multi-perspective Alignment for Increasing Naturalness in Neural Machine Translation¶

Conference: ACL 2025
arXiv: 2412.08473
Code: GitHub
Area: Multilingual Translation
Keywords: Machine Translation, Naturalness, Translationese Elimination, Reinforcement Learning, Multi-perspective Alignment

TL;DR¶

A multi-perspective alignment framework is proposed that rewards translation naturalness and content preservation simultaneously. The NMT model is fine-tuned with reinforcement learning, using joint reward signals from a translationese classifier and COMET, so that it generates lexically richer translations without sacrificing translation accuracy.

Background & Motivation¶

Background: Although the translation quality of NMT systems has improved significantly, their output still shows pronounced "machine translationese" relative to human translations, such as reduced vocabulary diversity and increased source-language interference. These characteristics inflate scores on evaluation datasets and undermine the reading experience in literary translation.

Limitations of Prior Work:

- Tagging methods: Using tags to distinguish original from translated training data is effective but rigid; the level of naturalness cannot be adjusted.
- Reranking methods: Reranking translation candidates increases diversity but causes a significant drop in translation accuracy.
- APE methods: Automatic Post-Editing makes MT output more natural, but at the cost of translation quality.
- Core difficulty: Improving naturalness (vocabulary diversity) inherently conflicts with maintaining content adequacy.

Key Challenge: Translated texts must simultaneously match the style of original texts in the target language (naturalness) while preserving the content of the source language (adequacy), two objectives that are often in conflict.

Goal: To design a flexible optimization method for translation models that enhances output naturalness without sacrificing translation accuracy.

Key Insight: Drawing inspiration from RLHF frameworks and text style transfer paradigms, naturalness is equated with style and adequacy with content, allowing the definition of reward functions from multiple perspectives for reinforcement learning alignment.

Core Idea: Use a translationese classifier as the naturalness reward and COMET as the content reward, combine the two via their harmonic mean, and optimize the MT model with policy gradient.

Method¶

Overall Architecture¶

Train the base MT model using supervised learning.
Train binary translationese detectors (from three perspectives).
Reinforcement learning stage: fine-tune the MT model using the joint naturalness and content reward.

Key Designs¶

1. Base MT Model¶

Based on the BART architecture (6-layer Encoder + 6-layer Decoder), the model minimizes negative log-likelihood on an English-Dutch literary parallel corpus (4.87 million sentence pairs from 495 books):

\[\mathcal{L}_{nl} = -\frac{1}{m}\sum_{i=1}^{m}\log(p(y_i|y_{0:i-1}, x; \theta))\]
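As a small worked illustration of the formula above, the following sketch averages the negative log-probabilities of the target tokens. The function name `sequence_nll` and the example probabilities are hypothetical and are not taken from the paper.

```python
import math


def sequence_nll(token_probs):
    """Mean negative log-likelihood over the m target tokens.

    token_probs[i] stands for p(y_i | y_{0:i-1}, x; theta), i.e. the
    probability the model assigns to the i-th reference token.
    """
    m = len(token_probs)
    return -sum(math.log(p) for p in token_probs) / m


# Hypothetical per-token probabilities for a 3-token reference translation.
loss = sequence_nll([0.9, 0.5, 0.8])
```

A confident model (probabilities near 1) yields a loss near 0, and any low-probability reference token raises the average.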

2. Translationese Classifiers (Three Perspectives)¶

Fine-tune three binary classifiers using BERTje (Dutch BERT):

- HT-OR: Distinguishing human translation (HT) vs. original text (OR) \(\rightarrow\) aiming to eliminate human translationese.
- MT-HT: Distinguishing machine translation (MT) vs. human translation (HT) \(\rightarrow\) aiming to bring MT closer to HT.
- MT-OR: Distinguishing machine translation (MT) vs. original text (OR) \(\rightarrow\) aiming to bring MT closer to OR.

Training data: OR and HT are obtained from a corpus of 286 Dutch books (approximately 980k sentences each), and MT data is generated by translating with the base model.

Classifier accuracy ranks MT-OR > MT-HT > HT-OR: separating MT from OR is the easiest task, and separating HT from OR is the hardest.
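The three perspectives differ only in which two text types are paired. A minimal sketch of how the labelled training sets could be assembled, assuming label 1 marks the class the MT output should move toward (OR or HT) and label 0 the other class; the function name and the labelling convention are illustrative assumptions, not the paper's code:

```python
def build_perspective_datasets(or_sents, ht_sents, mt_sents):
    """Return {perspective: [(sentence, label), ...]}.

    Label 1 is the assumed target class for that perspective,
    label 0 is the class to move away from.
    """
    def labelled(negatives, positives):
        return [(s, 0) for s in negatives] + [(s, 1) for s in positives]

    return {
        "HT-OR": labelled(ht_sents, or_sents),  # target: original text
        "MT-HT": labelled(mt_sents, ht_sents),  # target: human translation
        "MT-OR": labelled(mt_sents, or_sents),  # target: original text
    }


# Toy example with one sentence per text type.
datasets = build_perspective_datasets(["or-1"], ["ht-1"], ["mt-1"])
```

Each resulting list would then be used to fine-tune one binary classifier.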

3. Multi-perspective Alignment Framework¶

Naturalness Reward (output of the translationese classifier):

\[r_t(\hat{y}) = \begin{cases} 0 & \text{if } p(t_1|\hat{y}; \phi) < \sigma_t \\ p(t_1|\hat{y}; \phi) & \text{otherwise} \end{cases}\]

where \(\sigma_t = 0.5\), and \(t_1\) denotes the target class (e.g., OR or HT).

Content Reward (COMET score):

\[r_c(\hat{y}) = \begin{cases} 0 & \text{if } C(x, y, \hat{y}) < \sigma_c \\ C(x, y, \hat{y}) & \text{otherwise} \end{cases}\]

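where \(\sigma_c = 0.85\), and \(C(x, y, \hat{y})\) is the COMET score of the hypothesis \(\hat{y}\) given the source \(x\) and the reference \(y\).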

Comprehensive Reward (harmonic mean):

\[r(\hat{y}) = \begin{cases} 0 & \text{if } r_t = 0 \text{ or } r_c = 0 \\ \frac{2}{1/r_t + 1/r_c} & \text{otherwise} \end{cases}\]
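The three reward definitions above can be sketched in a few lines. This assumes the classifier probability and the COMET score are already available as floats in [0, 1]; the function names are hypothetical, and only the thresholds \(\sigma_t = 0.5\) and \(\sigma_c = 0.85\) come from the text.

```python
def thresholded(score, threshold):
    """Keep the score if it reaches the threshold, otherwise return 0."""
    return score if score >= threshold else 0.0


def combined_reward(p_target, comet, sigma_t=0.5, sigma_c=0.85):
    """Harmonic mean of the naturalness and content rewards.

    p_target: classifier probability of the target class for the output.
    comet:    COMET score of the output against source and reference.
    """
    r_t = thresholded(p_target, sigma_t)
    r_c = thresholded(comet, sigma_c)
    if r_t == 0.0 or r_c == 0.0:
        return 0.0
    return 2.0 / (1.0 / r_t + 1.0 / r_c)


# A natural-sounding but inadequate output earns no reward at all.
print(combined_reward(p_target=0.95, comet=0.80))  # 0.0
```

Because the harmonic mean is pulled toward the smaller value, an output must score well from both perspectives to earn a high reward.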

Loss & Training¶

The final objective function combines the supervised loss and the reward loss:

\[\mathcal{L}(\theta; \mathcal{D}) = \mathbb{E}_{(x,y)\sim\mathcal{D}}[\beta \mathcal{L}_{nl} + \mathcal{L}_{rw}]\]

where \(\beta = 0.5\), and \(\mathcal{L}_{rw} = -\frac{1}{m}\sum_{i=1}^{m} r(\hat{y}) \log(p(\hat{y}_i|\hat{y}_{0:i-1}, x; \theta))\)

Key design: The NLL term keeps the model from drifting too far from the base MT model, replacing the computationally expensive KL-divergence penalty against a reference model used in traditional RLHF.
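A minimal numeric sketch of the joint objective for a single training example, assuming the per-token probabilities of the reference and of the model's own output are given as plain lists. All names here are hypothetical; a real implementation would operate on batched tensors with gradients.

```python
import math


def mean_neg_log(token_probs, weight=1.0):
    """Weighted mean negative log-probability of a token sequence."""
    m = len(token_probs)
    return -weight * sum(math.log(p) for p in token_probs) / m


def joint_loss(ref_token_probs, sampled_token_probs, reward, beta=0.5):
    """beta * L_nl + L_rw for one (x, y) pair.

    ref_token_probs:     model probabilities of the reference tokens y_i.
    sampled_token_probs: model probabilities of its own output tokens.
    reward:              scalar r(y_hat) from the combined reward.
    """
    l_nl = mean_neg_log(ref_token_probs)
    l_rw = mean_neg_log(sampled_token_probs, weight=reward)
    return beta * l_nl + l_rw


# With zero reward only the supervised term remains.
print(joint_loss([0.5, 0.5], [0.25, 0.25], reward=0.0))
```

Setting `beta=0.0` in this sketch corresponds to the ablation without the NLL constraint, where only the reward-weighted term drives the update.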

Inference uses beam search (size=5), and the aligned model is trained for 5k steps.

Key Experimental Results¶

Main Results¶

System	BLEU	MetricX ↓	KIWI	MTLD	MT-HT Acc.
Human Translation	—	—	—	96.0	69.3
Base MT Model	32.5	2.66	80.4	90.4	18.9
Tailored RR (Top-k)	21.2	4.86	72.4	104.3	52.9
APE	29.9	3.38	77.9	91.7	33.6
Tagging (4.8M)	31.1	3.05	79.7	96.8	43.2
BM + COMET & MT-HT	32.1	2.63	80.6	93.3	33.4

The best model (COMET & MT-HT) improves MTLD from 90.4 to 93.3, closer to the human translation's 96.0. MetricX decreases from 2.66 to 2.63 (lower is better) and KIWI increases from 80.4 to 80.6; neither of these two metrics was used as a reward during training.
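MTLD (Measure of Textual Lexical Diversity) is the lexical-diversity metric reported in the table. The following is a simplified sketch of how such a score is commonly computed, assuming pre-tokenized input and the conventional type-token-ratio threshold of 0.72. The function names are hypothetical, and the paper's exact tokenization and implementation may differ.

```python
def _mtld_one_direction(tokens, threshold=0.72):
    """Average number of tokens needed for the TTR to fall to the threshold."""
    factors = 0.0
    types = set()
    count = 0
    for tok in tokens:
        types.add(tok)
        count += 1
        if len(types) / count <= threshold:
            # One full "factor" completed: reset the running window.
            factors += 1.0
            types = set()
            count = 0
    if count > 0:
        # Partial credit for the unfinished final window.
        ttr = len(types) / count
        factors += (1.0 - ttr) / (1.0 - threshold)
    return len(tokens) / factors if factors > 0 else float("inf")


def mtld(tokens, threshold=0.72):
    """Mean of the forward and backward passes over the token list."""
    forward = _mtld_one_direction(tokens, threshold)
    backward = _mtld_one_direction(tokens[::-1], threshold)
    return (forward + backward) / 2.0


print(mtld(["a", "a", "a", "a"]))  # 2.0
```

Higher values mean the text sustains a high type-token ratio over longer stretches, i.e. it repeats words less often.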

Ablation Study — Reward Components¶

Setting	MetricX ↓	MTLD	MT-HT Acc.
BM (no reward)	2.66	90.4	18.9
BM + COMET only	2.64	90.9	19.1
BM + MT-HT only	2.67	91.2	24.7
BM + COMET & MT-HT	2.63	93.3	33.4

Ablation Study — β=0 (no NLL constraint)¶

Setting	BLEU	MetricX ↓	MT-HT Acc.	MTLD
BM + COMET & HT-OR	21.8	3.59	48.4	88.0
BM + COMET & MT-HT	24.1	3.06	52.2	92.4
BM + COMET & MT-OR	24.4	3.19	59.2	93.1

Removing the NLL constraint yields higher classification accuracy (MT-OR goes from 49.5 to 59.2), but translation accuracy drops significantly: for COMET & MT-HT, BLEU falls from 32.1 to 24.1 and MetricX rises from 2.63 to 3.06.

Key Findings¶

MT-HT reward is the most robust: The reason is that the target side of the MT training data is HT, so the preference of the MT-HT classifier aligns better with the training data distribution.
HT-OR and MT-OR perform worse: Possibly because the classifier favors OR but the target side of the training data is HT, leading to a preference mismatch.
Tailored RR achieves the highest naturalness but the worst accuracy: Scoring 104.3 on MTLD (exceeding human translation's 96.0), but yielding a BLEU of only 21.2.
NLL constraint is crucial: Without it, the model deviates too far from the base MT, causing severe degradation in translation quality.
Book-level analysis: The aligned model outperforms the base MT in MTLD across all 31 test books, approaching or even exceeding human translation in some books.

Highlights & Insights¶

Translation problem reframed through text style transfer: Treating MT naturalness optimization as analogous to the content-form trade-off in style transfer provides a clear conceptual framework.
A lightweight alternative to RLHF: Replacing the complex scheme of PPO and reference models with a weighted combination of NLL and rewards lowers computational costs.
Multi-perspective design offers diagnostic value: The three classifiers correspond to three viewpoints of evaluating naturalness, revealing the multidimensional meaning of "naturalness".
Adjustable: The hyperparameter \(\beta\) lets users trade off naturalness against adequacy.

Limitations & Future Work¶

Experiments are restricted to one language pair and domain, English \(\rightarrow\) Dutch literary translation; generalizability has not yet been verified.
The MT model is a BART trained from scratch; settings with pre-trained LLMs are not covered.
Evaluation of naturalness depends mainly on vocabulary diversity metrics, leaving other aspects of writing style (e.g., grammatical complexity, rhetoric) unaddressed.
Large-scale human evaluation is lacking; only a surface-level qualitative analysis is provided.
The inference stage requires extra post-processing to handle repeated punctuation.

Related Work¶

Freitag et al. (2022) Tagging method: Using <orig>/<tran> tags \(\rightarrow\) a rigid method that cannot be adjusted.
Ploeger et al. (2024) Tailored RR: Reranking improves diversity but sacrifices quality \(\rightarrow\) the multi-perspective reward in this paper better balances both.
Ramos et al. (2024): Using COMET as a single metric for MT reward \(\rightarrow\) this work extends it to multi-dimensional rewards.
Lai et al. (2021a,b): Content-form trade-offs in text style transfer \(\rightarrow\) directly inspired the framework design of this paper.

Rating¶

Novelty: ⭐⭐⭐⭐ — The multi-perspective reward framework is novel, formalizing naturalness optimization as a multi-objective alignment problem.
Experimental Thoroughness: ⭐⭐⭐⭐ — Comprehensive baseline comparisons, complete ablation studies, and detailed book-level analysis, though limited to a single language pair.
Writing Quality: ⭐⭐⭐⭐ — Clear framework description, standardized algorithm pseudocode, and in-depth analysis.
Value: ⭐⭐⭐⭐ — Provides a practical solution for improving MT naturalness, with direct applications in literary translation and the construction of evaluation datasets.