Pareto-Conditioned Diffusion Models for Offline Multi-Objective Optimization¶

Conference: ICLR 2026
arXiv: 2602.00737
Code: GitHub
Area: Image Generation
Keywords: Offline multi-objective optimization, conditional diffusion models, Pareto front, surrogate-free, reference directions

TL;DR¶

This paper proposes Pareto-Conditioned Diffusion (PCD), which reformulates offline multi-objective optimization as a conditional sampling problem. PCD generates solutions directly, conditioned on target objective trade-offs, without an explicit surrogate model. It achieves the best overall average rank (4.80) across five benchmark categories.

Background & Motivation¶

Offline MOO challenge: Only a static dataset is available; the true objective functions cannot be queried.
Reliance on surrogate models: Existing methods approximate objective functions with DNNs or GPs, then perform MOEA search, creating a surrogate accuracy bottleneck.
Generative model approaches (e.g., ParetoFlow) still rely on surrogate predictors for guidance, inheriting the risk of surrogate inaccuracy.
Core idea: Directly model MOO as a conditional generation task \(p(\boldsymbol{x} | \boldsymbol{y}; \sigma)\).

Method¶

Overall Architecture¶

PCD unifies solution generation and Pareto front modeling: a conditional diffusion model is trained and then used to sample new solutions conditioned on target objective vectors.

1. Multi-Objective Reweighting Strategy¶

Bin-based weighting via dominance number:

\[w_i = \frac{|B_i|}{|B_i| + K} \exp\left(\frac{-\frac{1}{|B_i|}\sum_{j=1}^{|B_i|} o(\boldsymbol{x}_{b_j})}{\tau}\right)\]

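where \(|B_i|\) is the number of data points in bin \(i\), \(\boldsymbol{x}_{b_j}\) is the \(j\)-th point in that bin, \(K\) and \(\tau\) are hyperparameters (a smoothing constant and a temperature, respectively), and \(o(\boldsymbol{x}) = \sum_{\boldsymbol{x}' \in \mathcal{D}} \mathbb{I}[\boldsymbol{f}(\boldsymbol{x}) \prec \boldsymbol{f}(\boldsymbol{x}')]\) denotes the dominance number. Because the bin-averaged dominance number enters the exponent with a negative sign, bins whose points have lower dominance numbers receive larger weights.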

Two desiderata:

1. Bins containing more data points receive higher weights (greater reliability).
2. Bins with better average performance receive higher weights (greater importance).
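
The weighting rule can be illustrated with a minimal sketch. It rests on several assumptions not stated in this summary: all objectives are minimized, the dominance number of a point is read as the number of dataset points that dominate it (so 0 means non-dominated, which matches the negative sign in the exponent), and the bin assignment is supplied externally through the hypothetical `bin_ids` argument. The function names and the default values of `K` and `tau` are illustrative only.

```python
import math

def dominates(a, b):
    """True if objective vector a Pareto-dominates b (minimization assumed)."""
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))

def dominance_numbers(F):
    """For each point, count the dataset points that dominate it."""
    return [sum(dominates(g, f) for g in F) for f in F]

def bin_weights(F, bin_ids, K=1.0, tau=1.0):
    """Per-bin weight: |B| / (|B| + K) * exp(-mean dominance number / tau)."""
    o = dominance_numbers(F)
    bins = {}
    for idx, b in enumerate(bin_ids):
        bins.setdefault(b, []).append(o[idx])
    return {
        b: len(v) / (len(v) + K) * math.exp(-(sum(v) / len(v)) / tau)
        for b, v in bins.items()
    }

# Toy data: four 2-objective points split into two hypothetical bins.
F = [(1, 1), (2, 2), (3, 3), (1, 3)]
weights = bin_weights(F, bin_ids=[0, 0, 1, 1])
```

In this toy case both bins hold two points, so the size factor is identical and the bin with the lower average dominance number ends up with the larger weight.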

2. Reference Direction-Conditioned Point Generation¶

A three-step procedure inspired by NSGA-III:

1. Direction vector generation: Generate \(L\) direction vectors \(\boldsymbol{w}_i\) using the Riesz s-Energy method.
2. Point–direction pairing: Iteratively assign data points to the nearest direction vector based on non-dominated sorting.
3. Extrapolation + Gaussian perturbation: Extrapolate representative points along their assigned directions and add zero-mean Gaussian noise to increase diversity.
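
A simplified sketch of steps 2 and 3 follows. It is not the paper's procedure: the direction vectors are passed in by hand rather than produced by the Riesz s-Energy method, the pairing uses plain perpendicular distance without non-dominated sorting, and the representative of each direction is assumed to be the member closest to the origin. It also assumes minimized objectives normalized so that the ideal point sits at the origin, which means extrapolation moves the representative toward the origin. The names `condition_points`, `step`, and `noise_std` are hypothetical.

```python
import math
import random

def perpendicular_distance(y, w):
    """Distance from point y to the line through the origin along w."""
    w_sq = sum(c * c for c in w)
    t = sum(a * b for a, b in zip(y, w)) / w_sq
    return math.sqrt(sum((a - t * b) ** 2 for a, b in zip(y, w)))

def assign_to_directions(Y, W):
    """Index of the nearest direction for every objective vector in Y."""
    return [min(range(len(W)), key=lambda k: perpendicular_distance(y, W[k]))
            for y in Y]

def condition_points(Y, W, step=0.1, noise_std=0.01, seed=0):
    """One perturbed, extrapolated target per direction that has members."""
    rng = random.Random(seed)
    assignment = assign_to_directions(Y, W)
    targets = []
    for k, w in enumerate(W):
        members = [y for y, a in zip(Y, assignment) if a == k]
        if not members:
            continue  # no data lies near this direction
        # Assumed representative: the member closest to the ideal point.
        rep = min(members, key=lambda y: sum(c * c for c in y))
        w_norm = math.sqrt(sum(c * c for c in w))
        # Step toward the origin along w, then add zero-mean Gaussian noise.
        targets.append([c - step * wc / w_norm + rng.gauss(0.0, noise_std)
                        for c, wc in zip(rep, w)])
    return targets

Y = [(0.9, 0.1), (0.1, 0.9), (0.5, 0.5), (0.8, 0.2)]
W = [(1.0, 0.0), (0.5, 0.5), (0.0, 1.0)]
targets = condition_points(Y, W)
```

The resulting targets would play the role of the conditioning vectors \(\hat{\boldsymbol{y}}\) at sampling time.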

3. Classifier-Free Guidance Sampling¶

The modified ODE is:

\[d\boldsymbol{x}/d\sigma = -(\gamma D_\theta(\boldsymbol{x}; \hat{\boldsymbol{y}}, \sigma) + (1-\gamma) D_\theta(\boldsymbol{x}; \sigma) - \boldsymbol{x})/\sigma\]

Setting \(\gamma > 1\) amplifies the influence of the conditioning objective, steering samples toward regions consistent with \(\hat{\boldsymbol{y}}\).
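
To make the guided ODE concrete, the sketch below combines a conditional and an unconditional denoiser output with weight \(\gamma\) and takes explicit Euler steps in \(\sigma\). The Euler integrator, the noise schedule, and the constant toy denoisers are assumptions for illustration; the summary does not say which solver or schedule PCD uses.

```python
def guided_denoiser(d_cond, d_uncond, gamma):
    """gamma * D(x; y, sigma) + (1 - gamma) * D(x; sigma), elementwise."""
    return [gamma * c + (1.0 - gamma) * u for c, u in zip(d_cond, d_uncond)]

def euler_step(x, sigma, sigma_next, d_cond, d_uncond, gamma):
    """One explicit Euler step of dx/dsigma = -(D_guided - x) / sigma."""
    d = guided_denoiser(d_cond, d_uncond, gamma)
    slope = [-(di - xi) / sigma for di, xi in zip(d, x)]
    return [xi + (sigma_next - sigma) * s for xi, s in zip(x, slope)]

def sample(x, sigmas, denoise_cond, denoise_uncond, gamma):
    """Integrate from sigmas[0] down to sigmas[-1] with Euler steps."""
    for sigma, sigma_next in zip(sigmas[:-1], sigmas[1:]):
        x = euler_step(x, sigma, sigma_next,
                       denoise_cond(x, sigma), denoise_uncond(x, sigma), gamma)
    return x

# Toy stand-ins for a trained network: constant denoiser outputs.
toy_cond = lambda x, sigma: [1.0, 1.0]
toy_uncond = lambda x, sigma: [0.0, 0.0]
x_final = sample([2.0, -1.0], [1.0, 0.5, 0.0], toy_cond, toy_uncond, gamma=2.0)
```

With \(\gamma = 1\) the combination reduces to the conditional denoiser alone; larger values extrapolate away from the unconditional prediction.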

Loss & Training¶

Reweighted conditional denoising \(L_2\) loss:

\[\theta^* = \arg\min_\theta \mathbb{E} [w(\boldsymbol{y}) \lambda(\sigma) \|D_\theta(\boldsymbol{x} + \boldsymbol{n}; \boldsymbol{y}, \sigma) - \boldsymbol{x}\|_2^2]\]
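
A single-noise-level Monte Carlo estimate of this objective might look like the sketch below. It assumes the noise \(\boldsymbol{n}\) is isotropic Gaussian with standard deviation \(\sigma\), that each training example arrives with its precomputed weight \(w(\boldsymbol{y})\), and that \(\lambda\) is passed in as a plain function; `denoiser` is a hypothetical callable standing in for \(D_\theta\).

```python
import random

def reweighted_denoising_loss(denoiser, batch, sigma, lam, rng):
    """Mean of w * lam(sigma) * ||D(x + n; y, sigma) - x||^2 over a batch.

    batch: list of (x, y, w) tuples with x and y as lists of floats.
    """
    total = 0.0
    for x, y, w in batch:
        noisy = [xi + rng.gauss(0.0, sigma) for xi in x]
        pred = denoiser(noisy, y, sigma)
        sq_err = sum((p - xi) ** 2 for p, xi in zip(pred, x))
        total += w * lam(sigma) * sq_err
    return total / len(batch)

# Toy check: a denoiser that ignores the condition and returns its input.
identity = lambda noisy, y, sigma: noisy
batch = [([1.0, 1.0], [1.0], 0.5), ([2.0, 2.0], [2.0], 2.0)]
loss = reweighted_denoising_loss(identity, batch, 0.5, lambda s: 1.0,
                                 random.Random(0))
```

In a real training loop \(\sigma\) would also be sampled per example and the loss differentiated with respect to the network parameters; both are omitted here.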

Key Experimental Results¶

Average Rank Across Tasks (100th Percentile HV, ↓ Lower is Better)¶

Method	Synthetic	MORL	RE	Scientific	MONAS	Overall
\(\mathcal{D}\)(best)	5.45	1.70	2.60	9.35	11.53	7.43
ParetoFlow	2.44	8.50	1.74	9.05	11.19	6.74
MM + IOM	5.16	12.70	5.76	4.40	5.77	5.80
E2E	6.16	9.70	6.06	4.20	5.13	5.71
PCD	3.38	5.50	1.51	4.05	7.54	4.80
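
For readers unfamiliar with the metric, the sketch below shows one plausible way to compute an average rank from per-task hypervolume (HV) scores: rank the methods on each task by HV, with rank 1 going to the highest HV, then average each method's ranks over tasks. The method names and HV values are made-up toy numbers, ties are not handled, and the paper's exact ranking protocol may differ.

```python
def average_ranks(hv_by_task):
    """hv_by_task: {task: {method: hv}}; rank 1 = highest HV on a task."""
    ranks = {}
    for scores in hv_by_task.values():
        ordered = sorted(scores, key=scores.get, reverse=True)
        for rank, method in enumerate(ordered, start=1):
            ranks.setdefault(method, []).append(rank)
    return {m: sum(r) / len(r) for m, r in ranks.items()}

# Hypothetical scores for three methods on two tasks.
toy_hv = {
    "task1": {"A": 3.0, "B": 2.0, "C": 1.0},
    "task2": {"A": 1.0, "B": 3.0, "C": 2.0},
}
avg = average_ranks(toy_hv)
```

A lower average rank means a method places near the top more consistently, even if it is not the single best on any one task.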

Ablation Study: Component Contributions (HV, ↑ Higher is Better)¶

Variant	ZDT2	MO-Swimmer	RE34	Regex	C10/MOP2
Ideal + N/A	7.59	1.76	9.19	5.60	10.46
Ref.Dir. + N/A	7.89	3.53	10.11	5.55	10.47
Ref.Dir. + Pruning	5.64	3.63	10.16	4.20	10.55
PCD (full)	6.25	3.69	10.17	4.80	10.59

Key Findings¶

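PCD achieves the best overall average rank (4.80) using a single fixed set of hyperparameters. It ranks first on the RE (1.51) and Scientific (4.05) categories, but not on Synthetic (ParetoFlow, 2.44), MORL (\(\mathcal{D}\)(best), 1.70), or MONAS (E2E, 5.13).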
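The reference direction mechanism roughly doubles HV on MO-Swimmer relative to conditioning on the ideal point (1.76→3.53).
The reweighting strategy outperforms simple pruning (the approach of Xue et al., 2024) on all five ablation tasks, although the margin is often small (e.g., 10.17 vs. 10.16 on RE34).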
The gain from guidance scale \(\gamma\) is limited (near saturation at 2.5), as reweighting already biases the data distribution.

Highlights & Insights¶

End-to-end framework: Collapses the multi-stage pipeline (surrogate + search) into a single conditional generative model.
Cross-task consistency: This is PCD's most notable advantage—robust performance across continuous, discrete, and categorical tasks.
NSGA-III-inspired conditioned point generation: Elegantly combines the reference direction idea from evolutionary algorithms with conditional generation in diffusion models.

Limitations & Future Work¶

Performance on MORL tasks (~10,000-dimensional parameter spaces) is constrained by the MLP denoiser, which operates directly in parameter space.
Purely categorical search spaces in MONAS pose a challenge for continuous diffusion models.
Combinatorial optimization tasks (e.g., TSP) are not addressed.
Reweighting may be detrimental on datasets whose data quality is already high.

Related Work¶

Surrogate-based methods: COMs, ICT, IOM, Tri-Mentoring
Generative model approaches: ParetoFlow, LaMBO, MOGFNs
Conditional generative models: DDOM, MINs, Reward-Directed Diffusion

Rating¶

Novelty: ⭐⭐⭐⭐ — Reformulating offline MOO as conditional sampling is a natural yet effective contribution.
Technical Depth: ⭐⭐⭐⭐ — The reweighting strategy and reference direction mechanism are well motivated.
Experimental Thoroughness: ⭐⭐⭐⭐⭐ — Covers 5 major benchmark categories against 13 baseline methods.
Value: ⭐⭐⭐⭐ — Hyperparameter robustness makes practical deployment more feasible.