Decomposition and Preprocessing of Ternary Constraint Networks¶
Conference: AAAI 2026 arXiv: 2511.11872 Code: Turbo solver Area: Other Keywords: Constraint Programming, Ternary Constraint Networks, GPU Solving, Preprocessing, MiniZinc
TL;DR¶
This paper proposes a complete theoretical framework for formally decomposing arbitrary discrete constraint networks into ternary constraint networks (TCNs), and reduces the variable/constraint blowup introduced by decomposition from a median of 8x/6x to 4.8x/4.3x via seven preprocessing techniques (propagation, algebraic simplification, common subexpression elimination, etc.), providing a regularized data layout for efficient constraint solving on GPU hardware.
Background & Motivation¶
Constraint Programming (CP) is a general-purpose exact solving method based on constraint propagation and backtracking search. Modern solvers (Choco, OR-Tools) implement propagators via interpreter-style traversal of constraint ASTs (view-based propagation), handling diverse constraint forms through subtype or parametric polymorphism.
Core Challenges:

- Irregular data layout: Varying arities and operators across constraints cause severe thread divergence during GPU execution
- Complex propagator implementation: It is infeasible to implement distinct propagator functions for every possible constraint form
- GPU parallelism bottleneck: Constraint solving inherently requires sequential backtracking search, which conflicts with the SIMT execution model given irregular constraint structures
Proposed Approach: Uniformly decompose all constraints into ternary constraints of the form \(x = y \odot z\), where \(\odot \in \{+, *, /, mod, min, max, =, \leq\}\), yielding a regular data layout suitable for GPU execution.
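Because every constraint has the same shape \(x = y \odot z\), the network can be stored as a structure-of-arrays. The sketch below is illustrative only (the names `Op` and `TCN` are assumptions, not from the paper's Turbo code); it shows the kind of regularized layout that avoids per-constraint branching:

```python
# Hypothetical structure-of-arrays layout for a ternary constraint network.
# Each constraint i encodes xs[i] = ys[i] <ops[i]> zs[i] over variable indices.
from dataclasses import dataclass, field
from enum import IntEnum

class Op(IntEnum):
    ADD = 0; MUL = 1; DIV = 2; MOD = 3
    MIN = 4; MAX = 5; EQ = 6; LE = 7

@dataclass
class TCN:
    ops: list = field(default_factory=list)  # operator code per constraint
    xs:  list = field(default_factory=list)  # result variable index
    ys:  list = field(default_factory=list)  # first operand index
    zs:  list = field(default_factory=list)  # second operand index

    def add(self, op, x, y, z):
        """Append the ternary constraint x = y op z; return its row index."""
        self.ops.append(op); self.xs.append(x)
        self.ys.append(y); self.zs.append(z)
        return len(self.ops) - 1

net = TCN()
net.add(Op.ADD, 2, 0, 1)  # x2 = x0 + x1
net.add(Op.MAX, 3, 2, 0)  # x3 = max(x2, x0)
```

With this layout, a GPU kernel can dispatch on a small integer operator code per thread instead of interpreting heterogeneous constraint ASTs.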
Method¶
Overall Architecture¶
Given an arbitrary constraint network \(\langle d, C \rangle\), processing proceeds in two stages:

1. TCN Decomposition (Section 3): A recursive function \(tcn(d, C) = \langle d', C' \rangle\) rewrites each constraint into ternary form
2. Preprocessing (Section 4): Seven optimization techniques reduce the size of the decomposed network
Key Designs¶
Module 1: TCN Decomposition Function \(tc(d, t)\)¶
The core recursive function \(tc(d, t) = (d', T, x)\) rewrites a term or constraint \(t\) into a set of ternary constraints \(T\):
- Base cases: Variables are returned directly; constants create shared `__CONSTANT_k` variables
- Unary constraints: Negation \(-t \to 0-t\); absolute value \(abs(t) \to max(t, 0-t)\); set membership \(t \in S\) is converted to an interval disjunction
- Binary operations: \(t_1 \odot t_2\) introduces an auxiliary variable \(x = y \odot z\); subtraction exploits relational semantics \(x = y-z \Leftrightarrow y = x+z\)
- Logical connectives: \(\land \to min\), \(\lor \to max\), \(\Leftrightarrow \to =\); non-Boolean variables are mapped to \([0,1]\) via a `booleanize` function
The decomposition guarantees solution-set equivalence (Proposition 3.2) and uniqueness (Proposition 3.4).
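The recursion above can be sketched roughly as follows. This is a simplified illustration under assumed conventions (terms as nested tuples, fresh `_aux` variables, the function name `tc`), not the paper's actual implementation; note that raw decomposition may duplicate subterms, which ICSE later merges:

```python
# Sketch of the recursive decomposition tc(d, t): each compound term gets a
# fresh auxiliary variable, and ternary constraints (x, op, y, z) meaning
# x = y op z are accumulated as a side effect.
import itertools

_fresh = itertools.count()

def tc(term, constraints):
    """Return the variable naming `term`, appending ternary constraints."""
    if isinstance(term, str):        # base case: a variable
        return term
    if isinstance(term, int):        # base case: a shared constant variable
        return f"__CONSTANT_{term}"
    op, *args = term                 # compound term, e.g. ('+', a, b)
    if op == '-' and len(args) == 1:           # negation: -t -> 0 - t
        return tc(('-', 0, args[0]), constraints)
    if op == 'abs':                            # abs(t) -> max(t, 0 - t)
        return tc(('max', args[0], ('-', 0, args[0])), constraints)
    y, z = (tc(a, constraints) for a in args)  # binary case
    x = f"_aux{next(_fresh)}"                  # fresh auxiliary variable
    constraints.append((x, op, y, z))          # x = y op z
    return x

cs = []
root = tc(('abs', ('-', 'a', 'b')), cs)
# cs now holds ternary constraints for t = a - b and max(t, 0 - t);
# the duplicated a - b subterm is exactly what ICSE would later merge.
```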
Module 2: Seven Preprocessing Techniques¶
Preprocessing computes the greatest fixpoint over the triple \(\langle d, C, E \rangle\):
| Technique | Purpose | Key Mechanism |
|---|---|---|
| Constraint Propagation | Reduce variable domains | Compute greatest fixpoint of propagators |
| Algebraic Simplification (AS) | Eliminate constraints directly expressible via domains | Pattern matching: \(x=y+0 \to merge(E,x,y)\), \(x=y*1 \to merge(E,x,y)\), 25+ rules |
| Identical Common Subexpression Elimination (ICSE) | Merge constraints with identical right-hand sides | Hash map \(y \odot z \to x\), \(O(\|C\|)\) complexity |
| Equivalence Class Domain Merging | Synchronize domains of equivalent variables | \(d_E(x) = \bigcap_{y \in [x]_E} d(y)\) |
| Implied Constraint Elimination | Remove constraints implied by current domains | Check \(\gamma(d) \subseteq rel(c)\) |
| Variable Renaming | Assign a unique representative to each equivalence class | Union-Find data structure |
| Useless Variable Elimination | Remove variables not appearing in any constraint scope | Empty-domain variables retained to detect unsatisfiability |
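As one concrete illustration, ICSE from the table amounts to a single hash-map pass over the constraints. The function below is a hypothetical sketch: it returns the duplicate pairs to be merged into the equivalence partition \(E\) rather than applying the merges itself:

```python
# Sketch of Identical Common Subexpression Elimination (ICSE): one O(|C|)
# pass with a hash map from right-hand side (op, y, z) to the result
# variable of its first occurrence; later constraints with the same RHS
# are dropped and their result variables scheduled for merging.
def icse(constraints):
    """Return (kept_constraints, merges) with duplicate-RHS vars paired."""
    seen = {}            # (op, y, z) -> result variable of first occurrence
    kept, merges = [], []
    for (x, op, y, z) in constraints:
        rhs = (op, y, z)
        if rhs in seen:
            merges.append((seen[rhs], x))  # x equals the earlier result var
        else:
            seen[rhs] = x
            kept.append((x, op, y, z))
    return kept, merges

kept, merges = icse([('u', '+', 'a', 'b'),
                     ('v', '+', 'a', 'b'),   # duplicate RHS -> merge(u, v)
                     ('w', '*', 'a', 'b')])
```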
Module 3: Equivalence Partitioning and Fixpoint Computation¶
Variable equivalence partitioning \(E\) is implemented via Union-Find, with equivalence classes merged through \(merge(E, x, y)\). Preprocessing iterates until neither \(d\) nor \(E\) changes. Implied constraint checking and useless variable elimination are extracted outside the fixpoint loop (guaranteed by monotonicity).
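A minimal Union-Find sketch of this partition (illustrative names; path compression included, union by rank omitted for brevity):

```python
# Union-Find for the equivalence partition E: merge(x, y) joins the classes
# of x and y; find(x) returns the class's unique representative, which
# variable renaming later substitutes throughout the network.
class UnionFind:
    def __init__(self):
        self.parent = {}

    def find(self, x):
        self.parent.setdefault(x, x)           # unseen var: own class
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])  # path compression
        return self.parent[x]

    def merge(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx != ry:
            self.parent[ry] = rx               # rx becomes the representative

E = UnionFind()
E.merge('x', 'y')   # e.g. triggered by the rule x = y + 0
E.merge('y', 'z')
# x, y, z now share one representative
```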
Loss & Training¶
This is a theoretical/systems paper with no neural network loss functions. The optimization objective is to minimize the size (number of variables and constraints) of the decomposed constraint network while preserving solution-set equivalence.
Experiments¶
Main Results: Decomposition Size Analysis (89 instances, MiniZinc 2024 Challenge)¶
| Stage | Var Growth (avg) | Var Growth (med) | Var Growth (max) | Con Growth (avg) | Con Growth (med) | Con Growth (max) |
|---|---|---|---|---|---|---|
| FlatZinc | 9.42x | 1.86x | 111.62x | 24.94x | 2.95x | 486.87x |
| TCN Decomposition | 53.61x | 7.97x | 1133.22x | 76.68x | 6.21x | 1837.19x |
| After Preprocessing | 22.06x | 4.76x | 344.62x | 36.39x | 4.33x | 746.09x |
Ablation Study: Preprocessing Effectiveness¶
| Metric | Variable Reduction (TCN→Preprocessed) | Constraint Reduction (TCN→Preprocessed) |
|---|---|---|
| Mean reduction | 53.61x → 22.06x (2.4×) | 76.68x → 36.39x (2.1×) |
| Median reduction | 7.97x → 4.76x (1.7×) | 6.21x → 4.33x (1.4×) |
| Preprocessing time (avg) | 24.22s (std=96.91s) | — |
| Preprocessing time (med) | 0.91s | Only 11 instances >10s |
Key Findings¶
- FlatZinc stage already introduces the primary blowup: For 63/89 instances, FlatZinc decomposition remains within one order of magnitude compared to Choco
- Preprocessing substantially reduces network size: On average, it reduces the variable growth introduced by TCN decomposition by 2.4× and the constraint growth by 2.1×
- Operator distribution is highly concentrated: After preprocessing, instances predominantly use only three operators—addition, maximum (encoding disjunction), and reified equality—effectively eliminating thread divergence concerns
- A hard tail remains: 12 instances still exhibit over 100× blowup even after preprocessing, representing extreme cases
Highlights & Insights¶
- Theoretical completeness: Solution-set equivalence and uniqueness of TCN decomposition are formally proved, making the framework self-contained and directly implementable
- Practical effectiveness: Seven preprocessing techniques reduce blowup to an acceptable level (median 4.8× variables / 4.3× constraints)
- Educational value: TCN formalization is applicable to CP pedagogy and has been validated in teaching settings
- GPU-friendly: The regularized ternary layout eliminates thread divergence caused by operator diversity
Limitations & Future Work¶
- Global constraints are not natively supported (relying on the MiniZinc compiler for pre-decomposition), limiting solving efficiency
- 12 instances still exceed 100× blowup after decomposition, requiring stronger preprocessing techniques
- Preprocessing time exhibits extremely high variance (std=96.91s vs. median=0.91s), with prohibitive cost on individual instances
- No end-to-end GPU solver performance improvement is demonstrated
Related Work & Insights¶
- TCNs in constraint logic programming: CLP(FD), clp(fd), and cc(FD) employed ternary forms early on but without systematic formalization
- View-based propagation: The AST interpreter approach in Choco/OR-Tools avoids decomposition but is ill-suited for GPU execution
- Preprocessing techniques: Savile Row and Andrea Rendl's compilation optimizations
- GPU constraint solving: The Turbo solver, proposed at AAAI 2022, introduced GPU-based concurrent constraint programming
Rating¶
⭐⭐⭐ (Theoretically rigorous with high formalization value, but lacks end-to-end GPU performance validation; practical impact remains to be observed)