Overfitting in Adaptive Robust Optimization¶
Conference: NeurIPS 2025 (Workshop: MLxOR) arXiv: 2509.16451 Code: None Area: Other Keywords: Adaptive Robust Optimization, Overfitting, Regularization, Uncertainty Set, Affine Decision Rules
TL;DR¶
This paper establishes an analogy between policy fragility in Adaptive Robust Optimization (ARO) and overfitting in machine learning: adaptive policies perform well within the uncertainty set but may fail outside it. The paper proposes constraint-specific uncertainty set sizing as a "regularization" mechanism to balance robustness and adaptability.
Background & Motivation¶
Robust Optimization (RO) is a major framework for decision-making under uncertainty, requiring decisions to be feasible for all realizations within a specified uncertainty set \(\mathcal{U}\). In static RO, the decision variable \(\boldsymbol{x}\) is fixed before uncertainty is realized.
Adaptive Robust Optimization (ARO) allows certain decisions to be adjusted after observing the uncertainty. A common tractable approach is the affine decision rule

\[ \boldsymbol{y}(\boldsymbol{u}) = \boldsymbol{z} + \boldsymbol{V}\boldsymbol{u}, \]
where \(\boldsymbol{z}\) is a fixed component and \(\boldsymbol{V}\) is the adaptability coefficient matrix. The static solution corresponds to \(\boldsymbol{V} = \boldsymbol{0}\), so adaptive policies weakly dominate static ones within the uncertainty set.
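The affine rule is simple enough to sketch directly; the following minimal Python example (names are illustrative, not from the paper) evaluates \(\boldsymbol{y}(\boldsymbol{u}) = \boldsymbol{z} + \boldsymbol{V}\boldsymbol{u}\) and shows that setting \(\boldsymbol{V} = \boldsymbol{0}\) recovers the static policy:

```python
def affine_policy(z, V, u):
    """Evaluate the adjustable decision y(u) = z + V u componentwise."""
    return [z[i] + sum(V[i][j] * u[j] for j in range(len(u)))
            for i in range(len(z))]

# Two-dimensional example: the adaptive policy reacts to u, the static one does not.
z = [1.0, 1.0]
V_adaptive = [[1.0, 0.0], [0.0, 1.0]]
V_static = [[0.0, 0.0], [0.0, 0.0]]  # V = 0 reproduces the static solution

u = [0.5, -0.5]
print(affine_policy(z, V_adaptive, u))  # [1.5, 0.5]
print(affine_policy(z, V_static, u))    # [1.0, 1.0]
```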
Core Problem: ARO causes constraints that were originally independent of uncertainty to become uncertainty-dependent. For example, the non-negativity constraint \(x_i \geq 0\) becomes \(z_i + V_{i,i} u_i \geq 0\) after adaptation, which may be violated when \(u_i\) deviates from the uncertainty set. This fragility directly mirrors overfitting in machine learning: greater flexibility yields better in-sample performance but poor generalization.
Method¶
Overall Architecture¶
The paper develops its analysis through three progressive steps:
- Toy example illustrating fragility: a renewable energy production planning problem
- Constraint-specific uncertainty sets: distinguishing hard and soft constraints
- Regularization perspective: robust counterparts as implicit regularization
Key Designs¶
1. Fragility Example¶
Consider a renewable energy dispatch problem with 2 units of forecasted renewable energy in each of two periods, with uncertainty represented by a budget set:

\[ \mathcal{U}_{\rho,\Gamma} = \{\boldsymbol{u} : \|\boldsymbol{u}\|_\infty \leq \rho,\ \|\boldsymbol{u}\|_1 \leq \Gamma\} \]
Comparison of three solutions (\((\rho, \Gamma) = (1, 1)\)):
| Policy | Solution | Objective | Out-of-set Performance |
|---|---|---|---|
| Nominal optimal | \(x_i = y_i = 1, s_i = 0\) | 4 | Any shortfall requires expensive grid power |
| Static robust | \(x_i = 1, y_i = s_i = 0\) | 2 | Robust but overly conservative |
| Adaptive robust | \(x_i = 1, y_i = 1 + u_i\) | 3 | If \(u_1 < -1\), then \(y_1 < 0\) (physically infeasible) |
Key observation: ARO exploits the \(\ell_1\) ball to couple uncertainty across constraints (which static RO cannot), but at the cost of making non-negativity constraints dependent on \(u\).
2. Constraint Classification and Uncertainty Set Design¶
Hard constraints (e.g., flow non-negativity): must be satisfied for all possible realizations, requiring deterministic guarantees; the uncertainty set should fully cover the support of \(\boldsymbol{u}\).
Soft constraints (e.g., renewable energy allocation with backup grid power): can tolerate limited violations; probabilistic guarantees suffice. Ellipsoidal or budget sets provide explicit guarantees while preserving ARO flexibility.
Specific guarantee forms:
- Gaussian case: ellipsoidal uncertainty set \(\mathcal{U}_i = \{\boldsymbol{u}: \|\Sigma^{-1/2}(\boldsymbol{u} - \boldsymbol{\mu})\|_2 \leq \rho_{1-\varepsilon}\}\)
- Distribution-free (ball-box set): \(\rho = \sqrt{2\ln(1/\varepsilon)}\) guarantees constraint satisfaction with probability at least \(1-\varepsilon\)
- Distribution-free (budget set): \(\Gamma = \sqrt{2\ln(1/\varepsilon)}\, \sqrt{p}\) guarantees constraint satisfaction with probability at least \(1-\varepsilon\)
- Bounded support: polyhedral box sets yield robust counterparts via duality arguments (Proposition 1)
- Semi-bounded support: handled as a special case (Corollary 1)
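The distribution-free sizing rules above are easy to tabulate. The short sketch below evaluates \(\rho = \sqrt{2\ln(1/\varepsilon)}\) and \(\Gamma = \rho\sqrt{p}\) for a few guarantee levels (the dimension \(p = 10\) is an illustrative choice, not from the paper):

```python
import math

# Distribution-free sizing rules from the text:
#   ball-box radius rho = sqrt(2 ln(1/eps))
#   budget           Gamma = rho * sqrt(p)
# each yielding a (1 - eps) constraint-satisfaction guarantee.

def ball_box_radius(eps):
    return math.sqrt(2.0 * math.log(1.0 / eps))

def budget_gamma(eps, p):
    return ball_box_radius(eps) * math.sqrt(p)

for eps in (0.10, 0.05, 0.01):
    rho = ball_box_radius(eps)
    gamma = budget_gamma(eps, 10)
    print(f"eps={eps}: rho={rho:.3f}, Gamma(p=10)={gamma:.3f}")
```

Note how slowly \(\rho\) grows as \(\varepsilon\) shrinks: the logarithmic dependence is what makes strong probabilistic guarantees affordable.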
3. Regularization Perspective¶
The robust counterpart (RC) can be reformulated as an explicit norm constraint on the adaptability coefficients. For example, under a ball-box set (setting \(\boldsymbol{y}_i = \boldsymbol{0}\) for simplicity), the constraint \(\boldsymbol{a}^\top \boldsymbol{y}(\boldsymbol{u}) \leq b\) becomes

\[ \boldsymbol{a}^\top \boldsymbol{z} + \rho\,\|\boldsymbol{V}^\top \boldsymbol{a}\|_2 \leq b. \]
The probabilistic guarantee \(1 - \varepsilon\) serves as a regularization parameter:
- Stronger guarantee (smaller \(\varepsilon\)) → tighter constraint on \(\boldsymbol{V}\) → less adaptability → greater stability
- In the limit \(\varepsilon \to 0\), \(\boldsymbol{V} = \boldsymbol{0}\) and the static policy is recovered
- The \(\ell_2\) norm constraint corresponds directly to ridge regression regularization in machine learning
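The regularization effect can be sketched with the standard \(\ell_2\) robust counterpart \(\boldsymbol{a}^\top \boldsymbol{z} + \rho\,\|\boldsymbol{V}^\top \boldsymbol{a}\|_2 \leq b\) (the specific numbers below are illustrative, not from the paper):

```python
import math

# Sketch of the robust counterpart as a norm penalty on the adaptability
# coefficients: a^T z + rho * ||V^T a||_2 <= b. A larger rho (stronger
# guarantee) shrinks the set of admissible V; V = 0 incurs no penalty.

def robust_counterpart_holds(a, z, V, b, rho):
    nominal = sum(ai * zi for ai, zi in zip(a, z))
    Vt_a = [sum(V[i][j] * a[i] for i in range(len(a))) for j in range(len(V[0]))]
    penalty = rho * math.sqrt(sum(x * x for x in Vt_a))
    return nominal + penalty <= b

a, z, b = [1.0, 1.0], [1.0, 1.0], 4.0
V_static = [[0.0, 0.0], [0.0, 0.0]]
V_adaptive = [[1.0, 0.0], [0.0, 1.0]]

print(robust_counterpart_holds(a, z, V_static, b, rho=3.0))    # True: V = 0, no penalty
print(robust_counterpart_holds(a, z, V_adaptive, b, rho=1.0))  # True: small rho admits adaptability
print(robust_counterpart_holds(a, z, V_adaptive, b, rho=3.0))  # False: 2 + 3*sqrt(2) > 4
```

This mirrors ridge regression: increasing \(\rho\) (the guarantee) plays the role of increasing the regularization strength, pushing \(\boldsymbol{V}\) toward zero.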
Loss & Training¶
This paper does not involve machine learning training. Its core contribution is conceptual:
- Establishing an analogy between ARO fragility and ML overfitting
- Proposing constraint-specific uncertainty set sizing as regularization
- Demonstrating that the bias-variance tradeoff manifests in ARO
Key Experimental Results¶
Main Results: Fragility Demonstration¶
| Policy | Uncertainty Set | Objective | Non-negativity Guarantee |
|---|---|---|---|
| Static robust | \((\rho, \Gamma) = (1,1)\) | 2 | Feasible for all realizations |
| ARO (uniform set) | \((\rho, \Gamma) = (1,1)\) | 3 | Violated when \(u_i < -1\) |
| ARO (constraint-specific set) | Soft: \((1,1)\), Hard: \((2,2)\) | 3 | Feasible for all \(u_i \geq -2\) |
Constraint-specific uncertainty sets preserve the ARO objective advantage (3 vs. 2) while restoring feasibility guarantees for hard constraints.
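A hedged sketch of the hard-constraint sizing logic: for a non-negativity constraint \(z_i + v_i u_i \geq 0\) to hold for every \(u_i\) in a hard set \([-\rho_{\text{hard}}, \rho_{\text{hard}}]\), the worst case requires \(z_i - \rho_{\text{hard}} |v_i| \geq 0\). The numbers below are illustrative, not taken from the paper's exact reformulation:

```python
# Deterministic feasibility check for a hard non-negativity constraint
# z_i + v_i * u_i >= 0 over the hard set [-rho_hard, rho_hard]: the worst
# case is z_i - rho_hard * |v_i| >= 0.

def nonneg_guaranteed(z_i, v_i, rho_hard):
    return z_i - rho_hard * abs(v_i) >= 0

z_i = 1.0
print(nonneg_guaranteed(z_i, v_i=1.0, rho_hard=1.0))  # True: feasible on [-1, 1]
print(nonneg_guaranteed(z_i, v_i=1.0, rho_hard=2.0))  # False: the wider hard set breaks it
print(nonneg_guaranteed(z_i, v_i=0.5, rho_hard=2.0))  # True: halving adaptability restores it
```

Enlarging only the hard-constraint set forces the adaptability coefficient on that constraint down while leaving the soft-constraint set (and hence the objective) untouched.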
Regularization Effect Analysis¶
| Probabilistic Guarantee \(1-\varepsilon\) | Max Adaptability \(\|\boldsymbol{V}^\top \boldsymbol{a}\|_2\) (Gaussian) | Max Adaptability (Distribution-free) |
|---|---|---|
| 0.90 | ~0.78 | ~0.58 |
| 0.95 | ~0.61 | ~0.46 |
| 0.99 | ~0.43 | ~0.33 |
| 0.999 | ~0.32 | ~0.25 |
Under Gaussian assumptions, the probabilistic bound is tighter, permitting greater adaptability; distribution-free guarantees are looser, leading to more conservative regularization.
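The gap between the two columns comes from the radii behind them: the Gaussian quantile \(\rho_{1-\varepsilon}\) versus the distribution-free bound \(\sqrt{2\ln(1/\varepsilon)}\). The sketch below compares them, taking maximum adaptability proportional to \(1/\rho\) with the scaling constant set to 1 for illustration (the paper's exact values depend on the problem data):

```python
import math
from statistics import NormalDist

# Gaussian radius: the (1 - eps) quantile of a standard normal.
def gaussian_radius(eps):
    return NormalDist().inv_cdf(1.0 - eps)

# Distribution-free radius from the sizing rule quoted earlier.
def dist_free_radius(eps):
    return math.sqrt(2.0 * math.log(1.0 / eps))

for eps in (0.10, 0.05, 0.01, 0.001):
    g, d = gaussian_radius(eps), dist_free_radius(eps)
    print(f"1-eps={1-eps:g}: adaptability ~ {1/g:.2f} (Gaussian) vs {1/d:.2f} (dist-free)")
```

The Gaussian radius is always the smaller of the two (by the Gaussian tail bound \(z_{1-\varepsilon} \leq \sqrt{2\ln(1/\varepsilon)}\)), so the Gaussian assumption always permits at least as much adaptability.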
Key Findings¶
- ARO fragility is systematic: it is not limited to specific problem instances; any adaptive policy risks turning originally uncertainty-independent constraints into uncertainty-dependent ones.
- Equivalence between robustness and regularization: in the ARO context, constraint-specific uncertainty set sizing plays the role of a regularization parameter.
- Direct manifestation of the bias-variance tradeoff: adaptability (variance) vs. stability (bias) can be precisely controlled through uncertainty set sizing.
Highlights & Insights¶
- Strong conceptual contribution: this work is the first to systematically establish an analogy between ARO fragility and ML overfitting.
- Power of cross-disciplinary analogy: importing regularization ideas from ML into operations research provides a new perspective for ARO practice.
- Clear practical recommendations:
- Evaluate via simulation on out-of-set distributions (analogous to ML out-of-sample testing)
- Distinguish hard and soft constraints and assign different guarantee levels
- Use constraint-specific uncertainty set sizing
- Direct correspondence to ridge regression: the \(\ell_2\) norm constraint connects the ARO robust counterpart to ridge regression, with strong potential for cross-domain knowledge transfer.
Limitations & Future Work¶
- Restricted to affine decision rules: analysis under more complex decision rules (polynomial, piecewise linear) remains unexplored.
- Lack of large-scale numerical experiments: only toy examples are provided; validation on real-world operations research problems is absent.
- RHS uncertainty only: only right-hand side uncertainty is considered; uncertainty in the technology coefficient matrix is not addressed.
- Regularization parameter selection: no practical guidance is provided on how to select the optimal uncertainty set size for each constraint.
Related Work & Insights¶
- Equivalence of robustness and regularization: Xu et al. and Bertsimas et al. discuss analogous results for static RO; this paper extends the connection to ARO.
- Bias-variance tradeoff: the classical ML framework and its counterpart in optimization-based decision-making.
- Coupling advantages of ARO: Bertsimas et al. on how ARO exploits coupling across uncertainty realizations.
Rating¶
- Novelty: ⭐⭐⭐⭐ — First systematic treatment of overfitting in ARO
- Theoretical Contribution: ⭐⭐⭐⭐ — Conceptually clear with complete mathematical derivations
- Experimental Thoroughness: ⭐⭐ — Only toy examples
- Value: ⭐⭐⭐⭐ — Directly actionable for ARO practitioners
- Overall Recommendation: ⭐⭐⭐⭐