Decision-Theoretic Approaches for Improved Learning-Augmented Algorithms¶

Conference: ICLR2026
OpenReview: https://openreview.net/forum?id=XGEZ9q02Ys
Code: To be confirmed
Area: Learning Theory / Learning-Augmented Algorithms / Online Algorithms
Keywords: Learning-Augmented Algorithms, Decision Theory, CVaR, Consistency-Robustness Trade-off, Online Decision Making

TL;DR¶

This paper introduces "distance measures" and "risk measures" from decision theory into the analysis of online algorithms with predictions. By using a quantifiable metric relative to an "ideal algorithm," it identifies the "globally best" algorithm from a family of otherwise incomparable Pareto-optimal/smooth algorithms. It provides computable optimal thresholds for three classical problems: ski rental, unimodal search, and contract scheduling.

Background & Motivation¶

Background: Learning-augmented algorithms (also known as algorithms with ML predictions) have seen a rapid rise. The core idea is that algorithms receive a machine learning prediction (e.g., remaining ski days, maximum asset price) trained from historical data to bypass the pessimistic guarantees of traditional worst-case online algorithms. These algorithms are typically evaluated using three metrics: consistency (performance when predictions are perfectly accurate), robustness (performance when predictions are arbitrarily poor), and the ability to degrade smoothly with prediction error.

Limitations of Prior Work: Since consistency and robustness are inherently in conflict, two design paths have emerged, each with flaws. The first is designing Pareto-optimal algorithms—selecting points on the consistency-robustness boundary. However, Pareto-optimal algorithms are often brittle: performance drops sharply with even minor prediction errors. Furthermore, even if brittleness is avoided, smoothness between members of the same Pareto-optimal family is incomparable. The second is forced smoothness—introducing a tolerance parameter \(\delta\) (reflecting confidence in the prediction). These \(\delta\)-robust algorithms often perform poorly when predictions are very accurate, and \(\delta\) must either be explicitly known or chosen arbitrarily.

Key Challenge: Faced with an infinite number of candidate algorithms, each with its own "performance curve" (transitioning from a consistency endpoint to a robustness endpoint), which curve is truly the best? Traditional metrics only look at specific points or local segments of the curve, lacking a principled global comparison of the entire curve. Comparing performance curves is a fundamental strength of decision theory.

Goal: Establish a set of decision-theoretic measures to transform the task of "selecting the global optimum from a class of otherwise incomparable algorithms" into a quantifiable and computable optimization problem.

Core Idea: Introduce two types of measures—deterministic distance measures (quantifying how far an algorithm is from an "ideal solution") and stochastic risk measures (using CVaR to characterize the risk of deviating from perfect predictions). These measures provide an objective score for each algorithm across the entire prediction error spectrum to identify the optimal one.

Method¶

Overall Architecture¶

The paper builds a unified decision-theoretic framework. The core mechanism is to define an ideal solution \(I_r\) as a benchmark, use two types of measures to calculate the distance of any candidate algorithm from this ideal, and finally solve for the measure-minimizing algorithm under robustness constraints. This framework can be applied to any online problem with single-value predictions, demonstrated via three classical problems (ski rental, unimodal search, contract scheduling).

Notation: For an input sequence \(\sigma\), \(\mathrm{OPT}(\sigma)\) is the offline optimal cost. The cost of algorithm \(A\) given prediction \(y\) on \(\sigma\) is \(A(\sigma,y)\), and its performance ratio is \(\mathrm{pr}(A,\sigma,y)=A(\sigma,y)/\mathrm{OPT}(\sigma)\). Consistency is \(\mathrm{cons}(A)=\sup_\sigma \mathrm{pr}(A,\sigma,x_\sigma)\), and robustness is \(\mathrm{rob}(A)=\sup_{\sigma,y}\mathrm{pr}(A,\sigma,y)\). Prediction error is \(\eta=|x_\sigma-y|\), and the prediction range is \(R_y=[\ell,u]\) (containing the true value \(x_\sigma\)). While \(R_y=[(1-\delta)y,(1+\delta)y]\) corresponds to the \(\delta\)-tolerance setting, the framework defaults to \(R_y=[0,\infty)\) and does not strictly require this bound.

The framework rests on two legs: distance measures take a deterministic approach without distributional assumptions on prediction quality, while risk measures follow a stochastic approach, introducing a distributional prediction \(\mu\) and formalizing "risk" using CVaR. Both unify the goal as "minimizing a measure under robustness constraints" and serve as strict generalizations of existing methods.

Key Designs¶

1. Ideal Solution \(I_r\): An objective anchor for comparison

To quantify how "good" an algorithm is, a reference frame is required. The paper defines the ideal solution \(I_r\) as the minimum cost achievable for input \(\sigma\) under the robustness constraint that the algorithm must be \(r\)-competitive for all inputs. Its performance ratio is \(\mathrm{pr}(I_r,\sigma)=I_r(\sigma)/\mathrm{OPT}(\sigma)\). Intuitively, \(I_r\) is "omniscient but constrained by \(r\)-robustness"—it knows the true input but cannot sacrifice robustness for it. No \(r\)-robust online algorithm can outperform \(I_r\) on any input, i.e., \(\mathrm{pr}(A,\sigma,y)\ge \mathrm{pr}(I_r,\sigma)\).

The brilliance of this design is replacing pairwise comparisons between algorithms with distances to a common ideal solution. Previously incomparable Pareto-optimal algorithms can now be ordered because \(I_r\) provides an absolute metric of "how much performance is missing compared to the ideal." In ski rental, the performance ratio of \(I_r\) is piecewise: \(1\) if \(x<b\), \(x/b\) if \(x\in[b,\min\{br/(r-1),b(r-1)\}]\), and \(r/(r-1)\) thereafter.

2. Weighted Distance Measures: Measuring "distance from ideal" across the error spectrum

The first type of measure directly calculates the "weighted distance" between the candidate algorithm’s performance curve and the ideal curve. Users provide a weight function \(w_y:R_y\to[0,1]\) to express "which prediction error range matters most" (requiring \(w_y\) to be non-decreasing on \([\ell,y]\) and non-increasing on \([y,u]\)). Two specific measures are Maximum Weighted Distance:

\[d_{\max}(A)=\sup_{\sigma,\,x\in R_y}\big\{(\mathrm{pr}(A,\sigma,x)-\mathrm{pr}(I_r,\sigma))\,w_y(x)\big\}\]

and Average Weighted Distance:

\[d_{\mathrm{avg}}(A)=\sup_\sigma \frac{1}{|R_y|}\int_{R_y}(\mathrm{pr}(A,\sigma,z)-\mathrm{pr}(I_r,\sigma))\,w_y(z)\,dz.\]

The former controls for the worst-case deviation (ensuring smoothness), while the latter considers the average deviation over the error range. This draws inspiration from ROC curve evaluation. Its power lies in unification: if the weight is 1 only at \(x=y\) and 0 elsewhere, \(d_{\max}\) reduces to Pareto optimality. Pareto optimality is thus an extreme special case of this framework.

3. CVaR Risk Measure & \(\alpha\)-consistency: Explicitly pricing prediction trust risk

While distance measures are deterministic, "risk" is inherently stochastic. The second type of measure treats the prediction as a distribution \(\mu\) and introduces the Conditional Value-at-Risk (CVaR):

\[\mathrm{CVaR}_\alpha(X)=\inf_t\Big\{t+\tfrac{1}{1-\alpha}\,\mathbb{E}[(X-t)_+]\Big\},\]

which measures the expected loss in the worst \((1-\alpha)\) tail. \(\alpha\in[0,1)\) represents the user's risk aversion. The \(\alpha\)-consistency is defined as:

\[\alpha\text{-cons}(A)=\sup_{F\in\mathcal F}\frac{\mathrm{CVaR}_{\alpha,F}(A(\sigma))}{\mathbb{E}_{\sigma\sim F}[\mathrm{OPT}(\sigma)]},\]

where \(\mathcal F\) is the class of input distributions consistent with the prediction distribution \(\mu\). The goal is to find an \(r\)-robust algorithm that minimizes \(\alpha\)-consistency.

The insight here is that a single parameter \(\alpha\) bridges two extremes: \(\alpha=0\) optimizes for pure expectation (risk-seeking), while \(\alpha\to 1\) assumes the probability is concentrated on the worst-case point within the prediction range (risk-averse). The paper notes that Pareto-optimal algorithms "maximize risk" (hence they are brittle), whereas \(\delta\)-tolerance algorithms "minimize risk" (hence they underperform with accurate predictions). CVaR allows for continuous adjustment between these states.

Mechanism on Three Problems¶

Applying the framework involves finding optimal thresholds/schedules among \(r\)-robust candidates:

Ski Rental: For \(r\ge 2\), \(A_T\) is \(r\)-robust if \(T\in[b/(r-1),b(r-1)]\). The paper derives optimal thresholds \(T^*_{\max}, T^*_{\mathrm{avg}}, T^*_{\mathrm{CVaR}}\) within this interval.
Unimodal Search: For \(r\ge\sqrt M\), a threshold \(T\) is \(r\)-robust if \(T\in[M/r,r]\). The paper provides analytical solutions for \(T^*_{\max}\) under uniform weights and models general weights as a two-player zero-sum game.
Contract Scheduling: Optimal robustness (acceleration ratio) is 4. The paper determines optimal schedules among 4-robust candidates using the critical point method (MAX), numerical optimization (AVG), and exact formulas (CVaR).

Key Experimental Results¶

Experiments compare whether the proposed distance/risk algorithms (MAX, AVG, CVAR\(_\alpha\)) achieve better average performance ratios and expected costs/profits compared to SOTA Pareto-optimal and \(\delta\)-tolerance algorithms under the same robustness constraints.

Main Results: Ski Rental (b=10, r=5)¶

Baseline: BP\(_\rho\) family from Benomar & Perchet (2025).

Algorithm	Avg Performance Ratio	Expected Cost
MAX	1.344	11.241
AVG	1.337	11.187
CVAR\(_{\alpha=0.1}\)	1.340	11.173
CVAR\(_{\alpha=0.5}\)	1.349	11.215
CVAR\(_{\alpha=0.9}\)	1.367	11.316
BP\(_{\rho=b}\)	1.677	16.987
BP\(_{\rho=b(r-1)}\)	2.219	20.958

Ours outperforms BP\(_\rho\) across all \(\rho\) values: the average performance ratio drops from 1.68–2.22 to approximately 1.34.

Main Results: Unimodal Search (M=1000, r=100)¶

Baseline: \(\delta\)-tolerance (δ-TOL) and Pareto-optimal PO1, PO2.

Algorithm	Avg Performance Ratio	Expected Profit
MAX	4.394	15.614
AVG	4.447	20.528
CVAR\(_{\alpha=0.1}\)	9.771	35.795
δ-TOL	10.009	5.475
PO1	4.630	13.904

MAX and AVG show superior performance ratios compared to most baselines. CVAR significantly outperforms δ-TOL and PO2 in both profit and ratio.

Key Findings¶

Even with simple weight functions (linear, symmetric), distance-based algorithms significantly outperform SOTA, implying gains are not due to hyperparameter tuning.
CVaR provides a smoothly adjustable trade-off via \(\alpha\): smaller \(\alpha\) favors expected profit, while larger \(\alpha\) provides stability.
As the prediction range narrows or weights concentrate near the prediction, Ours' performance improves further—validating the theoretical ability to exploit high-quality predictions.

Highlights & Insights¶

New Perspective: Bringing decision theory into competitive analysis is a novel angle. CVaR is applied to learning-augmented competitive analysis for the first time, defining the new "\(\alpha\)-consistency" metric.
Ideal Solution \(I_r\) as a Pivot: It solves the problem of incomparable algorithms by providing an objective anchor. This "absolute benchmark" approach can be transferred to any field with infinite classes of equally robust algorithms.
Unified Explanation: The framework explains why existing methods fail: Pareto optimality maximizes risk (leading to brittleness), while \(\delta\)-tolerance minimizes risk (hurting performance on accurate predictions).
Graceful Degradation: By taking specific weights or \(\alpha\) values, the framework recovers Pareto optimality or adversarial analysis, ensuring it strictly includes prior results.

Limitations & Future Work¶

Problem Range: Only three "single-value prediction, threshold-type" problems were verified. Whether the framework scales to multi-dimensional predictions or complex combinatorial optimization (graphs, bin packing) remains an open question.
Distributional Dependency: Risk analysis relies on a given prediction distribution \(\mu\). The robustness of conclusions when \(\mu\) is misspecified is not deeply explored.
Hyperparameter Selection: The framework shifts the burden of choosing \(\delta\) to choosing \(w_y\) or \(\alpha\). While more principled, it still requires preference expression.

vs Pareto-optimal Path (Sun et al. 2021a; Benomar & Perchet 2025): They select points on the boundary but are brittle. Ours uses distance to an ideal solution to re-order the Pareto family and pick the global optimum.
vs \(\delta\)-tolerance/Smooth Path (Angelopoulos et al. 2022): They force smoothness via \(\delta\), but struggle with high-accuracy predictions. Ours uses CVaR to price risk, covering a broader error spectrum.
vs Diakonikolas et al. (2021): Ours' \(\alpha\)-consistency is a risk-aware generalization of their distributional consistency-robustness trade-off.

Rating¶

Novelty: ⭐⭐⭐⭐⭐
Experimental Thoroughness: ⭐⭐⭐⭐
Writing Quality: ⭐⭐⭐⭐⭐
Value: ⭐⭐⭐⭐