Latent Variable Estimation in Bayesian Black-Litterman Models¶
Conference: ICML 2025
arXiv: 2505.02185
Code: Not released
Area: Other/Bayesian, Financial Portfolio Optimization
Keywords: Black-Litterman model, portfolio optimization, Bayesian networks, latent variable models, uncertainty quantification
TL;DR¶
This paper treats the subjective investor views \((q, \Omega)\) of the classical Black-Litterman portfolio optimization model as latent variables and infers them automatically from market feature data via a Bayesian network. This removes the reliance on manually specified views. On a 30-year Dow Jones dataset and a 20-year sector ETF dataset, the Sharpe ratio improves by approximately 50% over the Markowitz mean-variance baseline, and the turnover rate falls by around 55%.
Background & Motivation¶
Limitations of Prior Work in Classical BL Models: The Black-Litterman (1992) model builds on Markowitz mean-variance optimization by introducing a market equilibrium prior and investor views to generate more stable portfolio weights. However, the model requires investors to manually provide subjective forecast vectors \(q \in \mathbb{R}^k\) and corresponding uncertainty matrices \(\Omega \in \mathbb{R}^{k \times k}\), such as "Asset 2 will outperform Asset 1 by 9±3%". This dependence leads to:
- Subjective Bias: Views provided by different investors vary widely, making results irreproducible.
- Error Propagation from External Estimators: Existing works use external models like GARCH, LSTM, or SVM to generate \((q, \Omega)\), but embedding independent estimators into the BL framework triggers multi-stage error accumulation.
- Inconsistent Confidence Calibration: Estimating \(q\) and \(\Omega\) independently can lead to overconfidence (low \(\Omega\)), amplifying errors in \(q\).
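To make the manual inputs concrete, the quoted view "Asset 2 will outperform Asset 1 by 9±3%" can be encoded as a pick matrix \(P\), a forecast \(q\), and an uncertainty \(\Omega\). The sketch below is illustrative only: the three-asset universe, the reading of "±3%" as one standard deviation, and the helper `view_residual` are assumptions, not taken from the paper.

```python
import numpy as np

# Hypothetical universe of m = 3 assets with a single relative view (k = 1).
# View: "Asset 2 will outperform Asset 1 by 9% +/- 3%".
P = np.array([[-1.0, 1.0, 0.0]])   # long Asset 2, short Asset 1
q = np.array([0.09])               # forecast outperformance
Omega = np.array([[0.03 ** 2]])    # assumption: "+/- 3%" read as one standard deviation


def view_residual(theta, P, q):
    """Gap P @ theta - q between a candidate mean vector and the stated views."""
    return P @ theta - q


# A mean vector that satisfies the view exactly leaves a (numerically) zero residual.
theta = np.array([0.02, 0.11, 0.05])
residual = view_residual(theta, P, q)
```

These three hand-specified objects are what the paper replaces with latent variables inferred from features.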
Core Idea: Treat \((q, \Omega)\) as latent variables and directly infer their posterior distributions from feature data within a unified Bayesian network, achieving end-to-end data-driven portfolio optimization.
Method¶
Overall Architecture¶
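Based on the BLB (Black-Litterman-Bayes) model, this paper constructs a feature-integrated Bayesian network and proposes three model variants, distinguished by which of two causal effects of the features they use: features sharing the latent parameter \(\theta\) with asset returns (Effect 1), and features influencing the views \(q\) (Effect 2):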
- M-BL (Mixed Effects): Leverages both effects simultaneously, applicable when views are known.
- SLP-BL (Shared Latent Parameterization): Features and asset returns share parameter \(\theta\) (Effect 1), suitable for asset-specific features.
- FIV-BL (Feature-Influenced Views): Features act indirectly by influencing views (Effect 2), suitable for macro, non-asset-specific features.
Core Modeling¶
BLB Foundation: Asset returns follow \(r \sim N(\theta, \Sigma)\), the parameter prior is \(\theta \sim N(\theta_0, \Sigma_0)\), and the view likelihood is \(P\theta = q + \epsilon, \; \epsilon \sim N(0, \Omega)\). The posterior is:
\[
\theta \mid q, \Omega \sim N\left(G^{-1}\left(\Sigma_0^{-1}\theta_0 + P^\top \Omega^{-1} q\right),\; G^{-1}\right)
\]
where \(G = \Sigma_0^{-1} + P^\top \Omega^{-1} P\).
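The BLB update is a standard Gaussian conjugate update with posterior precision \(G\). A minimal NumPy sketch follows, assuming dense, invertible \(\Sigma_0\) and \(\Omega\); the function name `blb_posterior` and the toy numbers are hypothetical.

```python
import numpy as np


def blb_posterior(theta0, Sigma0, P, q, Omega):
    """Posterior mean and covariance of theta given views (q, Omega).

    Uses G = Sigma0^{-1} + P^T Omega^{-1} P as the posterior precision.
    """
    Sigma0_inv = np.linalg.inv(Sigma0)
    Omega_inv = np.linalg.inv(Omega)
    G = Sigma0_inv + P.T @ Omega_inv @ P
    rhs = Sigma0_inv @ theta0 + P.T @ Omega_inv @ q
    mean = np.linalg.solve(G, rhs)   # G^{-1} rhs without forming the inverse for the mean
    cov = np.linalg.inv(G)
    return mean, cov


# Toy example: flat 5% prior, one view that Asset 2 beats Asset 1 by 9%.
theta0 = np.full(3, 0.05)
Sigma0 = 0.01 * np.eye(3)
P = np.array([[-1.0, 1.0, 0.0]])
q = np.array([0.09])
Omega = np.array([[0.03 ** 2]])

post_mean, post_cov = blb_posterior(theta0, Sigma0, P, q, Omega)
spread = post_mean[1] - post_mean[0]   # lies between the prior spread (0) and the view (0.09)
```

The posterior spread between the two assets is pulled from the prior value toward the view, by an amount governed by the relative sizes of \(\Sigma_0\) and \(\Omega\).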
SLP-BL Model (Main Experimental Model)¶
Introduce a \(\theta \leftrightarrow F\) linear model: \(\theta = \alpha^F + F\beta^F + \epsilon^F, \; \epsilon^F \sim N(0, \Omega^F)\).
The feature matrix \(F = \text{diag}(f_1^\top, \ldots, f_m^\top) \in \mathbb{R}^{m \times dm}\) has a block-diagonal structure, with each asset having \(d\)-dimensional features.
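Posterior estimation (closed-form):
\[
\theta \mid F \sim N\left(\mu_\theta,\; (G^F)^{-1}\right), \qquad \mu_\theta = (G^F)^{-1}\left(\Sigma_0^{-1}\theta_0 + (\Omega^F)^{-1}\left(\alpha^F + F\beta^F\right)\right)
\]
where \(G^F = \Sigma_0^{-1} + (\Omega^F)^{-1}\). The predictive distribution is \(\tilde{r} \sim N(\mu_\theta, \Sigma + (G^F)^{-1})\).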
Key Characteristic: Reduces to the classical BL model when the features reproduce the original views (\(\alpha^F + F\beta^F \to P^{-1}q\), which presupposes an invertible \(P\)).
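The SLP-BL update can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the authors' implementation: the function names, the toy dimensions (\(m = 3\), \(d = 2\)), and all numerical values are hypothetical, and \(\beta^F\) is taken to be a vector of length \(dm\) so that \(F\beta^F \in \mathbb{R}^m\).

```python
import numpy as np


def block_diag_features(features):
    """Arrange per-asset feature vectors of shape (m, d) into block-diagonal F of shape (m, d*m)."""
    m, d = features.shape
    F = np.zeros((m, d * m))
    for i in range(m):
        F[i, i * d:(i + 1) * d] = features[i]
    return F


def slp_bl_posterior(theta0, Sigma0, Sigma, alpha_F, beta_F, F, Omega_F):
    """Posterior mean/covariance of theta and predictive covariance of returns."""
    Sigma0_inv = np.linalg.inv(Sigma0)
    Omega_F_inv = np.linalg.inv(Omega_F)
    G_F = Sigma0_inv + Omega_F_inv                 # posterior precision
    feature_signal = alpha_F + F @ beta_F          # feature-implied mean return
    mu_theta = np.linalg.solve(G_F, Sigma0_inv @ theta0 + Omega_F_inv @ feature_signal)
    post_cov = np.linalg.inv(G_F)
    pred_cov = Sigma + post_cov                    # covariance of the predictive return
    return mu_theta, post_cov, pred_cov


# Toy inputs (all values hypothetical).
rng = np.random.default_rng(0)
m, d = 3, 2
features = rng.normal(size=(m, d))
F = block_diag_features(features)
alpha_F = np.full(m, 0.01)
beta_F = rng.normal(scale=0.01, size=d * m)
theta0 = np.full(m, 0.05)
Sigma = 0.04 * np.eye(m)
Sigma0 = 0.5 * Sigma
Omega_F = 0.01 * np.eye(m)

mu_theta, post_cov, pred_cov = slp_bl_posterior(theta0, Sigma0, Sigma, alpha_F, beta_F, F, Omega_F)
```

Two limits are worth checking by hand: as \(\Omega^F \to 0\) the posterior mean approaches the feature-implied signal \(\alpha^F + F\beta^F\), and as \(\Omega^F\) grows without bound it approaches the prior mean \(\theta_0\).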
FIV-BL Model¶
Introduce a \(q \leftrightarrow F\) linear model: \(q = P(\alpha + F\beta + \epsilon^F)\), and specify an Inverse-Wishart prior for the latent variable \(\Omega\), yielding a multivariate \(t\)-distribution posterior after marginalization. Numerical approximation is required.
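As a reminder of why an Inverse-Wishart prior on a covariance leads to a Student-\(t\) form, the standard conjugacy identity below may help. It is stated for a generic \(k\)-dimensional Gaussian vector \(x\), not for the paper's exact FIV-BL likelihood; the symbols \(x\), \(\mu\), \(\Psi\), and \(\nu\) are illustrative assumptions.

\[
x \mid \Omega \sim N(\mu, \Omega), \quad \Omega \sim \mathcal{IW}(\Psi, \nu)
\;\;\Longrightarrow\;\;
x \sim t_{\nu - k + 1}\!\left(\mu,\; \frac{\Psi}{\nu - k + 1}\right)
\]

Here \(t_n(\mu, S)\) denotes a multivariate \(t\)-distribution with \(n\) degrees of freedom, location \(\mu\), and scale matrix \(S\). The heavier tails relative to a Gaussian reflect the extra uncertainty from not knowing \(\Omega\).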
Hyperparameter Estimation¶
- \(\Sigma\): Sample covariance; \(\Sigma_0 = \tau \Sigma\) where \(\tau \in (0, 1]\).
- \(\Omega^F\): Constructed from a kernel density estimate with Silverman's bandwidth rule, giving \(\hat{\Omega}^F = B\tilde{H}B^\top\).
- \((\alpha^F, \beta^F)\): Maximum likelihood estimation via regression of historical returns on features.
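Two of these estimation steps can be sketched directly. The code below assumes Gaussian regression errors, in which case ordinary least squares coincides with maximum likelihood, and fits one asset at a time. The function names, the default `tau=0.5`, and the synthetic data are hypothetical. The \(\hat{\Omega}^F\) construction is omitted because \(B\) and \(\tilde{H}\) are not defined in this summary.

```python
import numpy as np


def estimate_sigma(returns, tau=0.5):
    """Sample covariance Sigma of a (T, m) return matrix and prior covariance Sigma0 = tau * Sigma."""
    if not 0.0 < tau <= 1.0:
        raise ValueError("tau must lie in (0, 1]")
    Sigma = np.cov(returns, rowvar=False)
    return Sigma, tau * Sigma


def fit_feature_regression(returns_i, features_i):
    """Least-squares fit of one asset's returns (T,) on its features (T, d).

    Returns the intercept and the d slope coefficients.
    """
    X = np.column_stack([np.ones(len(features_i)), features_i])
    coef, *_ = np.linalg.lstsq(X, returns_i, rcond=None)
    return coef[0], coef[1:]


# Synthetic check: noise-free returns generated from known coefficients are recovered.
rng = np.random.default_rng(1)
T, m, d = 200, 4, 2
returns = rng.normal(scale=0.01, size=(T, m))
Sigma, Sigma0 = estimate_sigma(returns, tau=0.5)

features_0 = rng.normal(size=(T, d))
synthetic_returns = 0.01 + features_0 @ np.array([0.5, -0.2])
alpha_hat, beta_hat = fit_feature_regression(synthetic_returns, features_0)
```

Stacking the per-asset slope vectors gives a \(\beta^F\) of length \(dm\), matching the block-diagonal layout of \(F\).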
Key Experimental Results¶
Datasets¶
| Dataset | Time Span | Number of Assets |
|---|---|---|
| SPDR Sector ETFs | 2004–2024 (20 years) | 11 sector ETFs |
| Dow Jones Index | 1994–2024 (30 years) | 41 constituents |
Main Results: SPDR Sector ETFs¶
| Model | Cumulative Return (%) | CAGR (%) | Sharpe | Max Drawdown (%) | Annualized Volatility (%) |
|---|---|---|---|---|---|
| S&P500 | 545.77 | 6.69 | 0.59 | 55.19 | 19.03 |
| MV(100d) | 411.83 | 5.84 | 0.57 | 36.37 | 16.91 |
| BL(100d) | 602.75 | 7.01 | 0.70 | 46.05 | 15.91 |
| MV(150d) | 249.11 | 4.44 | 0.45 | 47.49 | 17.37 |
| BL(150d) | 556.13 | 6.75 | 0.68 | 44.54 | 15.91 |
Main Results: Dow Jones Index¶
| Model | Cumulative Return (%) | CAGR (%) | Sharpe | Max Drawdown (%) | Annualized Volatility (%) |
|---|---|---|---|---|---|
| DJIA | 932.51 | 5.49 | 0.52 | 53.78 | 17.97 |
| MV(120d) | 1577.61 | 6.67 | 0.57 | 46.73 | 20.07 |
| BL(120d) | 4819.83 | 9.33 | 0.87 | 39.81 | 16.42 |
| MV(100d) | 1529.60 | 6.60 | 0.55 | 56.06 | 20.59 |
| BL(100d) | 4557.03 | 9.19 | 0.85 | 39.92 | 16.56 |
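For readers who want to compute comparable columns on their own backtests, the sketch below uses common textbook definitions of the reported metrics. The paper's exact conventions (compounding, risk-free rate, annualization factor) are not stated in this summary, so figures produced this way need not match the tables; the function name and defaults are assumptions.

```python
import numpy as np


def performance_metrics(returns, periods_per_year=252, risk_free=0.0):
    """Common backtest metrics from a series of simple per-period returns."""
    returns = np.asarray(returns, dtype=float)
    # Wealth path starting at 1.0 so that a loss in the first period counts as drawdown.
    wealth = np.concatenate([[1.0], np.cumprod(1.0 + returns)])
    years = len(returns) / periods_per_year
    cumulative = wealth[-1] - 1.0
    cagr = wealth[-1] ** (1.0 / years) - 1.0
    volatility = returns.std(ddof=1) * np.sqrt(periods_per_year)
    sharpe = (returns.mean() * periods_per_year - risk_free) / volatility
    running_max = np.maximum.accumulate(wealth)
    max_drawdown = np.max(1.0 - wealth / running_max)
    return {
        "cumulative_return": cumulative,
        "cagr": cagr,
        "sharpe": sharpe,
        "max_drawdown": max_drawdown,
        "annualized_volatility": volatility,
    }


# Three annual returns: +10%, -50%, +20%.
metrics = performance_metrics([0.10, -0.50, 0.20], periods_per_year=1)
```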
Key Findings¶
- Sharpe Ratio: SLP-BL delivers an average improvement of 49.8% over Markowitz (0.66–0.70 vs 0.35–0.57 on ETFs; 0.78–0.87 vs 0.45–0.62 on DJIA).
- Volatility: SLP-BL (the BL rows in the tables) exhibits lower annualized volatility than the corresponding MV model across all window lengths, by approximately 1–4 percentage points.
- Turnover Rate: Reduced by approximately 55.1%, attributed to the more stable portfolio weights under the Bayesian framework.
- Robustness to Hyperparameters: The BL model consistently outperforms MV across all 5 window lengths (50/80/100/120/150 days).
- Cumulative Return: On the DJIA dataset, BL(120d) achieves a cumulative return of 4819.83% vs 1577.61% for MV(120d).
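The percentage figures in these findings are relative changes, and turnover is usually measured from successive weight vectors. The sketch below uses one common turnover definition (mean absolute weight change per rebalance); the paper's exact definition is not given in this summary, and both helper names are hypothetical.

```python
import numpy as np


def average_turnover(weights):
    """Mean over rebalances of sum_i |w_{t,i} - w_{t-1,i}| for a (T, m) weight history."""
    weights = np.asarray(weights, dtype=float)
    per_rebalance = np.abs(np.diff(weights, axis=0)).sum(axis=1)
    return per_rebalance.mean()


def relative_change(new, old):
    """Relative change of `new` with respect to `old`, e.g. 0.5 means +50%."""
    return (new - old) / old


# Two-asset weight history with one rebalance of 10 points and one with no change.
turnover = average_turnover([[0.5, 0.5], [0.6, 0.4], [0.6, 0.4]])

# Sharpe ratios of BL(120d) and MV(120d) from the Dow Jones table.
sharpe_gain = relative_change(0.87, 0.57)
```

For the single DJIA 120-day pair the relative Sharpe gain is about 53%; the 49.8% headline figure is an average over configurations.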
Highlights & Insights¶
- Elegant Unified Framework: Integrates feature integration and parameter inference into a single Bayesian network, avoiding error propagation in multi-stage pipelines. The closed-form solution ensures fast and stable inference.
- Theoretical Degeneration Properties: Demonstrates that both classical BL and Markowitz models are special cases of the proposed model, and that true returns can be recovered under the limit of perfect information.
- Complementary Design of Two Configurations: SLP-BL handles asset-specific features, whereas FIV-BL handles macro/non-asset-specific features (e.g., interest rates, CPI). They can be combined in practice.
- Complete Elimination of Subjective Inputs: First to achieve end-to-end inference from feature data within the BL framework without requiring manually specified views.
Limitations & Future Work¶
- Limited Feature Selection: Utilizes only 9 generic technical indicators based on price/volume, without incorporating fundamental, macroeconomic, or alternative data.
- No Experiments on FIV-BL: Experiments in the paper only validate SLP-BL; FIV-BL requires numerical approximation methods (e.g., MCMC), and its computational cost and practicality are not discussed.
- Linear Assumption: The relationship between features and parameters is assumed to be linear, which may fail to capture non-linear market dynamics.
- Lack of Transaction Costs: Backtesting does not account for real-world frictions such as transaction costs and slippage.
- Monthly Rebalancing Only: High-frequency or dynamic rebalancing strategies are not explored.
- Weak Baselines: No comparison with modern quantitative finance methods such as deep learning or reinforcement learning.
Related Work & Insights¶
- Black & Litterman (1992): The classical BL model, which is the direct target for improvement in this work.
- Kolm & Ritter (2017, 2021): The BLB model, which reformulates BL into a Bayesian inference framework, though it still relies on subjective views.
- Beach & Orlov (2007); Kara et al. (2019): Representative works using external models (e.g., GARCH, SVM) to generate views.
- Markowitz (1952): Classical mean-variance optimization, the baseline in this paper.
- Insight: The concept of "converting heuristic inputs into latent variables" can be generalized to other Bayesian models requiring expert knowledge.
Rating¶
- Novelty: ⭐⭐⭐⭐ (Treating BL views as latent variables is a natural yet under-explored direction)
- Experimental Thoroughness: ⭐⭐⭐ (Long-term real-world data is used, but modern baselines and FIV-BL experiments are lacking)
- Writing Quality: ⭐⭐⭐⭐ (Theoretical derivations are clear, but the notation is dense, and the treatment of the FIV-BL approximation is somewhat rushed)
- Value: ⭐⭐⭐⭐ (Directly relevant to quantitative finance practice, with solid theoretical contributions)