Metacognitive Sensitivity for Test-Time Dynamic Model Selection

Conference: NeurIPS 2025 (CogInterp Workshop)
arXiv: 2512.10451
Code: To be confirmed
Area: Multimodal VLM / Model Selection / AI Metacognition
Keywords: metacognition, meta-d', dynamic model selection, contextual bandit, signal detection theory

TL;DR

Inspired by the concept of metacognitive sensitivity (meta-d') from cognitive science, this paper proposes a test-time dynamic model selection framework. It quantifies a model's ability to "know what it doesn't know" via meta-d', combines this with instantaneous confidence scores to form a context vector, and employs a contextual bandit to select the optimal model online, outperforming the individual models across multiple datasets.

Background & Motivation

Background: Deep learning has become increasingly specialized—CNNs excel at visual perception, Transformers/LLMs dominate NLP, and VLMs bridge cross-modal tasks. The No Free Lunch theorem dictates that no single architecture is optimal for all problems, motivating the need for dynamic model selection.

Limitations of Prior Work: Probabilistic confidence scores produced by models are often severely miscalibrated—i.e., confidence does not align with actual accuracy—making confidence-based selection unreliable.

Cognitive Science Inspiration: Human metacognition research offers mature mathematical tools for assessing "an agent's ability to evaluate its own knowledge." Among these, meta-d' is a signal detection theory-based metric that measures metacognitive sensitivity while remaining decoupled from task performance and confidence bias.

Core Idea: This work elevates meta-d' from a diagnostic tool to a functional signal, embedding it within a bandit selection framework to enable adaptive model selection at test time.

Method

Problem Formulation

Given a pair of pretrained models \(M = \{M_A, M_B\}\) and an image sequence \(D = \{x_1, \ldots, x_N\}\), the goal is to learn a selection policy \(\pi\) that, for each input \(x_t\), chooses the model most likely to produce a correct prediction: \[\max_{\pi}\sum_{t=1}^{N} R_t = \max_{\pi}\sum_{t=1}^{N} \mathbb{I}(\hat{y}_{a_t,t} = y_t)\]

Framework Core: Dual-Signal Context + Bandit Selection

Context vector (4-dimensional): \[s_t = [c_{A,t},\; \mu_{A,t},\; c_{B,t},\; \mu_{B,t}]\]

  • Short-term signal \(c_{k,t}\): instantaneous confidence (softmax maximum) of model \(M_k\) on the current sample \(x_t\)
  • Mid-term trait \(\mu_{k,t}\): metacognitive sensitivity (meta-d') of model \(M_k\), a stable trait reflecting how well its recent confidence predicts its accuracy
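A minimal sketch of assembling this context vector, assuming each model exposes raw logits and a current meta-d' estimate (function and argument names are ours, not from the paper):

```python
import numpy as np

def context_vector(logits_a, logits_b, meta_d_a, meta_d_b):
    """Build the 4-dim context s_t = [c_A, mu_A, c_B, mu_B].

    c_k is the softmax maximum (instantaneous confidence) of model k
    on the current input; mu_k is its current meta-d' estimate.
    """
    def softmax_max(logits):
        z = np.exp(np.asarray(logits) - np.max(logits))  # stabilized softmax
        return float(np.max(z / z.sum()))

    return np.array([softmax_max(logits_a), meta_d_a,
                     softmax_max(logits_b), meta_d_b])

# Example: model A is confident, model B is near-uniform
s_t = context_vector([2.0, 0.5, -1.0], [0.3, 0.2, 0.1], 1.4, 0.9)
```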

Meta-d' Computation:

  • Based on the hierarchical Bayesian framework of Fleming & Daw, computed by fitting the confidence distributions over correct and incorrect trials
  • Advantage: independent of task performance (d') and overall confidence bias, measuring metacognitive sensitivity alone
  • The authors developed a GPU-parallelized package to accelerate the computation
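The full hierarchical Bayesian fit is beyond a short snippet, but the quantity it estimates can be approximated by the type-2 AUROC: how well confidence discriminates correct from incorrect trials. The sketch below computes that proxy; it tracks meta-d' but is not the paper's estimator:

```python
import numpy as np

def type2_auroc(confidence, correct):
    """Proxy for metacognitive sensitivity: P(conf on a correct trial
    > conf on an incorrect trial), with ties counted as 0.5.

    The paper fits meta-d' via Fleming & Daw's hierarchical Bayesian
    model; this simpler statistic only approximates that quantity.
    """
    confidence = np.asarray(confidence, dtype=float)
    correct = np.asarray(correct, dtype=bool)
    pos = confidence[correct]      # confidences on correct trials
    neg = confidence[~correct]     # confidences on incorrect trials
    if len(pos) == 0 or len(neg) == 0:
        return 0.5                 # undefined: fall back to chance level
    diff = pos[:, None] - neg[None, :]
    return float((np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size)
```

A value of 1.0 means confidence perfectly separates correct from incorrect trials; 0.5 means confidence carries no information about accuracy.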

Dynamic Update Mechanism:

  1. Burn-in phase: the first \(B=100\) trials collect (confidence, reward) data from all models to compute the initial \(\mu_{k,0}\)
  2. Sliding-window update: every \(F=50\) trials, meta-d' is recomputed using the most recent \(W=100\) trials
  3. This enables the framework to adapt to non-stationary changes in model performance (e.g., distribution shift)
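The update schedule above can be sketched as a small helper that, given the current trial index, says whether to refit meta-d' and on which window (the function name is ours):

```python
def metad_update_schedule(t, B=100, F=50, W=100):
    """Return the (start, end) trial window to refit meta-d' on at
    trial t, or None if no refit is due.

    Mirrors the paper's schedule: burn-in for the first B trials, an
    initial fit at t = B, then a refit every F trials on the most
    recent W trials.
    """
    if t < B:
        return None            # still collecting burn-in data
    if t == B:
        return (0, B)          # initial fit on the burn-in window
    if (t - B) % F == 0:
        return (t - W, t)      # sliding-window refit
    return None
```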

Bandit Algorithms:

  • LinUCB: \(\pi_t(s_t, k) = \hat{\theta}_k^\top s_t + \alpha\sqrt{s_t^\top A_k^{-1} s_t}\), selecting \(a_t = \arg\max_k \pi_t(s_t, k)\)
  • LinTS: samples \(\tilde{\theta}_k \sim \mathcal{N}(\hat{\theta}_k, \sigma^2 A_k^{-1})\), selecting \(a_t = \arg\max_k \tilde{\theta}_k^\top s_t\)
  • At each step, the reward \(R_t = \mathbb{I}(\hat{y}_{a_t,t} = y_t)\) is observed and used to update \(A_k\) and \(b_k\)
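A minimal disjoint-LinUCB sketch matching the score above; the class and hyperparameter names are ours, and the paper's exact initialization may differ:

```python
import numpy as np

class LinUCB:
    """Disjoint LinUCB over K arms (models) with d-dim contexts.

    Score per arm: theta_k^T s + alpha * sqrt(s^T A_k^{-1} s).
    """
    def __init__(self, n_arms, dim, alpha=1.0):
        self.alpha = alpha
        self.A = [np.eye(dim) for _ in range(n_arms)]    # A_k = I + sum s s^T
        self.b = [np.zeros(dim) for _ in range(n_arms)]  # b_k = sum R_t s

    def select(self, s):
        scores = []
        for A_k, b_k in zip(self.A, self.b):
            A_inv = np.linalg.inv(A_k)
            theta = A_inv @ b_k                          # ridge estimate
            scores.append(theta @ s + self.alpha * np.sqrt(s @ A_inv @ s))
        return int(np.argmax(scores))

    def update(self, arm, s, reward):
        self.A[arm] += np.outer(s, s)
        self.b[arm] += reward * s
```

After observing the 0/1 correctness reward for the chosen model, `update` refines that arm's linear estimate, so the policy learns which regions of the (confidence, meta-d') context space favor each model.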

Key Experimental Results

CNN Model Pairs on CIFAR-10

| Model Pair | 300 trials | 700 trials | 1000 trials |
| --- | --- | --- | --- |
| AlexNet-ViT | 62.4 → 69.5 (+7.1%) | 64.8 → 66.2 (+1.4%) | 62.4 → 65.9 (+3.5%) |
| EfficientNet-ViT | 67.7 → 75.9 (+8.2%) | 66.4 → 68.0 (+1.6%) | 66.4 → 67.8 (+1.4%) |
| AlexNet-GoogleNet | 62.7 → 70.6 (+7.9%) | 57.7 → 57.5 (-0.2%) | 56.8 → 58.4 (+1.6%) |
| EfficientNet-GoogleNet | 54.8 → 59.0 (+4.8%) | 53.6 → 55.8 (+2.2%) | 54.8 → 57.3 (+2.5%) |

VLM Model Pairs on CIFAR-10 + PACS (Domain Shift Setting)

| Model Pair | 1500 trials | 2500 trials | 4000 trials |
| --- | --- | --- | --- |
| MetaCLIP-SigLIP | 98.7 → 99.0 (+0.3%) | 98.7 → 98.6 (0.0%) | 98.4 → 98.5 (+0.1%) |
| CLIP-ALIGN | 94.2 → 96.0 (+1.8%) | 94.8 → 96.2 (+1.6%) | 94.8 → 95.8 (+1.0%) |

Key Findings

  • Gains are most pronounced in early trials (+4.8% ~ +8.2%), stabilizing at +1.4% ~ +3.5% as the bandit converges
  • Heterogeneous architecture pairs (CNN + Transformer) yield greater benefits than homogeneous pairs, as inductive bias diversity reduces correlated errors
  • When AlexNet's meta-d' decreases, the bandit automatically shifts toward GoogleNet, demonstrating adaptive capability
  • Gains from VLM pairs are modest (+0.1% ~ +1.8%), as VLMs are already highly accurate individually

Highlights & Insights

  • ⭐⭐⭐⭐ Interdisciplinary Innovation: Operationalizing the cognitive science concept of metacognition (meta-d') as a functional component in ML systems is conceptually novel
  • ⭐⭐⭐ Dual Timescale Modeling: The separation of short-term confidence and mid-term metacognitive sensitivity is insightful
  • ⭐⭐⭐ Adaptability: Sliding window updates enable the framework to handle non-stationary scenarios
  • ⭐⭐⭐ Lightweight and Practical: Requires no additional training, relying solely on existing model outputs

Limitations & Future Work

  1. Validation is limited to image classification; extension to more complex tasks such as generation and retrieval has not been explored
  2. The framework is currently restricted to selection between two models; scaling to a larger model pool (>2) poses open challenges in computation and policy design
  3. Meta-d' computation requires a window of 100 trials, which may lack flexibility for small-batch real-time deployment scenarios
  4. Limited gains in the VLM setting suggest diminishing marginal returns when individual models are already sufficiently strong
  5. As a workshop paper, the experimental scale and depth of analysis leave room for further development

Rating

⭐⭐⭐ An interesting interdisciplinary workshop paper that introduces the cognitive science concept of metacognition into dynamic model selection with a novel formulation. Meta-d' as a quantitative measure of a model's "self-awareness" carries unique value, yet the limited experimental scale and task diversity mean further validation is needed before practical deployment.

| Method Category | Representative Method | Adaptive | Utilizes Metacognition | Computational Cost |
| --- | --- | --- | --- | --- |
| Static Ensemble | Majority vote / averaging | ✗ | ✗ | High (all models run) |
| Dynamic Ensemble Selection | Local accuracy | ✓ | ✗ | Medium |
| MoE | Gating network | ✓ | ✗ | High (end-to-end training) |
| Ours | meta-d' + Bandit | ✓ | ✓ | Low (no additional training) |