Partially Shared Concept Bottleneck Models¶
Conference: AAAI 2026 | arXiv: 2511.22170 | Code: github.com/7494zdl/PS-CBM | Area: Interpretability | Keywords: Concept Bottleneck Models, Interpretability, Vision-Language Models, Concept Efficiency, Image Classification
TL;DR¶
This paper proposes PS-CBM, a framework that integrates multimodal concept generation (combining LLM semantics with visual cues from exemplar images), a partially shared concept strategy (merging concepts based on activation patterns), and a Concept-Efficient Accuracy (CEA) evaluation metric. PS-CBM achieves higher classification accuracy and interpretability with fewer concepts across 11 datasets.
Background & Motivation¶
Problem Definition¶
Concept Bottleneck Models (CBMs) insert a layer of human-interpretable concepts between inputs and predictions to enhance model interpretability. While automated concept generation using LLMs and VLMs has reduced the burden of manual annotation, three fundamental challenges remain.
Three Core Challenges¶
1. Poor Visual Grounding¶
- Concepts generated by LLMs are semantically rich but often misaligned with actual visual content.
- VLM-based methods improve visual fidelity but sacrifice class-level semantic consistency at high computational cost.
- A persistent semantic–visual gap undermines both accuracy and interpretability.
2. Concept Redundancy¶
- Independent pool strategy: Concepts are generated independently per class → semantic duplication, with similar concepts redundantly assigned to multiple classes.
- Globally shared strategy: Unified deduplication → reduces redundancy but forces unrelated classes to share a fixed concept pool, harming class discriminability.
- Both strategies compromise model clarity and training stability.
3. Inadequate Metrics¶
- Most CBMs are evaluated solely on classification accuracy, ignoring the interpretability cost of large, redundant concept sets.
- No principled metric exists to capture the trade-off between accuracy and concept efficiency.
- Performance gains may come at the expense of usability.
Paper Goals¶
To design a unified framework that simultaneously addresses all three challenges — introducing a Partially Shared concept strategy that finds an optimal balance between the independent and globally shared extremes.
Method¶
Overall Architecture¶
PS-CBM consists of three stages:
1. Multimodal Concept Generation: generate a concept set by combining LLM semantics with exemplar images.
2. Partially Shared Concept Strategy: merge concepts based on activation patterns and assign them across classes.
3. CBM Training: learn a transparent prediction model via concept supervision.
Key Designs¶
1. Multimodal Concept Generation¶
Bridging the gap between LLM semantics and visual grounding:
- **Few-shot image selection**: For each class \(i\), a diverse few-shot exemplar set \(\bm{X}_i \subset \mathcal{X}_i\) is constructed using CLIP embeddings.
  - Initialized from random images; new samples are iteratively selected to maximize cosine distance from those already selected (see the sketch after this list).
  - Plain random sampling is used instead for noisy datasets (e.g., Food101).
- **Concept generation**: For each class, a prompt combining textual descriptions with the selected exemplars is used to query GPT-4o twice to reduce randomness.
  - After deduplication, a candidate concept set \(\mathcal{S} = \bigcup_{i=1}^{l} \mathcal{S}_i\) is obtained.
  - Each concept \(\bm{c}_j\) is associated with a class set \(\mathcal{C}_j\).
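A minimal sketch of the diversity-driven exemplar selection, assuming precomputed, L2-normalized CLIP image embeddings for a single class; the single-random-image initialization and all names are assumptions about details the summary leaves open:

```python
import numpy as np

def select_exemplars(embeddings: np.ndarray, num_shots: int, seed: int = 0) -> list[int]:
    """Greedy farthest-point selection on L2-normalized CLIP embeddings.

    Starts from a random image and repeatedly adds the image whose maximum
    cosine similarity to the already-selected set is smallest (i.e., whose
    cosine distance is largest), encouraging a visually diverse exemplar set.
    """
    rng = np.random.default_rng(seed)
    n = embeddings.shape[0]
    selected = [int(rng.integers(n))]
    while len(selected) < min(num_shots, n):
        sims = embeddings @ embeddings[selected].T   # (n, |selected|) cosine sims
        nearest = sims.max(axis=1)                   # closeness to the selected set
        nearest[selected] = np.inf                   # never re-pick a selected image
        selected.append(int(nearest.argmin()))       # farthest remaining point
    return selected
```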
2. Partially Shared Concept Strategy (Three-Step Refinement)¶
The core contribution, progressively refining the concept set in three steps:
**Step 1: Concept Filtering**
- Compute an image–concept affinity matrix \(\bm{A}_{i,j} = \cos(\Phi(\bm{x}_i), \Psi(\bm{c}_j))\), where \(\Phi\) is the image encoder and \(\Psi\) is the text encoder.
- A concept \(\bm{c}_j\) is retained if the average of its top-4 affinities with images of its associated classes exceeds a confidence threshold \(\tau_{\text{conf}}\) (see the sketch below).
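A sketch of the filtering rule, assuming a precomputed affinity matrix `A` of shape (images, concepts), per-image labels `y`, and each concept's associated class set; names and minor details are illustrative:

```python
import numpy as np

def filter_concepts(A, y, concept_classes, tau_conf, top=4):
    """Keep concept j if the mean of its `top` highest affinities over
    images belonging to its associated classes exceeds tau_conf."""
    keep = []
    for j, classes in enumerate(concept_classes):
        mask = np.isin(y, list(classes))        # images of concept j's classes
        scores = np.sort(A[mask, j])[-top:]     # top-4 affinities (ascending sort)
        if scores.size and scores.mean() > tau_conf:
            keep.append(j)
    return keep
```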
**Step 2: Concept Merging**
- Compute a correlation matrix \(\bm{Q}\) over the filtered concepts: \(\bm{Q}_{i,j} = \frac{\bm{A}_{:,i}^\top \bm{A}_{:,j}}{\|\bm{A}_{:,i}\| \, \|\bm{A}_{:,j}\|}\).
- Greedy merging: select the concept with the largest number of mergeable candidates as the representative, and merge into it all concepts whose correlation exceeds the threshold \(\tau_{\text{merge}}\) (see the sketch below).
- A merged concept inherits the union of the original concepts' class sets.
- Each class retains at most its top-\(K\) exclusive concepts (concepts associated with only one class).
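A sketch of the greedy merge on activation patterns; tie-breaking and the per-class top-\(K\) exclusive-concept cap are omitted, and the loop structure is one assumed reading of "largest number of mergeable candidates":

```python
import numpy as np

def merge_concepts(A, concept_classes, tau_merge):
    """Greedily merge concepts whose activation-pattern correlation exceeds tau_merge."""
    # Cosine similarity between concept activation columns (the matrix Q).
    norm = A / (np.linalg.norm(A, axis=0, keepdims=True) + 1e-8)
    Q = norm.T @ norm
    mergeable = Q > tau_merge
    np.fill_diagonal(mergeable, False)
    alive = set(range(A.shape[1]))
    merged = []
    while alive:
        # Representative: the live concept with the most live mergeable partners.
        rep = max(alive, key=lambda j: mergeable[j, list(alive)].sum())
        group = {rep} | {j for j in alive if mergeable[rep, j]}
        # The merged concept inherits the union of its members' class sets.
        classes = set().union(*(concept_classes[j] for j in group))
        merged.append((rep, classes))
        alive -= group
    return merged
```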
**Step 3: Concept Labeling**
- Concept labels are binary (multi-hot rather than one-hot): \(s_{i,j} = 1\) if and only if \(y_i \in \mathcal{C}_j\) and \(\bm{A}_{i,j} > \tau_{\text{conf}}\) (sketch below).
- This yields a concept-annotated dataset \(\mathcal{D}' = \{(\bm{x}_i, \bm{s}_i, y_i)\}_{i=1}^n\).
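The labeling rule reduces to a few lines given the affinity matrix; a minimal sketch, where `class_sets[j]` stands for \(\mathcal{C}_j\) and all names are illustrative:

```python
import numpy as np

def label_concepts(A, y, class_sets, tau_conf):
    """s[i, j] = 1 iff image i's class belongs to concept j's class set
    and the affinity A[i, j] clears the confidence threshold."""
    n, m = A.shape
    in_class = np.array([[y[i] in class_sets[j] for j in range(m)] for i in range(n)])
    return (in_class & (A > tau_conf)).astype(np.float32)
```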
3. CEA Metric (Concept-Efficient Accuracy)¶
A principled, information-theoretic metric that discounts classification accuracy by a logarithmic penalty on the size of the concept set, defined in terms of:
- \(k = \lceil \log_2 l \rceil\): the theoretical minimum number of binary concepts (bits) required to distinguish \(l\) classes, e.g., \(k = 7\) for \(l = 100\).
- \(m\): the number of concepts used.
- \(\beta \geq 0\): a temperature parameter (smaller values emphasize accuracy; larger values emphasize compactness).
Three desirable properties:
- Optimal efficiency: CEA → 1 as ACC → 1 and \(m \to k\).
- Adaptive scaling: logarithmic scaling with base \(k\) adaptively adjusts the penalty to task complexity.
- Theoretical grounding: aligned with Shannon information theory.
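As a reading aid, here is a minimal sketch of a CEA-style score consistent with the three properties above: it equals ACC when \(m = k\), is penalized logarithmically (base \(k\)) as \(m\) grows, and is modulated by \(\beta\). The multiplicative form, function name, and default \(\beta\) are assumptions, not the paper's exact definition:

```python
import math

def cea(acc: float, m: int, num_classes: int, beta: float = 0.15) -> float:
    """Concept-Efficient Accuracy-style score (assumed functional form).

    Discounts accuracy by a penalty that is zero at the information-theoretic
    minimum m = k = ceil(log2(l)) and grows logarithmically (base k) with m.
    """
    k = math.ceil(math.log2(num_classes))   # bits needed to separate l classes
    penalty = beta * math.log(m / k, k)     # log base k; zero when m == k
    return acc / (1.0 + max(penalty, 0.0))  # -> ACC as m -> k; bounded by 1

# Usage: cea(0.783, m=545, num_classes=100) ≈ 0.586
```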
Loss & Training¶
Concept Bottleneck Layer (CBL) Training:
- The backbone encoder \(\bm{\phi}\) is frozen; a projection layer \(\bm{g}: \mathbb{R}^d \to \mathbb{R}^{\hat{m}}\) is trained.
- Loss function: binary cross-entropy

\[
\min_{\bm{g}} \mathcal{L}_{\text{CBL}} = \frac{1}{n}\sum_{i=1}^n \text{BCE}\big(\bm{g}(\bm{\phi}(\bm{x}_i)), \bm{s}_i\big)
\]
Final Classification Layer (FCL) Training:
- A sparse linear classifier \(\bm{f}: \mathbb{R}^{\hat{m}} \to \mathbb{R}^l\).
- Loss function: cross-entropy plus elastic-net regularization

\[
\min_{\bm{f}} \mathcal{L}_{\text{FCL}} = \frac{1}{n}\sum_{i=1}^n \text{CE}\big(\bm{f}(\hat{\bm{g}}(\bm{x}_i)), y_i\big) + \lambda R_\alpha(\bm{W}_F)
\]

- Optimized using the GLM-SAGA optimizer.
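A minimal PyTorch sketch of the two training objectives, assuming precomputed frozen-backbone features; the dimensions, hyperparameter values, and the explicit elastic-net term are placeholders (the paper optimizes the FCL with GLM-SAGA rather than a plain penalized loss):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Placeholder dimensions: d = frozen-backbone feature dim, m_hat = number of
# concepts after refinement, l = number of classes. Hyperparameters are assumed.
d, m_hat, l = 1024, 545, 100
lam, alpha = 1e-4, 0.99  # elastic-net strength and L1/L2 mixing weight

cbl = nn.Linear(d, m_hat)   # concept bottleneck layer g
fcl = nn.Linear(m_hat, l)   # final sparse linear classifier f

def cbl_loss(features: torch.Tensor, concept_labels: torch.Tensor) -> torch.Tensor:
    # Stage 1: fit concept logits to the binary annotations s_i (backbone frozen).
    return F.binary_cross_entropy_with_logits(cbl(features), concept_labels)

def fcl_loss(features: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    # Stage 2: cross-entropy on class labels plus an explicit elastic-net
    # penalty R_alpha on the classifier weights W_F. Writing the penalty out
    # directly is only meant to make the objective concrete; the paper uses
    # the GLM-SAGA solver for this stage.
    logits = fcl(cbl(features).detach())  # concept layer is fixed in stage 2
    W = fcl.weight
    r_alpha = alpha * W.abs().sum() + 0.5 * (1.0 - alpha) * (W ** 2).sum()
    return F.cross_entropy(logits, targets) + lam * r_alpha
```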
Key Experimental Results¶
Main Results (11 Datasets, CLIP_RN50 Backbone)¶
| Method | Avg. ACC (%) ↑ | Avg. # Concepts ↓ | Avg. CEA (%) ↑ |
|---|---|---|---|
| LaBo | 72.8 | 7,900 | 51.6 |
| LF-CBM | 72.9 | 718 | 55.2 |
| LM4CV | 73.4 | 873 | 56.4 |
| DN-CBM | 77.3 | 8,192 | 53.4 |
| Res-CBM | 71.8 | 291 | 56.7 |
| VLG-CBM | 75.2 | 732 | 57.0 |
| V2C-CBM | 72.8 | 7,500 | 51.2 |
| DCBM | 70.9 | 2,048 | 49.5 |
| PS-CBM | 78.3 | 545 | 59.0 |
PS-CBM surpasses prior SOTA by 1.0%–7.4% in average ACC and 2.0%–9.5% in CEA, while using only 545 concepts — 7,647 fewer than DN-CBM.
Ablation Study¶
| Concept Strategy | ACC ↑ | CEA ↑ | Notes |
|---|---|---|---|
| Independent | Lower | Lower | High concept redundancy |
| Globally Shared | Medium | Medium | Unrelated classes forced to share concepts |
| Partially Shared | Highest | Highest | Selective sharing balances specificity and compactness |
Sensitivity to the confidence threshold \(\tau_{\text{conf}}\):

| Confidence Threshold \(\tau_{\text{conf}}\) | Avg. ACC (%) | # Concepts | CEA (%) |
|---|---|---|---|
| 0.10 | 76.20 | 548 | 57.41 |
| 0.15 | 76.14 | 548 | 57.36 |
| 0.20 | 78.35 | 545 | 59.02 |
| 0.25 | 72.71 | 458 | 55.16 |
| 0.30 | 57.55 | 145 | 46.84 |
Analysis of the number of exclusive concepts \(K\): ACC increases sharply from \(K=0\) to \(K=1\) and stabilizes at \(K \geq 2\); CEA peaks at \(K=1\) and decreases thereafter.
CLIP Score Comparison (Domain-Specific Datasets)¶
| Method | DTD | Resisc45 | UCF101 | Concept Generation | Concept Pool |
|---|---|---|---|---|---|
| LaBo | 0.227 | 0.222 | 0.230 | Language | Independent |
| DN-CBM | 0.192 | 0.187 | 0.187 | Vision | Globally Shared |
| V2C-CBM | 0.246 | 0.216 | 0.247 | Vision | Independent |
| PS-CBM | 0.249 | 0.255 | 0.265 | Language+Vision | Partially Shared |
Key Findings¶
- Decoupling accuracy from concept count: PS-CBM achieves the highest accuracy (78.3%) with the fewest concepts (545), demonstrating that concept quality matters more than quantity.
- Partial sharing is optimal: The partially shared strategy finds the best balance between the independent and globally shared extremes — selectively sharing semantically similar concepts reduces redundancy while preserving discriminability.
- \(K=1\) is most efficient: A single exclusive concept per class suffices to achieve the best CEA; additional class-specific concepts reduce efficiency.
- Advantage of multimodal concept generation: Combining LLM semantics with visual exemplars yields significantly higher CLIP Scores on domain-specific datasets compared to unimodal methods.
- Semantic consistency in concept–class mapping: Visualizations on CIFAR10 confirm that shared concepts correctly reflect semantic relationships (e.g., "hooves" and "long neck" are shared between deer and horse).
Highlights & Insights¶
- Elegant design of partial sharing: The greedy merging algorithm based on activation patterns is concise and effective — selecting the maximally mergeable set, inheriting the union of class sets, and capping the number of exclusive concepts.
- Theoretical grounding of CEA: The metric is derived from Shannon information theory with a well-defined upper bound (→1) and lower bound, offering greater theoretical elegance than existing metrics (CUE, NEC).
- Experimental breadth: Evaluation across 11 datasets spanning general, fine-grained, and domain-specific tasks ensures strong generalizability of the conclusions.
- Open-source release: Complete reproduction code and configurations are publicly available.
Limitations & Future Work¶
- Dependence on CLIP encoder: The quality of concept filtering and labeling is bounded by CLIP's alignment capability.
- Cost of GPT-4o: Concept generation requires querying GPT-4o, which incurs non-trivial costs at scale.
- Hyperparameter sensitivity: The choices of \(\tau_{\text{conf}}\) and \(\tau_{\text{merge}}\) substantially affect performance and require careful tuning.
- ImageNet subsampling: Due to dataset scale, only 10% of training images are used during merging, potentially affecting concept quality.
- Static concept set: The current framework does not support dynamic concept updates based on feedback.
Related Work & Insights¶
- vs. LF-CBM: LF-CBM employs a globally shared pool but suffers from concept redundancy; PS-CBM's partially shared strategy achieves a better balance between compactness and discriminability.
- vs. DN-CBM: DN-CBM discovers visual concepts via sparse autoencoders and achieves high accuracy but requires 8,192 concepts; PS-CBM surpasses it with only 545.
- CEA vs. CUE: CUE lacks a clear upper bound and is sensitive to text format; CEA is post hoc, task-adaptive, and training-agnostic.
Rating¶
- Novelty: ⭐⭐⭐⭐ — The partially shared strategy and CEA metric are well-motivated but not paradigm-shifting innovations.
- Experimental Thoroughness: ⭐⭐⭐⭐⭐ — 11 datasets, 8 baselines, and multi-dimensional ablations; highly comprehensive.
- Writing Quality: ⭐⭐⭐⭐⭐ — Clear structure, rich tables, excellent visualizations; the comparison matrix in Table 1 is particularly intuitive.
- Value: ⭐⭐⭐⭐ — Makes a substantive contribution to the CBM field; the CEA metric has strong potential for broad adoption.