MAVias: Mitigate Any Visual Bias

Conference: ICCV 2025 · arXiv: 2412.06632 · Code: https://github.com/gsarridis/VB-Mitigator (VB-Mitigator library) · Area: Multimodal VLM / Bias Mitigation · Keywords: Visual Bias Mitigation, Open-Set Bias, Foundation Models, Vision-Language Models, Fairness

TL;DR

This paper proposes MAVias, an open-set visual bias mitigation framework that extracts visual attribute tags from images using a tagging foundation model, employs an LLM to flag tags irrelevant to the target class as potential biases, encodes the identified biases via vision-language embeddings, and incorporates them into training to learn bias-invariant representations. MAVias substantially outperforms existing methods on CelebA, Waterbirds, UrbanCars, and ImageNet9.

Background & Motivation

Background: Deep learning models are prone to learning spurious correlations present in training data — for example, waterbirds consistently appearing against aquatic backgrounds, or blonde hair being predominantly associated with female subjects. Existing bias mitigation methods fall into two categories: Bias-Aware (BA) methods, which require annotated bias attributes, and Bias-Unaware (BU) methods, which derive pseudo-labels by training a bias proxy model.

Limitations of Prior Work: - BA methods rely on predefined, known bias labels and thus cannot scale to large-scale general-purpose datasets (e.g., ImageNet), where biases are diverse and unknown. - BU methods are effective only when bias is extremely salient (sufficient to train a proxy model) and cannot handle multi-attribute or unknown biases. - Neither category generalizes to open-set scenarios, where bias types are unknown in advance and their quantity is indeterminate.

Key Challenge: In real-world settings, bias operates at the instance level — each image may exhibit a distinct combination of task-irrelevant attributes — whereas existing methods are designed for dataset-level, single or few known biases.

Goal: To automatically discover and mitigate an arbitrary number and type of visual biases in images without any predefined bias specification.

Key Insight: The paper leverages the complementary capabilities of foundation models — an image tagging model, an LLM, and a vision-language model — to automatically extract visual attributes, assess their relevance to the target class, and encode irrelevant attributes as bias signals for training.

Core Idea: Foundation models are used to automatically discover instance-level open-set visual biases; the biases are encoded as vision-language embeddings and integrated into training via logit fusion to achieve bias-invariant learning.

Method

Overall Architecture

MAVias consists of two stages: (1) Bias Modeling: for each training image, descriptive tags are extracted → an LLM filters out irrelevant tags → a vision-language model encodes the resulting biases; (2) Bias Mitigation Training: the main model extracts image features and computes main logits; a projection layer maps bias embeddings into the same feature space to produce bias logits; the two are summed to form the final prediction, and gradient modulation causes the model to disregard bias features.
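The bias-modeling stage above can be sketched as a small function. This is a toy illustration, not the paper's implementation: `is_relevant` and `encode_text` are hypothetical stand-ins for GPT-4o relevance judgments and the OpenCLIP text encoder, and the mock tag list stands in for RAM output.

```python
from typing import Callable, List, Tuple

def model_biases(
    tags: List[str],
    target_class: str,
    is_relevant: Callable[[str, str], bool],
    encode_text: Callable[[str], List[float]],
) -> Tuple[List[str], List[float]]:
    """Stage 1 of MAVias: keep only tags judged irrelevant to the target
    class, then encode them jointly with a single CLIP-style prompt."""
    # (b) LLM filtering: tags unrelated to the class are treated as biases.
    bias_tags = [t for t in tags if not is_relevant(t, target_class)]
    # (c) Aggregate all bias tags into one prompt and encode once
    #     (one embedding per image, not one per tag).
    prompt = "a photo of " + ", ".join(bias_tags)
    return bias_tags, encode_text(prompt)

# Toy stand-ins (real pipeline: RAM tags, GPT-4o relevance, OpenCLIP encoder).
tags = ["bird", "water", "bamboo", "wing"]
mock_relevant = lambda tag, cls: tag in {"bird", "wing"}   # mock LLM judgment
mock_encoder = lambda text: [float(len(text))]             # mock text encoder
bias_tags, emb = model_biases(tags, "waterbird", mock_relevant, mock_encoder)
print(bias_tags)
```

Aggregating the irrelevant tags into one prompt mirrors the paper's single-embedding design, which keeps the per-image cost constant regardless of how many bias tags are found.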

Key Designs

  1. Language-driven Bias Modeling:

    • Function: Automatically identifies visual attributes that are irrelevant to the target class for each training image.
    • Mechanism: A three-step pipeline — (a) the Recognize Anything Model (RAM, with a vocabulary of 4,000+ tags) is applied to extract a descriptive tag set \(\mathcal{T}^{(i)}\) per image; (b) GPT-4o determines whether each tag is semantically related to the target class \(y^{(i)}\), yielding an irrelevant subset \(\mathcal{B}^{(i)} \subseteq \mathcal{T}^{(i)}\); (c) OpenCLIP encodes all irrelevant tags as a unified embedding \(\mathbf{e}^{(i)} \in \mathbb{R}^d\) via the prompt "a photo of \(t_1, t_2, ..., t_k\)".
    • Design Motivation: (1) RAM covers 4,000+ visual concepts, satisfying open-set requirements; (2) LLMs possess commonsense reasoning to assess semantic relevance between tags and categories; (3) aggregating all irrelevant tags into a single embedding rather than processing each tag individually reduces computational overhead.
  2. Bias Mitigation Training:

    • Function: Trains the main model to learn bias-invariant feature representations.
    • Mechanism: The main model \(f_\theta\) produces feature \(\mathbf{h}^{(i)}\) and main logits \(\mathbf{z}_{\text{main}}^{(i)}\). A projection layer \(g_\phi\) maps the bias embedding \(\mathbf{e}^{(i)}\) into the main model's feature space, after which a classification head yields bias logits \(\mathbf{z}_{\text{tag}}^{(i)}\). The final logits are \(\mathbf{z}^{(i)} = \mathbf{z}_{\text{main}}^{(i)} + \mathbf{z}_{\text{tag}}^{(i)}\).
    • Design Motivation: For highly bias-aligned samples, \(\mathbf{z}_{\text{tag}}\) is large, which reduces the relative contribution of \(\mathbf{z}_{\text{main}}\) to the total logits and thereby diminishes the gradient updates for such samples — implicitly discouraging the model from relying on bias features.
  3. Logit Alignment Loss:

    • Function: Balances the training of the main model and the projection layer to prevent either from dominating.
    • Mechanism: The overall loss is \(\mathcal{L} = \mathcal{L}_{cls}(\mathbf{z}^{(i)}, y^{(i)}) + \alpha \cdot \mathcal{L}_{align}\), where the alignment term is \(\mathcal{L}_{align} = \frac{1}{2} \| \|\mathbf{z}_{\text{main}}^{(i)}\| - \lambda \cdot \|\mathbf{z}_{\text{tag}}^{(i)}\| \|^2\).
    • Design Motivation: \(\lambda \in (0,1)\) controls the relative magnitude of bias logits with respect to main logits; a smaller \(\lambda\) is appropriate for stronger biases, producing smaller gradients for bias-aligned samples. \(\alpha\) balances the classification and alignment losses.
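The fused objective from designs 2 and 3 can be written compactly. A minimal numpy sketch follows, for a single sample; the `alpha` and `lam` values are illustrative placeholders, not the paper's tuned settings.

```python
import numpy as np

def mavias_loss(z_main, z_tag, y, alpha=0.1, lam=0.5):
    """Cross-entropy on the fused logits plus the logit alignment term.
    alpha and lam here are illustrative, not the paper's per-dataset values."""
    z = z_main + z_tag                                   # logit fusion
    # Numerically stable softmax cross-entropy on the fused logits.
    z_shift = z - z.max()
    log_probs = z_shift - np.log(np.exp(z_shift).sum())
    ce = -log_probs[y]
    # Alignment: push ||z_main|| toward lam * ||z_tag||.
    align = 0.5 * (np.linalg.norm(z_main) - lam * np.linalg.norm(z_tag)) ** 2
    return ce + alpha * align

z_main = np.array([2.0, -1.0])   # logits from the main model f_theta
z_tag = np.array([1.5, -0.5])    # logits from the projected bias embedding
loss = mavias_loss(z_main, z_tag, y=0)
print(float(loss))
```

Because `z_tag` already explains much of a bias-aligned sample's label, the cross-entropy on the fused logits is small, so the gradient reaching `z_main` for that sample shrinks, which is exactly the implicit down-weighting described above.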

Loss & Training

SGD is used as the optimizer (Adam for CelebA), with a learning rate of 0.001 decayed by a factor of 10 after each completed third of the training epochs. Hyperparameters \((\alpha, \lambda)\) are tuned separately for each dataset. At inference, only the main model \(f_\theta\) is used; the projection layer \(g_\phi\) is discarded, incurring no additional inference overhead.
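The step schedule is simple enough to write out. This sketch assumes the "every one-third" decay means drops at the 1/3 and 2/3 marks of the total epoch budget; the 90-epoch total is an arbitrary example, not from the paper.

```python
def lr_at_epoch(epoch, total_epochs, base_lr=1e-3):
    """Step schedule: lr divided by 10 after each completed third of
    training (assumed interpretation of the decay rule)."""
    drops = (3 * epoch) // total_epochs      # 0, 1, or 2 completed thirds
    return base_lr / (10 ** drops)

total = 90  # example budget, not a value from the paper
schedule = [lr_at_epoch(e, total) for e in (0, 30, 60, 89)]
print(schedule)
```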

Key Experimental Results

Main Results (Open-Set Evaluation)

| Dataset | Metric | MAVias | JTT (2nd best) | LfF | Gain (vs. 2nd best) |
| --- | --- | --- | --- | --- | --- |
| CelebA | WG Acc | 66.7% | 31.5% | 14.7% | +35.2% |
| CelebA | Avg Acc | 81.4% | 61.6% | 67.1% | +14.0% |
| Waterbirds | WG Acc | 75.4% | 64.7% | 30.0% | +10.7% |
| Waterbirds | Avg Acc | 87.5% | 85.2% | 72.7% | +2.3% |
| UrbanCars | WG Acc | 84.4% | 69.0% | 34.6% | +15.4% |
| UrbanCars | Avg Acc | 89.3% | 77.8% | 61.0% | +11.5% |
| ImageNet9 | MIXED-NEXT Acc | 88.26% | 87.56% | 78.70% | +0.70% |
| ImageNet9 | NO-FG Acc | 53.02% | 59.84% | 61.07% | −6.82% (↓ better) |
| ImageNet9 | ONLY-BG-B Acc | 21.83% | 29.71% | 34.82% | −7.88% (↓ better) |

Ablation Study (Bias Detection Effectiveness)

| Dataset | Top Detected Bias Tags | Consistent with Known Biases |
| --- | --- | --- |
| CelebA | man, woman, suit, tie, dress | ✓ Gender bias recovered + additional biases found |
| Waterbirds | background (water, bamboo, branch) | ✓ Background bias precisely captured |
| UrbanCars | path, forest, hydrant, park | ✓ Urban/rural background bias captured |
| ImageNet9 | 10 irrelevant tags per class | Newly discovered (color, texture, background) |

Key Findings

  • Dominant advantage in open-set settings: Existing BU methods (LfF, JTT, DebiAN, FLAC-B) perform poorly in multi-bias scenarios, while MAVias achieves substantial gains across all datasets.
  • Greatly reduced background dependency on ImageNet9: On the ONLY-BG-B test set (background only), MAVias reduces accuracy from 35.18% (vanilla) to 21.83%, indicating the model relies far less on background cues for prediction.
  • Bias discovery beyond predefined attributes: On CelebA, MAVias not only recovers the known gender bias but also identifies novel bias sources such as clothing items (suit, tie).

Highlights & Insights

  • Effective composition of foundation models: RAM, GPT-4o, and OpenCLIP each serve a distinct role, forming a complete pipeline from visual feature extraction → semantic filtering → multimodal encoding. This "foundation model toolchain" paradigm is transferable to many tasks requiring open-set understanding.
  • Instance-level bias modeling: Unlike conventional approaches that define bias at the dataset level (e.g., gender in CelebA), MAVias constructs an independent bias set for each image, enabling the handling of complex multi-attribute bias scenarios.
  • Zero inference overhead: The bias projection layer is used only during training; only the main model is required at inference, adding neither computation nor parameters.

Limitations & Future Work

  • Dependence on GPT-4o for tag filtering: Tag relevance judgments rely on LLM commonsense reasoning, which is susceptible to errors. The effectiveness of alternative LLMs or open-source substitutes remains unexplored.
  • Limited RAM vocabulary: Although 4,000+ tags provide broad coverage, fine-grained biases may still be missed.
  • Hyperparameter sensitivity: \((\alpha, \lambda)\) require per-dataset tuning, increasing the barrier to adoption.
  • Not validated beyond classification: Evaluation is limited to classification; effectiveness in detection, segmentation, and generation tasks remains unknown.
Comparison with Related Methods

  • vs. LfF/JTT: These BU methods obtain pseudo-labels by training a bias proxy model and can only handle a single salient bias. MAVias leverages foundation models to directly discover multi-attribute biases without training a proxy model.
  • vs. FLAC: FLAC requires indirect access to bias labels and remains constrained to predefined biases. MAVias is fully open-set and requires no prior knowledge of biases.
  • vs. OpenBias: OpenBias performs open-set bias detection in text-to-image generation but relies solely on textual descriptions, lacking visual grounding. MAVias begins with image-level tag extraction and applies LLM filtering, yielding stronger visual grounding.

Rating

  • Novelty: ⭐⭐⭐⭐⭐ First open-set visual bias mitigation framework; creatively combines multiple foundation models.
  • Experimental Thoroughness: ⭐⭐⭐⭐ Evaluated on 4 datasets under both open-set and closed-set protocols; lacks validation across additional task types.
  • Writing Quality: ⭐⭐⭐⭐ Problem motivation is clearly articulated; methodological intuition is well explained.
  • Value: ⭐⭐⭐⭐⭐ Open-set bias mitigation is an important and underexplored direction with high practical value.