Skip to content

Order-Robust Class Incremental Learning: Graph-Driven Dynamic Similarity Grouping

Conference: CVPR 2025
arXiv: 2502.20032
Code: https://github.com/AIGNLAI/GDDSG
Area: LLM Security
Keywords: Class Incremental Learning, Graph Coloring, Order Robustness, Similarity Grouping, NCM Classifier

TL;DR

GDDSG is proposed, which groups classes by similarity using graph coloring theory—ensuring classes within the same group are as dissimilar as possible (to reduce interference). Each group is learned independently using an NCM classifier and a LoRA adapter, achieving a 94.00% accuracy and only a 0.78% forgetting rate on CIFAR-100 10-step (prev. SOTA RanPAC: 90.50% / 3.49%).

Background & Motivation

Background

Background: Class Incremental Learning (CIL) requires the model to continuous learn new classes without forgetting old ones. The performance of existing methods highly depends on the arrival order of classes—performing well under some orders but dropping significantly under others.

Limitations of Prior Work: Order sensitivity is a core challenge in CIL but has been largely overlooked by most studies. Theoretical analysis (Corollary 1) proves that: higher similarity between classes \(\to\) more forgetting + stronger order sensitivity.

Key Challenge: Continuously arriving similar classes (e.g., learning "cat" and then "tiger") cause severe representation confusion and forgetting, yet the arrival order of classes is uncontrollable.

Key Insight: Instead of altering the arrival order of classes, similar classes are partitioned into different groups during learning—ensuring classes within each group are as dissimilar as possible. Graph coloring theory is used to guarantee grouping optimality.

Core Idea: Construct similarity graph \(\to\) Welsh-Powell coloring \(\to\) dissimilar classes grouped together \(\to\) independent NCM + adapters = order-insensitive incremental learning.

Solution Approach

Goal: ### Key Designs

  1. SimGraph + Graph Coloring: Construct a class similarity graph using the L2 distance of pre-trained features, with an adaptive threshold \(\eta_{i,j}\) determining whether to connect edges.

Method

Key Designs

  1. SimGraph + Graph Coloring: Construct a class similarity graph using the L2 distance of pre-trained features, with an adaptive threshold \(\eta_{i,j}\) determining whether to connect edges. The Welsh-Powell coloring algorithm with \(O(|V|^2)\) complexity partitions classes into color groups—ensuring classes within the same color group are dissimilar.

  2. Independent Intra-group Learning: Each group uses an independent NCM classifier + LoRA adapter. They do not interfere with each other—eliminating intra-group class interference.

  3. Meta-feature Inference: During inference, a meta-classifier (an ensemble of RandomForest + KNN + LightGBM) is first used to identify which group the sample belongs to, followed by classification using the intra-group NCM.

Loss & Training

Regularized least squares regression \(\mathcal{L} = \|Y - H\Theta\|_F^2 + \lambda\|\Theta\|_F^2\), where \(\lambda\) is cross-validated on a calibration set. The backbone is frozen (ViT-B/16).

Key Experimental Results

Dataset (10-step) GDDSG RanPAC Forgetting Rate
CIFAR-100 94.00% 90.50% 0.78%
CUB-200 91.64% 89.23% 1.92%
Dogs 92.64% 85.37% 1.42%

Ablation Study

  • No grouping \(\to\) accuracy drops to 91-92%—grouping contributes 2-3%
  • The probability of satisfying Brooks' theorem is \(>0.99\) when \(N > 35\)
  • The grouping generated by graph coloring is optimal in polynomial time

Key Findings

  • Order Robustness: The performance variance of GDDSG is far smaller than the baselines—exhibiting consistent performance under different orders.
  • Extremely Low Forgetting Rate: 0.78% vs RanPAC 3.49%—dissimilarity within groups guarantees less interference.
  • Theoretical Guarantees: Corollary 1 proves that reducing intra-group similarity simultaneously mitigates expected forgetting and order sensitivity.

Highlights & Insights

  • Cross-domain Innovation of Graph Coloring \(\times\) CIL—solving order sensitivity in deep learning with a classic tool from combinatorial optimization.
  • Theory-driven Design—directly deriving the methodology from Corollary 1.

Limitations & Future Work

  • Reliance on a frozen ViT-B/16 backbone.
  • Meta-classifier increases inference complexity.
  • Gram matrix and prototype matrix scale with the number of classes.

Rating

  • Novelty: ⭐⭐⭐⭐⭐ Innovative combination of graph coloring + CIL, theory-driven
  • Experimental Thoroughness: ⭐⭐⭐⭐ 4 datasets, order sensitivity analysis
  • Writing Quality: ⭐⭐⭐⭐ Clear theory
  • Value: ⭐⭐⭐⭐⭐ Expected to become the new default framework for CIL