Order-Robust Class Incremental Learning: Graph-Driven Dynamic Similarity Grouping¶

Conference: CVPR 2025
arXiv: 2502.20032
Code: https://github.com/AIGNLAI/GDDSG
Area: LLM Security
Keywords: Class Incremental Learning, Graph Coloring, Order Robustness, Similarity Grouping, NCM Classifier

TL;DR¶

GDDSG is proposed, which groups classes by similarity using graph coloring theory—ensuring classes within the same group are as dissimilar as possible (to reduce interference). Each group is learned independently using an NCM classifier and a LoRA adapter, achieving a 94.00% accuracy and only a 0.78% forgetting rate on CIFAR-100 10-step (prev. SOTA RanPAC: 90.50% / 3.49%).

Background & Motivation¶

Background¶

Background: Class Incremental Learning (CIL) requires the model to continuous learn new classes without forgetting old ones. The performance of existing methods highly depends on the arrival order of classes—performing well under some orders but dropping significantly under others.

Limitations of Prior Work: Order sensitivity is a core challenge in CIL but has been largely overlooked by most studies. Theoretical analysis (Corollary 1) proves that: higher similarity between classes \(\to\) more forgetting + stronger order sensitivity.

Key Challenge: Continuously arriving similar classes (e.g., learning "cat" and then "tiger") cause severe representation confusion and forgetting, yet the arrival order of classes is uncontrollable.

Key Insight: Instead of altering the arrival order of classes, similar classes are partitioned into different groups during learning—ensuring classes within each group are as dissimilar as possible. Graph coloring theory is used to guarantee grouping optimality.

Core Idea: Construct similarity graph \(\to\) Welsh-Powell coloring \(\to\) dissimilar classes grouped together \(\to\) independent NCM + adapters = order-insensitive incremental learning.

Solution Approach¶

Goal: ### Key Designs

SimGraph + Graph Coloring: Construct a class similarity graph using the L2 distance of pre-trained features, with an adaptive threshold \(\eta_{i,j}\) determining whether to connect edges.

Method¶

Key Designs¶

SimGraph + Graph Coloring: Construct a class similarity graph using the L2 distance of pre-trained features, with an adaptive threshold \(\eta_{i,j}\) determining whether to connect edges. The Welsh-Powell coloring algorithm with \(O(|V|^2)\) complexity partitions classes into color groups—ensuring classes within the same color group are dissimilar.
Independent Intra-group Learning: Each group uses an independent NCM classifier + LoRA adapter. They do not interfere with each other—eliminating intra-group class interference.
Meta-feature Inference: During inference, a meta-classifier (an ensemble of RandomForest + KNN + LightGBM) is first used to identify which group the sample belongs to, followed by classification using the intra-group NCM.

Loss & Training¶

Regularized least squares regression \(\mathcal{L} = \|Y - H\Theta\|_F^2 + \lambda\|\Theta\|_F^2\), where \(\lambda\) is cross-validated on a calibration set. The backbone is frozen (ViT-B/16).

Key Experimental Results¶

Dataset (10-step)	GDDSG	RanPAC	Forgetting Rate
CIFAR-100	94.00%	90.50%	0.78%
CUB-200	91.64%	89.23%	1.92%
Dogs	92.64%	85.37%	1.42%

Ablation Study¶

No grouping \(\to\) accuracy drops to 91-92%—grouping contributes 2-3%
The probability of satisfying Brooks' theorem is \(>0.99\) when \(N > 35\)
The grouping generated by graph coloring is optimal in polynomial time

Key Findings¶

Order Robustness: The performance variance of GDDSG is far smaller than the baselines—exhibiting consistent performance under different orders.
Extremely Low Forgetting Rate: 0.78% vs RanPAC 3.49%—dissimilarity within groups guarantees less interference.
Theoretical Guarantees: Corollary 1 proves that reducing intra-group similarity simultaneously mitigates expected forgetting and order sensitivity.

Highlights & Insights¶

Cross-domain Innovation of Graph Coloring \(\times\) CIL—solving order sensitivity in deep learning with a classic tool from combinatorial optimization.
Theory-driven Design—directly deriving the methodology from Corollary 1.

Limitations & Future Work¶

Reliance on a frozen ViT-B/16 backbone.
Meta-classifier increases inference complexity.
Gram matrix and prototype matrix scale with the number of classes.

Rating¶

Novelty: ⭐⭐⭐⭐⭐ Innovative combination of graph coloring + CIL, theory-driven
Experimental Thoroughness: ⭐⭐⭐⭐ 4 datasets, order sensitivity analysis
Writing Quality: ⭐⭐⭐⭐ Clear theory
Value: ⭐⭐⭐⭐⭐ Expected to become the new default framework for CIL