# Unlearning Evaluation through Subset Statistical Independence
## Paper Information
- Conference: ICLR 2026
- arXiv: 2603.00587
- Code: https://github.com/ChildEden/SDE
- Area: Machine Unlearning / Privacy Protection / Statistical Testing
- Keywords: Machine unlearning evaluation, HSIC, statistical independence, subset-level evaluation, membership inference
## TL;DR
This paper proposes Split-half Dependence Evaluation (SDE), which leverages HSIC-based statistical independence testing to evaluate machine unlearning at the subset level, requiring neither model retraining nor auxiliary classifiers.
## Background & Motivation

### Core Problem

How can one verify whether a machine unlearning procedure has succeeded? Existing evaluation methods exhibit fundamental limitations:

- Retraining comparison: requires training a new reference model, which contradicts the original motivation for unlearning.
- Membership inference attacks (MIA): rely on training statistics and shadow models, which are difficult to obtain after unlearning.
- Sample-level inference: since unlearning removes only a small subset (5%–20%) of the training data, per-sample signals are statistically weak after unlearning.
### Paradigm Shift
From sample-level MIA → subset-level statistical independence evaluation.
Core intuition: Training participation induces inter-sample dependencies in model outputs (via shared gradient updates and co-adaptation), whereas out-of-training data exhibits no such dependencies.
## Method

### Split-half Dependence Evaluation (SDE)

#### Core Idea
The subset under evaluation \(\mathcal{S}\) is randomly split into two halves \(\mathcal{S}_1, \mathcal{S}_2\), and the statistical dependence \(H(\mathcal{S}, h)\) between the model outputs on the two halves is computed. Empirically, in-training subsets exhibit significantly higher dependence than out-of-training subsets: \(H(\mathcal{S}_{IT}, h) > H(\mathcal{S}_{OOT}, h)\).
#### HSIC (Hilbert-Schmidt Independence Criterion)

Dependence is measured with the empirical (biased) HSIC estimator:

$$H(\mathcal{S}, h) = \widehat{\mathrm{HSIC}}\big(h(\mathcal{S}_1), h(\mathcal{S}_2)\big) = \frac{1}{(n-1)^2}\operatorname{tr}(KHLH)$$

where \(K, L\) are Gaussian RBF kernel matrices over \(h(\mathcal{S}_1)\) and \(h(\mathcal{S}_2)\), and \(H = I - \frac{1}{n}\mathbf{1}\mathbf{1}^T\) is the centering matrix.
The distribution of \(H(\mathcal{S}, h)\) is estimated via 200 random shuffles of \(\mathcal{S}_2\).
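A minimal sketch of this split-half computation, assuming the standard biased HSIC estimator above and the paper's \(\sqrt{\text{dim}}\) bandwidth heuristic; the function names and the permutation loop are illustrative and not taken from the official SDE repository:

```python
import numpy as np

def rbf_kernel(x: np.ndarray, sigma: float) -> np.ndarray:
    """Gaussian RBF kernel matrix: K[i, j] = exp(-||x_i - x_j||^2 / (2 * sigma^2))."""
    sq = (x ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * x @ x.T
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

def hsic(a: np.ndarray, b: np.ndarray, sigma: float) -> float:
    """Biased empirical HSIC between paired samples a, b: tr(KHLH) / (n-1)^2."""
    n = a.shape[0]
    center = np.eye(n) - np.ones((n, n)) / n  # centering matrix H
    k, l = rbf_kernel(a, sigma), rbf_kernel(b, sigma)
    return float(np.trace(k @ center @ l @ center)) / (n - 1) ** 2

def split_half_dependence(outputs: np.ndarray, n_shuffles: int = 200,
                          seed: int = 0) -> np.ndarray:
    """Split a subset's model outputs into two halves and estimate the
    distribution of dependence values via random re-pairings of the second half."""
    rng = np.random.default_rng(seed)
    sigma = np.sqrt(outputs.shape[1])  # sqrt(dim) bandwidth heuristic
    idx = rng.permutation(len(outputs))
    half = len(outputs) // 2
    s1, s2 = outputs[idx[:half]], outputs[idx[half:2 * half]]
    return np.array([hsic(s1, s2[rng.permutation(half)], sigma)
                     for _ in range(n_shuffles)])
```

Per the claim above, the resulting values should concentrate at noticeably higher levels for an in-training subset than for an out-of-training one.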
### Unlearning Evaluation Protocol
Given a target subset \(\mathcal{S}_{\text{tar}} \subseteq \mathcal{D}_f\), a reference in-training set \(\mathcal{S}_{IT} \subset \mathcal{D}_r\), and an out-of-training set \(\mathcal{S}_{OOT} \subset \mathcal{D}_{te}\):
Unlearning is deemed successful if and only if:

$$D(\mathcal{S}_{\text{tar}}, \mathcal{S}_{OOT}, h^{un}) < D(\mathcal{S}_{\text{tar}}, \mathcal{S}_{IT}, h^{un})$$
where \(D\) denotes the Jensen-Shannon divergence between dependence distributions.
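A hedged sketch of this decision rule, assuming the dependence distributions are compared as histograms of split-half HSIC values; the binning scheme and helper names are assumptions, not the paper's reference implementation:

```python
import numpy as np
from scipy.spatial.distance import jensenshannon

def js_divergence(samples_a: np.ndarray, samples_b: np.ndarray,
                  bins: int = 30) -> float:
    """Jensen-Shannon divergence between two empirical HSIC distributions,
    estimated by histogramming both samples over a shared range."""
    lo = min(samples_a.min(), samples_b.min())
    hi = max(samples_a.max(), samples_b.max())
    p, _ = np.histogram(samples_a, bins=bins, range=(lo, hi))
    q, _ = np.histogram(samples_b, bins=bins, range=(lo, hi))
    return jensenshannon(p, q) ** 2  # scipy returns the square root (JS distance)

def unlearning_succeeded(h_tar: np.ndarray, h_oot: np.ndarray,
                         h_it: np.ndarray) -> bool:
    """S_tar counts as unlearned iff its dependence distribution is closer to
    the out-of-training reference than to the in-training reference."""
    return js_divergence(h_tar, h_oot) < js_divergence(h_tar, h_it)
```

Here `h_tar`, `h_oot`, and `h_it` would be arrays of split-half HSIC values computed under the unlearned model \(h^{un}\) on \(\mathcal{S}_{\text{tar}}\), \(\mathcal{S}_{OOT}\), and \(\mathcal{S}_{IT}\), e.g. via `split_half_dependence` above.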
## Theoretical Analysis
Shared influence components introduced during training cause in-training subsets to exhibit stronger split-half dependence. Specifically, when \(h = \mathcal{A}(\mathcal{D}_{tr})\), the output \(h(x_i)\) implicitly depends on \(x_j\) through the learned parameters, so \(h(x_i)\) and \(h(x_j)\) are no longer independent.
## Experiments

### Controlled Experiments (Retrained Model)
F1 scores for distinguishing in-training from out-of-training subsets, across forget ratios \(R\) and subset sizes \(|\mathcal{S}|\):

| Dataset-Model | R=5%, \|S\|=400 | R=10%, \|S\|=1000 | R=20%, \|S\|=2000 |
|---|---|---|---|
| SV-ResNet18 | 0.71 | 0.78 | 0.97 |
| C10-ResNet18 | 0.87 | 0.95 | 1.00 |
| C100-ResNet18 | 0.99 | 1.00 | 1.00 |
| Tiny-ResNet18 | 0.70 | 0.92 | 0.98 |
### Comparison with Distribution Distance Metrics (CIFAR10-ResNet18, R=10%, |S|=1000)
| Method | F1 Score |
|---|---|
| MMD | 0.70 |
| Wasserstein | 0.89 |
| SDE (Ours) | 0.95 |
SDE consistently outperforms MMD and Wasserstein across all settings, with a particularly pronounced advantage on smaller subsets.
### Evaluation of Unlearning Methods (CIFAR10-ResNet18, R=10%)

| Method | Acc_r (%) | Acc_f (%) | ASR | OTR ↑ (%) |
|---|---|---|---|---|
| Retrain | 98.57 | 93.25 | 0.30 | 87.00 |
| RandLabel | 98.80 | 98.63 | 0.29 | 84.00 |
| Unroll | 99.36 | 99.21 | 0.30 | 3.00 |
| Sparsity | 92.72 | 90.56 | 0.42 | 50.80 |
| SalUn | 98.66 | 98.53 | 0.29 | 52.40 |

Acc_r / Acc_f denote accuracy on the retain / forget set, ASR is the membership-inference attack success rate, and OTR is the fraction of forget samples that SDE judges to be out-of-training (higher is better).
### Key Findings
- Critical finding on Unroll: conventional metrics (ASR ≈ 0.30, consistent with retraining) suggest successful unlearning, yet SDE yields an OTR of only 3%, meaning nearly all forget samples are still identified as in-training data.
- SDE exposes the insufficiency of MIA: Similar ASR values make it difficult to distinguish unlearning quality, whereas OTR provides clearer differentiation.
- Larger subsets and deeper-layer features yield better discriminative power.
- Kernel bandwidth \(\sigma = \sqrt{\text{dim}}\) serves as a robust heuristic (see the sketch after this list).
- Dependencies can be detected even in models trained for only 20% of the full training schedule.
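The snippet below is not from the paper: it is a quick sanity check of why \(\sigma = \sqrt{\text{dim}}\) is a reasonable default, showing that for approximately standardized features it roughly coincides with the common median-distance heuristic (here taken as \(\sigma^2 = \mathrm{median}(\|x_i - x_j\|^2)/2\)):

```python
import numpy as np

rng = np.random.default_rng(0)
feats = rng.standard_normal((500, 512))  # stand-in for unit-variance layer features

# Pairwise squared distances via the Gram-matrix identity.
sq = (feats ** 2).sum(axis=1)
d2 = sq[:, None] + sq[None, :] - 2.0 * feats @ feats.T
sigma_median = np.sqrt(np.median(d2[d2 > 1e-9]) / 2.0)

print(f"sqrt(dim)        = {np.sqrt(feats.shape[1]):.1f}")  # 22.6
print(f"median heuristic = {sigma_median:.1f}")             # ~22.6 on whitened data
```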
## Highlights & Insights
- Retraining-free independent evaluation: A genuinely independent verification framework for machine unlearning.
- Subset-level evaluation aligned with the unlearning workflow: Unlearning is inherently a subset-level operation.
- Exposing blind spots in existing evaluation: The Unroll case serves as a cautionary example.
- Unified theory and practice: The shared influence component analysis provides theoretical grounding for the method design.
## Limitations & Future Work
- The choice of kernel bandwidth \(\sigma\) has a notable impact; the simple heuristic may not generalize to all scenarios (e.g., diffusion models).
- Reference set selection affects performance, and an optimal reference set construction strategy remains unresolved.
- The method may capture natural forgetting (representation drift, catastrophic forgetting) rather than intentional unlearning.
- The current framework yields only binary judgments, leaving the potential of HSIC as a continuous metric underexplored.
- Performance is weaker on shallow architectures such as AllCNN.
## Related Work & Insights
- Machine Unlearning: representative algorithms such as SISA, RandLabel, and SalUn.
- Membership Inference Attacks: confidence-based, loss-based, and auxiliary-classifier-based methods.
- Statistical Independence Testing: kernel-based tests such as HSIC and MMD.
## Rating
- Novelty: ⭐⭐⭐⭐ — Subset-level statistical independence evaluation is a novel perspective.
- Experimental Thoroughness: ⭐⭐⭐⭐ — Multi-dimensional controlled experiments and evaluation across unlearning methods.
- Writing Quality: ⭐⭐⭐⭐ — Clear motivation and complete method description.
- Value: ⭐⭐⭐⭐ — No additional training required; easy to deploy.