Understanding and Enforcing Weight Disentanglement in Task Arithmetic

Conference: CVPR 2026
arXiv: 2604.17078
Code: GitHub
Area: Model Compression
Keywords: Task Arithmetic, Model Merging, Weight Disentanglement, Orthogonal Regularization, Task Vectors

TL;DR

This paper proposes Task Feature Specialization (TFS) as a sufficient condition for weight disentanglement and shows that its geometric consequence is weight vector orthogonality. Building on this, it introduces OrthoReg, a regularizer that enforces column-wise orthogonality of the weight update matrices during fine-tuning to promote disentangled task vectors. OrthoReg improves several task arithmetic methods, for example by +2.60 points of absolute accuracy for ATT-FT on 8-task addition with ViT-L-14.

Background & Motivation

Background: Task Arithmetic is an efficient training-free model editing paradigm that computes task vectors \(\tau_t = \theta_t^* - \theta_0\) (the difference between fine-tuned and pre-trained weights) and applies algebraic operations (addition, subtraction) to compose, remove, or analogize different skills.
Limitations of Prior Work: Although Task Arithmetic is effective in practice, a fundamental theoretical explanation remains lacking. The existing notion of "weight disentanglement", introduced by Tangent Task Arithmetic (TTA), describes the desired outcome (the effects of different task vectors do not interfere with one another) but does not reveal its root cause. Specifically, the intrinsic properties required of the pre-trained model \(\theta_0\) or the task vectors \(\tau_t\) to achieve disentanglement have not been adequately explored.
Key Challenge: Weight disentanglement is a phenomenological description rather than a causal explanation. Existing methods are either computationally expensive (e.g., TTA requires Jacobian computation) or lack theoretical guarantees for reliably producing high-quality task vectors.
Goal: To answer two core questions: (1) What properties of a pre-trained model make it suitable for task arithmetic? (2) How can task vectors be constructed to actively promote weight disentanglement?
Key Insight: By examining the internal feature allocation mechanism of the model, the paper identifies Task Feature Specialization as a sufficient condition for disentanglement, with weight vector orthogonality as its observable geometric consequence.
Core Idea: TFS is abstract and cannot be directly enforced; however, its geometric consequence — orthogonality — is concrete and actionable. By enforcing an orthogonal internal structure on the weight update matrix during fine-tuning, weight disentanglement can be indirectly promoted.

Method

Overall Architecture

Given a pre-trained model \(\theta_0\) and multiple downstream tasks, the model is fine-tuned on each task independently, with an orthogonal regularization term added to the standard task loss that pushes the column vectors of each weight update matrix \(\Delta W\) toward mutual orthogonality. After fine-tuning, the resulting task vectors are merged via standard task arithmetic (\(\theta_{MT} = \theta_0 + \sum_t \alpha_t \tau_t\)) to obtain a multi-task model.
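
The merge step can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' code: checkpoints are assumed to be dictionaries mapping parameter names to arrays, the helper names `task_vector` and `merge` are hypothetical, and the toy weights are made up.

```python
import numpy as np

def task_vector(theta_ft, theta_0):
    """tau_t = theta_t^* - theta_0, computed per parameter tensor."""
    return {name: theta_ft[name] - theta_0[name] for name in theta_0}

def merge(theta_0, task_vectors, alphas):
    """theta_MT = theta_0 + sum_t alpha_t * tau_t.

    A negative alpha subtracts a task vector (task negation).
    """
    merged = {name: w.copy() for name, w in theta_0.items()}
    for tau, alpha in zip(task_vectors, alphas):
        for name in merged:
            merged[name] += alpha * tau[name]
    return merged

# Toy example with a single 2x2 "layer" (values are made up).
theta_0 = {"layer.weight": np.zeros((2, 2))}
theta_a = {"layer.weight": np.array([[1.0, 0.0], [0.0, 0.0]])}
theta_b = {"layer.weight": np.array([[0.0, 0.0], [0.0, 2.0]])}

tau_a = task_vector(theta_a, theta_0)
tau_b = task_vector(theta_b, theta_0)
theta_mt = merge(theta_0, [tau_a, tau_b], alphas=[0.5, 0.5])
```

Passing `alphas=[-1.0]` with a single task vector gives the negation setting, where a skill is removed rather than added.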

Key Designs

Task Feature Specialization (TFS) Theory:
- Function: Provides a fundamental theoretical explanation for the success of task arithmetic.
- Mechanism: Defines Task Feature Specialization as the ability of a model to assign distinct internal features (column vectors of weight matrices) to different tasks. Formally, the specialized feature set \(I_t\) for task \(t\) is the set of feature indices \(k\) whose activations \(z_k\) the model output on task \(t\) is sensitive to. TFS requires the feature sets of different tasks to be disjoint (\(I_t \cap I_j = \emptyset\) for \(t \neq j\)). The paper proves that TFS is a sufficient condition for weight disentanglement: under the NTK linearization assumption, TFS guarantees that the interference term vanishes, \(\tau_j^\top J(x) = 0\) for all \(x \in \mathcal{D}_t\) and \(j \neq t\), where \(J(x)\) is the Jacobian of the model output with respect to the parameters at \(\theta_0\). It further proves that TFS naturally induces block orthogonality in the weight matrix.
- Design Motivation: TFS is positioned as the common cause linking a functional property (weight disentanglement) and a geometric property (orthogonality), providing theoretical guidance for method design.
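
A simplified illustration of why disjoint feature sets lead to orthogonality (not the paper's proof): assume, purely for this example, that the update \(\Delta W_t\) for task \(t\) modifies only the columns indexed by \(I_t\), and write \(\Delta w_{t,k}\) for its \(k\)-th column (notation introduced here). Then for \(t \neq j\):

\[
\langle \Delta W_t, \Delta W_j \rangle_F
= \sum_{k} \Delta w_{t,k}^\top \, \Delta w_{j,k}
= \sum_{k \in I_t \cap I_j} \Delta w_{t,k}^\top \, \Delta w_{j,k}
= 0 .
\]

The second equality holds because any column outside \(I_t\) is zero in \(\Delta W_t\) and any column outside \(I_j\) is zero in \(\Delta W_j\); the last holds because \(I_t \cap I_j = \emptyset\) leaves an empty sum.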
OrthoReg Regularization:
- Function: Actively promotes weight disentanglement during fine-tuning.
- Mechanism: An orthogonal regularization term is appended to the standard fine-tuning loss: \(\mathcal{L} = \mathcal{L}_{\text{task}}(\theta_0 + \Delta\theta) + \lambda \cdot \mathcal{L}_{\text{ortho}}(\Delta\theta)\), where \(\mathcal{L}_{\text{ortho}} = \sum_l \|(\Delta W^{(l)})^\top \Delta W^{(l)} - I\|_F^2\). This regularization term penalizes the deviation of the Gram matrix of each update matrix from the identity, driving the column vectors of \(\Delta W\) toward mutual orthogonality with unit norms. The paper theoretically demonstrates that OrthoReg promotes disentanglement through a dual control mechanism: (1) norm control — bounding \(\|\tau_j\|_2\); and (2) angular control — driving the angle between different task vectors toward 90°.
- Design Motivation: TFS is an idealized property; in practice, feature sets overlap and directly enforcing TFS is infeasible. OrthoReg instead enforces the geometric consequence (orthogonality) to indirectly achieve disentanglement. As a simple plug-and-play regularizer, it is compatible with any fine-tuning method.
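
The regularizer as written above can be evaluated with a short NumPy sketch. It only computes the value of \(\mathcal{L}_{\text{ortho}}\); in actual training the same expression would presumably be written in an autodiff framework so that gradients flow into \(\Delta W\). The function names and toy matrices are assumptions made for illustration.

```python
import numpy as np

def ortho_penalty(delta_ws):
    """Sum over layers of ||dW^T dW - I||_F^2 (the L_ortho term)."""
    total = 0.0
    for dw in delta_ws:
        gram = dw.T @ dw  # (d, d) Gram matrix of the columns of dW
        total += np.sum((gram - np.eye(gram.shape[0])) ** 2)
    return float(total)

def total_loss(task_loss, delta_ws, lam):
    """L = L_task + lambda * L_ortho, with task_loss supplied as a number."""
    return task_loss + lam * ortho_penalty(delta_ws)

# Orthonormal columns (from a reduced QR factorization) give zero penalty.
q, _ = np.linalg.qr(np.random.default_rng(0).normal(size=(6, 3)))
penalty_orthonormal = ortho_penalty([q])

# Two identical unit columns: Gram = [[1, 1], [1, 1]], so the penalty is 2.
dup = np.array([[1.0, 1.0], [0.0, 0.0]])
penalty_duplicate = ortho_penalty([dup])
```

The diagonal of the Gram matrix carries the squared column norms and the off-diagonal entries carry the pairwise inner products, so a single Frobenius-norm term penalizes both non-unit norms and non-orthogonal columns.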
Theoretical Unification with TTA:
- Function: Reveals the common mechanism underlying the success of different methods.
- Mechanism: The paper proves that OrthoReg and TTA (Tangent Task Arithmetic), despite their different implementations, both promote disentanglement by achieving orthogonality between task vectors (\(\langle \tau_t, \tau_j \rangle \approx 0\)). TTA achieves this implicitly via the NTK geometry of the model but at high computational cost (doubled memory, 2–3× training time). OrthoReg achieves this explicitly through a regularization term, making it more direct and efficient.
- Design Motivation: A unified theoretical perspective aids in understanding the essence of existing methods and guides the design of future approaches.

Loss & Training

The total loss is \(\mathcal{L} = \mathcal{L}_{\text{task}} + \lambda \cdot \mathcal{L}_{\text{ortho}}\), where \(\lambda\) is chosen on a validation set from the range \([0.1, 100]\). During training, the text encoder is frozen and only the image encoder is updated. At merge time, a single scaling coefficient \(\alpha\) is shared by all task vectors (\(\alpha_t = \alpha\) for every task) and is selected by grid search over \(\{0.0, 0.05, \ldots, 1.0\}\).
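
The grid search over \(\alpha\) can be sketched as follows. The function `toy_validation_accuracy` is a hypothetical stand-in for "merge with this \(\alpha\), then evaluate on held-out data"; its shape and the helper names are assumptions, not details from the paper.

```python
def alpha_grid(step=0.05, stop=1.0):
    """Return [0.0, 0.05, ..., 1.0], built from integers to avoid float drift."""
    n = int(round(stop / step))
    return [round(i * step, 2) for i in range(n + 1)]

def select_alpha(validation_accuracy, alphas):
    """Pick the alpha with the highest validation accuracy (first one on ties)."""
    return max(alphas, key=validation_accuracy)

# Hypothetical stand-in for "merge with this alpha, then evaluate".
def toy_validation_accuracy(alpha):
    return 1.0 - (alpha - 0.3) ** 2

alphas = alpha_grid()
best_alpha = select_alpha(toy_validation_accuracy, alphas)
```

Because only one scalar is searched, the cost is one evaluation of the merged model per grid point (21 here), with no additional training.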

Key Experimental Results

Main Results

Task Addition (8 tasks, ViT-L-14):

| Method | Absolute Acc. | Normalized Acc. | Gain (abs. acc., pp) |
|---|---|---|---|
| Non-lin. FT | 84.07% | 89.19% | - |
| Non-lin. FT + OrthoReg | 88.23% | 100.08% | +4.16 |
| TTA | 86.19% | 93.14% | - |
| TTA + OrthoReg | 87.52% | 96.44% | +1.33 |
| ATT-FT | 87.81% | 93.59% | - |
| ATT-FT + OrthoReg | 90.41% | 100.05% | +2.60 |
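
For reading the normalized accuracy figures: the summary does not spell out the definition, so the sketch below assumes the convention common in the task arithmetic literature, namely the merged model's accuracy on each task divided by the accuracy of that task's individually fine-tuned model, averaged over tasks. The numbers are made up and are not taken from the paper.

```python
def normalized_accuracy(merged_acc, single_task_acc):
    """Mean over tasks of merged accuracy / single-task accuracy, in percent."""
    ratios = [m / s for m, s in zip(merged_acc, single_task_acc)]
    return 100.0 * sum(ratios) / len(ratios)

# Made-up accuracies for three tasks (not taken from the paper).
merged = [90.0, 80.0, 99.0]
single = [90.0, 100.0, 90.0]
score = normalized_accuracy(merged, single)
```

Under this convention a value above 100% means that, on average, the merged model does at least as well as the single-task models it was built from.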

Task Negation (forgetting the target task, ViT-L-14):

| Method | Target Acc. ↓ | Control Acc. ↑ | Forgetting Gain (Δ target acc., pp) |
|---|---|---|---|
| ATT-FT | 24.85% | 76.42% | - |
| ATT-FT + OrthoReg | 14.67% | 75.40% | −10.18 |

Ablation Study

| Configuration | Absolute Acc. | Notes |
|---|---|---|
| ATT-FT + OrthoReg (ViT-L-14) | 90.41% | Full method |
| ATT-FT (no regularization) | 87.81% | −2.60 pp without OrthoReg |
| LoRA-ATT + OrthoReg | 89.16% | Effective with PEFT as well |
| LoRA-ATT (no regularization) | 87.02% | −2.14 pp without OrthoReg |

Key Findings

Normalized accuracy exceeding 100%: Non-lin. FT + OrthoReg reaches 100.08% on ViT-L-14 (and ATT-FT + OrthoReg 100.05%), meaning the merged multi-task model on average matches or slightly surpasses the 8 independently fine-tuned single-task models, which is close to ideal weight disentanglement.
Cosine similarity between task vectors substantially reduced: OrthoReg drives the cosine similarity between different task vectors close to 0, directly validating the theoretically predicted "angular control" mechanism.
Robustness to hyperparameters: Performance improves steadily as \(\lambda\) increases, and the method consistently outperforms baselines across a wide range of \(\alpha\) values.
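
The cosine similarity reported in the second finding can be measured with a sketch like the one below. It assumes task vectors are stored as dictionaries of arrays and flattened into a single vector before comparison; the helper names and toy values are hypothetical.

```python
import numpy as np

def flatten(tau):
    """Concatenate all parameter tensors of a task vector into one 1-D array."""
    return np.concatenate([tau[name].ravel() for name in sorted(tau)])

def cosine_similarity(tau_a, tau_b):
    """Cosine of the angle between two flattened task vectors."""
    a, b = flatten(tau_a), flatten(tau_b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy task vectors (made-up values).
tau_1 = {"w": np.array([[1.0, 0.0], [2.0, 0.0]])}
tau_2 = {"w": np.array([[0.0, 3.0], [0.0, 1.0]])}
tau_3 = {"w": np.array([[2.0, 0.0], [4.0, 0.0]])}

cos_12 = cosine_similarity(tau_1, tau_2)  # updates touch different columns
cos_13 = cosine_similarity(tau_1, tau_3)  # tau_3 is a multiple of tau_1
```

In the toy data, `tau_1` and `tau_2` touch different columns and are orthogonal, while `tau_3` points in the same direction as `tau_1`, which is the fully entangled extreme.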

Highlights & Insights

The causal chain TFS → weight vector orthogonality (WVO) → weight disentanglement (WD) is particularly elegant: identifying Task Feature Specialization as the common cause connecting functional and geometric properties provides a template for turning abstract principles into actionable constraints. The underlying strategy, "if the direct cause cannot be enforced, enforce its consequence", is broadly transferable.
Normalized accuracy exceeding 100% is the most striking result: it indicates that orthogonal constraints not only reduce inter-task interference but can let the merged model slightly surpass the individual fine-tuned models on average, suggesting that the regularization itself brings additional benefits.
The simplicity of OrthoReg is commendable: a single regularization term \(\|(\Delta W)^\top \Delta W - I\|_F^2\) requires no architectural modifications or changes to the inference pipeline and can be directly integrated into any fine-tuning pipeline.

Limitations & Future Work

The theoretical analysis relies on the NTK linearization assumption; applicability to deep nonlinear networks warrants further investigation.
Experiments are conducted solely on CLIP-based ViTs; validation on other pre-training paradigms (e.g., MAE, DINOv2) is absent.
Only 8 classification tasks are considered; performance on a larger number of tasks (e.g., 20+) or heterogeneous task types (detection, segmentation) remains unverified.
Orthogonality constraints may be overly restrictive when the number of columns \(d\) greatly exceeds the number of rows \(m\), potentially limiting model expressiveness.
Future work could explore adaptive orthogonal constraints that adjust constraint strength according to task similarity.
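
The concern about wide matrices (\(d > m\)) can be made concrete. For \(\Delta W \in \mathbb{R}^{m \times d}\), the Gram matrix \((\Delta W)^\top \Delta W\) has rank at most \(m\), so at least \(d - m\) of its eigenvalues are zero and the penalty \(\|(\Delta W)^\top \Delta W - I\|_F^2\) can never drop below \(d - m\). The sketch below checks this numerically for the penalty as written in this summary; whether the paper handles wide layers differently (for example by transposing) is not stated here, and the names are hypothetical.

```python
import numpy as np

def gram_penalty(dw):
    """||dW^T dW - I||_F^2 for a single update matrix."""
    gram = dw.T @ dw
    return float(np.sum((gram - np.eye(gram.shape[0])) ** 2))

m, d = 3, 5           # more columns than rows
lower_bound = d - m   # at least d - m Gram eigenvalues are zero

# Orthonormal rows attain the bound: W = [I_m | 0].
w_best = np.hstack([np.eye(m), np.zeros((m, d - m))])
penalty_best = gram_penalty(w_best)

# Random wide matrices never go below the bound.
rng = np.random.default_rng(0)
penalties = [gram_penalty(rng.normal(size=(m, d))) for _ in range(100)]
```

The regularizer therefore always contributes a nonzero floor for such layers, which is one concrete sense in which the constraint may be too strict.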

vs. TTA (Tangent Task Arithmetic): TTA implicitly achieves task vector orthogonality via tangent-space linearization, but incurs substantial computational overhead (2–3× training time). OrthoReg explicitly enforces orthogonality and is more efficient; the two methods arrive at the same destination by different routes.
vs. TIES-Merging / DARE: These are during-merging methods that reduce interference through pruning or sign voting. OrthoReg is a pre-merging method that generates high-quality task vectors at the source; the two approaches are complementary.

Rating

Novelty: ⭐⭐⭐⭐⭐ The theoretical chain TFS → orthogonality → disentanglement is novel and complete; OrthoReg is elegantly designed.
Experimental Thoroughness: ⭐⭐⭐⭐ Comprehensive comparisons across three model scales and multiple baselines, though task types are limited.
Writing Quality: ⭐⭐⭐⭐⭐ The derivation logic from theory to method to experiments is clear and coherent throughout.
Value: ⭐⭐⭐⭐⭐ Provides a deep theoretical foundation for task arithmetic; OrthoReg offers strong practical utility as a plug-and-play regularizer.