Unified Privacy Guarantees for Decentralized Learning via Matrix Factorization¶

Conference: ICLR 2026 arXiv: 2510.17480 Code: None Area: AI Security / Differential Privacy / Decentralized Learning Keywords: decentralized learning, matrix factorization, differential privacy, correlated noise, gossip protocol

TL;DR¶

This paper unifies diverse algorithms and trust models in decentralized learning (DL) under a matrix factorization (MF) mechanism framework, extends privacy guarantees to more general matrix types, and proposes the MAFALDA-SGD algorithm that significantly outperforms existing methods on both synthetic and real-world graph topologies by optimizing noise correlation.

Background & Motivation¶

Background: Decentralized learning (DL) enables users to collaboratively train models over peer-to-peer communication graphs without sharing raw data. Strong privacy guarantees are typically achieved via differential privacy (DP). Centralized DP-SGD has a mature MF-based noise correlation analysis framework that leverages temporal noise correlations to improve privacy-utility trade-offs. However, MF methods had previously only been applied in centralized settings.

Limitations of Prior Work: (1) DP analysis in DL relies on ad hoc proofs tailored to specific algorithms and trust models, lacking a unified framework; (2) existing analyses fail to fully exploit the noise correlations arising from redundant exchanges in peer-to-peer communication, leading to overly pessimistic privacy guarantees; (3) existing MF theory requires the workload matrix to be square, full-rank, and lower-triangular — conditions not satisfied in DL settings.

Key Challenge: The matrices arising from the distributed communication structure of DL violate the assumptions of existing MF theory; moreover, adversaries under different trust models (LDP, PNDP, SecLDP) possess different levels of knowledge, necessitating a unified treatment.

Goal: (1) Encode DL algorithms as a single matrix multiplication; (2) unify adversary knowledge representations across different trust models; (3) extend MF-based DP guarantees to rectangular and possibly rank-deficient matrices; (4) design new algorithms by optimizing noise correlations.

Key Insight: All \(T\) iterations of decentralized SGD can be unrolled into a single matrix equation \(\theta = (I_T \otimes W)(M\theta_0 - \eta \mathbf{W}_T(G + C^\dagger Z))\), thereby bringing DL into the MF analysis framework.

Core Idea: Represent the gradient-noise interaction in decentralized learning uniformly as \(\mathcal{O}_\mathcal{A} = AG + BZ\) (where \(A = BC\)), generalize the GDP privacy guarantee of MF, and optimize the noise correlation matrix \(C\) to derive the new algorithm MAFALDA-SGD.

Method¶

Overall Architecture¶

The approach proceeds in two steps: (1) Unified modeling — define linear DL algorithms (Definition 4) and adversary knowledge (Definition 5), and prove that all existing DL algorithms and trust models can be expressed as \(\mathcal{O}_\mathcal{A} = AG + BZ\) (Theorem 6); (2) Algorithm design — optimize the local noise correlation matrix \(C_{local}\) under LDP constraints to obtain MAFALDA-SGD.

Key Designs¶

Matrix Factorization Encoding of Decentralized Learning:
- Function: Unroll multi-round DL algorithm iterations into a unified matrix form.
- Mechanism: \(n\) nodes execute \(T\) rounds on communication graph \(\mathcal{G}\): each round performs a local gradient step \(\theta_{t+1/2} = \theta_t - \eta(G_t + C_t^\dagger Z)\) followed by gossip averaging \(\theta_{t+1} = W\theta_{t+1/2}\). Stacking \(T\) rounds yields \(\theta = (I_T \otimes W)(M\theta_0 - \eta \mathbf{W}_T(G + C^\dagger Z))\), where \(\mathbf{W}_T \in \mathbb{R}^{nT \times nT}\) is a lower-triangular Toeplitz block matrix.
- Design Motivation: Encoding the temporal unrolling of DL as a single matrix enables direct application of MF theory.
Generalized MF Privacy Guarantee (Theorem 8):
- Function: Extend MF-based DP guarantees from square/full-rank/lower-triangular matrices to rectangular/rank-deficient/column-echelon matrices.
- Mechanism: Define the generalized sensitivity \(\text{sens}_\Pi(C; B) = \max_{G \simeq_\Pi G'}\|C(G-G')\|_{B^\dagger B}\), and prove that when \(A\) is in column-echelon form and \(A = BC\), the mechanism \(\mathcal{M}\) is \(1/\sigma\)-GDP. The key correction is that \(B^\dagger B\) projects onto the row space of \(B\), discarding gradient combinations that are unobservable to the adversary.
- Design Motivation: In DL, adversaries observe only a subset of messages, so the resulting matrix \(A\) is typically rectangular and rank-deficient.
MAFALDA-SGD Algorithm (Optimized Noise Correlation):
- Function: Maximize the privacy-utility trade-off under LDP by optimizing the local noise correlation matrix.
- Mechanism: Constrain \(C = C_{local} \otimes I_n\) (noise correlated only across time steps within each node), define the optimization objective \(\mathcal{L}_{opti}(\mathbf{W}_T, B, C) = \text{sens}_\Pi(C;B)^2 \|(I_T \otimes W)\mathbf{W}_T C^\dagger\|_F^2\), and solve for the optimal \(C_{local}\) via convex optimization.
- Design Motivation: The correlation patterns of existing methods (e.g., AntiPGD) are not optimized for decentralized settings and perform poorly when directly applied to DL.

Loss & Training¶

Standard DP-SGD training: per-sample gradients are clipped to norm \(\Delta_g\), and Gaussian noise \(Z \sim \mathcal{N}(0, \Delta_g^2 \sigma^2)^{nT \times d}\) is added. The gossip matrix \(W\) satisfies stochastic matrix conditions. Time-varying gossip matrices \(W_t\) are supported. Privacy accounting is performed using the Gaussian DP (GDP) framework.

Key Experimental Results¶

Main Results (Privacy Guarantee Comparison)¶

Algorithm	Trust Model	Privacy Guarantee	vs. Prior Analysis
DP-D-SGD	LDP	MF analysis (Ours)	Tighter
DP-D-SGD	PNDP	MF analysis (Ours)	Tighter
Zip-DL	SecLDP	MF analysis (Ours)	Tighter
MAFALDA-SGD	LDP	Optimized correlated noise	Significantly outperforms all existing methods

Ablation Study (Noise Correlation Strategy Comparison)¶

Strategy	Privacy-Utility Trade-off	Notes
No correlation (DP-D-SGD)	Baseline	Independent noise at each step
AntiPGD (centralized strategy)	Worse than baseline	Centralized correlation pattern unsuitable for DL
MAFALDA-SGD	Optimal	Correlation pattern optimized for decentralized settings

Key Findings¶

The unified framework yields tighter privacy guarantees for all existing DL algorithms under all trust models.
Noise correlation strategies effective in centralized settings (e.g., AntiPGD) perform worse when directly applied to DL, demonstrating the need for decentralization-specific optimization.
MAFALDA-SGD significantly outperforms existing methods on both synthetic graphs (ring, grid) and real-world social network graphs.
The column-echelon condition (a generalization of lower-triangularity) ensures that privacy guarantees remain valid under adaptive gradient queries.

Highlights & Insights¶

Strong theoretical contribution: The paper seamlessly extends MF theory from centralized to decentralized settings, unifying the fragmented landscape of DP-DL analysis. The generalization in Theorem 8 (rectangular/rank-deficient matrices + generalized sensitivity) constitutes the core contribution.
Practical value: The framework not only unifies the analysis of existing algorithms but also directly guides the design of new ones. MAFALDA-SGD demonstrates the substantial potential of optimizing noise correlations for privacy in DL.

Limitations & Future Work¶

MAFALDA-SGD currently supports only LDP; extending noise optimization to PNDP and SecLDP is a natural direction.
The computational cost of optimizing \(C_{local}\) grows with \(T\), potentially rendering it infeasible for large-scale, long-horizon training.
Experiments are primarily conducted in convex optimization settings; effectiveness in non-convex deep learning training remains to be examined.

vs. Cyffers et al. (2022/2024): Originators of PNDP, but their analysis targets specific algorithms; this paper provides a unified framework.
vs. Denisov et al. (2022): Founders of the centralized MF mechanism, requiring square/full-rank/lower-triangular matrices; this paper generalizes to more general cases.
vs. Biswas et al. (2025): Zip-DL employs ad hoc correlated noise without optimization; the proposed framework enables systematic optimization.

Rating¶

Novelty: ⭐⭐⭐⭐⭐ — Both the unified modeling framework and the generalized MF privacy guarantees represent significant theoretical contributions.
Experimental Thoroughness: ⭐⭐⭐ — Experiments are primarily theory-driven; large-scale deep learning scenarios are lacking.
Writing Quality: ⭐⭐⭐⭐ — The notation system is complex but definitions are clear, with a well-structured progression.
Value: ⭐⭐⭐⭐ — Provides theoretical infrastructure for the DP-DL field with potential long-term impact.