Learning Conformational Ensembles of Proteins Based on Backbone Geometry
Conference: NeurIPS 2025 · arXiv: 2503.05738 · Code: GitHub · Area: Medical Imaging / Computational Biology · Keywords: protein conformational ensembles, flow matching, backbone geometry, molecular dynamics simulation
TL;DR
This paper proposes BBFlow, a flow matching generative model based on protein backbone geometry for conformational ensemble sampling. BBFlow requires neither evolutionary sequence information nor pretrained folding models, achieves inference speeds more than an order of magnitude faster than AlphaFlow, and generalizes to multi-chain proteins.
Background & Motivation
Protein function depends on structural dynamics—the conformational ensemble accessible at thermodynamic equilibrium (Boltzmann distribution). Molecular dynamics (MD) simulation is the conventional tool for sampling such ensembles, but MD requires prohibitively long simulation times to overcome local free-energy minima, incurring enormous computational cost.
Deep generative models have recently been proposed as alternatives to MD. Current state-of-the-art methods such as AlphaFlow rely on fine-tuning pretrained folding models (e.g., AlphaFold 2) and require evolutionary sequence information (MSAs or protein language model weights). This introduces three core problems:
Efficiency bottleneck: Dependence on large pretrained folding models requires predicting the overall fold from sequence at every inference step.
Information bias: Evolutionary information (MSAs) is unavailable or extremely scarce for de novo designed proteins, leading to modeling bias.
Limited applicability: Existing methods are restricted to single-chain proteins and cannot handle multi-chain protein complexes.
The core idea of this work is to decouple conformational ensemble generation from structure prediction, learning the conformational distribution solely from the geometric information of the protein backbone rather than from sequence information. By conditioning on geometric encodings of the equilibrium structure, the method eliminates the need for evolutionary information while substantially improving inference efficiency.
Method
Overall Architecture
BBFlow frames conformational ensemble prediction as a conditional structure generation task: given the equilibrium backbone structure \(x_{\text{eq}}\) of a protein, learn the conditional distribution \(p(x|x_{\text{eq}})\). Concretely, an SE(3) flow matching model is adopted: each residue of the backbone is represented as a Euclidean frame \((r, z) \in \text{SE}(3)\) (rotation + translation), and a conditional flow vector field is learned on the \(\text{SE}(3)^N\) manifold.
Key Designs
1. Geometric Encoding of the Equilibrium Structure
To incorporate the equilibrium structure as a conditional input, BBFlow introduces two complementary encoding schemes:
Distance encoding: Pairwise Euclidean distances between residues are discretized into one-hot bin features: the range 0–20 Å is divided uniformly into 22 bins, which serve as initial edge features. This encoding plays a role analogous to the contact maps derived from evolutionary information.
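As a concrete illustration, the distance binning can be sketched in a few lines of NumPy. This is a minimal sketch, not the paper's implementation; the exact bin-boundary convention and feature layout are assumptions:

```python
import numpy as np

def distance_bin_features(coords: np.ndarray, n_bins: int = 22,
                          d_min: float = 0.0, d_max: float = 20.0) -> np.ndarray:
    """One-hot bin pairwise residue distances into edge features.

    coords: (N, 3) array of residue positions (e.g. C-alpha atoms).
    Returns an (N, N, n_bins) one-hot array; distances >= d_max fall
    into the last bin. Boundary placement is an assumption here.
    """
    diff = coords[:, None, :] - coords[None, :, :]      # (N, N, 3) displacements
    dist = np.linalg.norm(diff, axis=-1)                # (N, N) distances in Å
    edges = np.linspace(d_min, d_max, n_bins)           # uniform bin boundaries
    idx = np.clip(np.digitize(dist, edges[1:]), 0, n_bins - 1)
    return np.eye(n_bins)[idx]                          # (N, N, n_bins) one-hot
```

Each pair of residues thus contributes exactly one active bin, giving the network a coarse, rotation-invariant picture of global proximity.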
Orientation encoding: For residue pairs within 5 Å, equivariant pairwise direction vectors are computed and transformed into the local coordinate frame of residue \(i\); the resulting components are invariant quantities that are used together with the distance encoding as edge features.
Design Motivation: Distance encoding captures global spatial proximity relationships (analogous to contact maps), while orientation encoding provides finer-grained local geometric information. Together, these two encodings replace the role of evolutionary information in conventional approaches.
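A minimal sketch of the orientation encoding, assuming each residue carries a 3×3 local-frame rotation matrix (the frame convention here is hypothetical, not the paper's). The key property it demonstrates is that rotating the whole structure leaves the features unchanged:

```python
import numpy as np

def orientation_features(coords: np.ndarray, frames: np.ndarray,
                         cutoff: float = 5.0) -> dict:
    """Pairwise unit direction vectors expressed in each residue's local frame.

    coords: (N, 3) residue positions; frames: (N, 3, 3) local-frame rotation
    matrices (hypothetical convention). Returns {(i, j): 3-vector} for pairs
    within `cutoff` Å; the vectors are invariant under global rotations.
    """
    feats = {}
    n = len(coords)
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = coords[j] - coords[i]
            r = np.linalg.norm(d)
            if r < cutoff:
                # world-frame direction rotated into residue i's local frame
                feats[(i, j)] = frames[i].T @ (d / r)
    return feats
```

Because both the displacement and the frame rotate together under a global rotation \(R\), the feature \(R_i^T d\) is unchanged, which is exactly what makes it safe to use as a plain (non-equivariant) edge feature.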
2. Conditional Prior Distribution
Unlike conventional flow matching, which uses an unconditional prior, BBFlow proposes a conditional prior distribution \(p_0(x|x_{\text{eq}})\). Prior samples are generated by geodesic interpolation between unconditional prior samples and the equilibrium structure:

\[
x_0 = \gamma(\xi), \qquad x_{\text{uncond}} \sim p_{\text{uncond}},
\]

where \(\gamma\) denotes the geodesic on \(\text{SE}(3)^N\) between \(x_{\text{uncond}}\) and \(x_{\text{eq}}\), and \(\xi = 0.2\) controls the proximity of prior samples to the equilibrium structure. This can be viewed as a generalization of partial denoising from diffusion models to the flow matching framework.
Design Motivation: The conditional prior ensures that the initial noise distribution already encodes coarse structural information about the target protein, reducing the learning burden on the flow matching model and enabling high-quality conformations to be generated with fewer time steps.
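The interpolation can be sketched with SciPy rotations: slerp the rotational part along the SO(3) geodesic and linearly interpolate translations. This is a schematic sketch, not the paper's code; the noise scales and the direction convention for \(\xi\) (here, the fraction moved from the unconditional sample toward \(x_{\text{eq}}\)) are assumptions:

```python
import numpy as np
from scipy.spatial.transform import Rotation

def conditional_prior_sample(r_eq: Rotation, z_eq: np.ndarray,
                             xi: float = 0.2, seed=None):
    """Sample per-residue prior frames on the geodesic toward equilibrium.

    r_eq: equilibrium rotations (Rotation of length N); z_eq: (N, 3)
    equilibrium translations. xi = 0 gives the unconditional prior,
    xi = 1 recovers the equilibrium structure exactly.
    """
    rng = np.random.default_rng(seed)
    n = len(z_eq)
    r_un = Rotation.random(n, random_state=seed)   # uniform random rotations
    z_un = rng.normal(size=(n, 3))                 # isotropic Gaussian translations
    # SO(3) geodesic: scale the relative rotation's axis-angle vector by xi
    delta = r_eq * r_un.inv()
    r0 = Rotation.from_rotvec(xi * delta.as_rotvec()) * r_un
    # R^3 geodesic: plain linear interpolation
    z0 = (1 - xi) * z_un + xi * z_eq
    return r0, z0
```

With \(\xi = 0.2\), the prior is still mostly noise but already carries coarse information about the target structure, which is the point of the design.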
3. Model Architecture
BBFlow adopts the GAFL architecture (an extension of FrameDiff), whose core is an SE(3)-equivariant graph neural network employing Clifford Frame Attention (CFA). Key innovations include:
- Removal of residue index encoding: Residue position indices within the chain are not used as input features, since this information is already implicitly encoded in the distance matrix of the equilibrium structure. This reduces memorization effects and allows the model to transfer naturally from single-chain to multi-chain proteins.
- Amino acid type encoding: One-hot encoding followed by a linear projection into 128-dimensional embeddings, providing information about local degrees of freedom.
- 6 message-passing blocks that iteratively update frames and node/edge features.
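The node-feature construction described above can be sketched as follows. Note what is absent as much as what is present: there is no positional-index feature. The alphabet ordering and the randomly initialized (untrained) projection are illustrative assumptions:

```python
import numpy as np

AA = "ACDEFGHIKLMNPQRSTVWY"   # 20 standard amino acids (ordering is an assumption)

def aa_node_features(sequence: str, dim: int = 128, seed: int = 0) -> np.ndarray:
    """One-hot encode amino-acid types, then linearly project to `dim` features.

    Deliberately omits any residue-index encoding: chain position is left
    to be inferred from the equilibrium-structure distance features.
    """
    rng = np.random.default_rng(seed)
    idx = np.array([AA.index(a) for a in sequence])
    onehot = np.eye(len(AA))[idx]                            # (N, 20)
    W = rng.normal(scale=len(AA) ** -0.5, size=(len(AA), dim))  # untrained weights
    return onehot @ W                                        # (N, dim)
```

Because identical residue types map to identical embeddings regardless of position, any positional signal must come from the geometric edge features, which is what lets the model transfer to multi-chain inputs.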
Loss & Training
The flow matching loss is the mean squared error between the predicted and target conditional flow vector fields:

\[
\mathcal{L} = \mathbb{E}_{t,\, x_t}\left[\big\|v_\theta(x_t, t \mid x_{\text{eq}}) - u_t(x_t)\big\|^2_{\text{SE}(3)}\right],
\]

where the metric is defined as \(\|v\|^2_{\text{SE}(3)} = \text{Tr}(v_r v_r^T)/2 + \|v_z\|^2_2\), and auxiliary losses are also employed.
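The SE(3) metric can be computed directly from its definition. A minimal sketch, assuming the rotational part is given as a skew-symmetric \(\mathfrak{so}(3)\) matrix; the per-residue averaging is schematic, not the paper's implementation:

```python
import numpy as np

def se3_norm_sq(v_r: np.ndarray, v_z: np.ndarray) -> float:
    """Squared SE(3) norm  ||v||^2 = Tr(v_r v_r^T)/2 + ||v_z||^2_2.

    v_r: (3, 3) rotational tangent (skew-symmetric); v_z: (3,) translational part.
    """
    return float(np.trace(v_r @ v_r.T) / 2.0 + np.sum(v_z ** 2))

def flow_matching_loss(v_pred, v_target) -> float:
    """MSE between predicted and target vector fields, averaged over residues.

    Each argument is a list of (v_r, v_z) pairs, one per residue.
    """
    errs = [se3_norm_sq(pr - tr, pz - tz)
            for (pr, pz), (tr, tz) in zip(v_pred, v_target)]
    return float(np.mean(errs))
```

For a skew-symmetric \(v_r\) with axis-angle vector \(\omega\), \(\text{Tr}(v_r v_r^T)/2 = \|\omega\|^2\), so the metric weights rotational and translational errors on comparable scales.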
Training is conducted on the ATLAS dataset (1,265 proteins for training). Training from scratch requires only 2 A100 GPUs for 3 days, with the number of time steps set to 20.
Key Experimental Results
Main Results (ATLAS Benchmark)
| Method | RMSF \(r\)↑ | RMSF MAE↓ | Pw-RMSD MAE↓ | DCCM \(r\)↑ | PCA \(\mathcal{W}_2\)↓ | Inference Time (s)↓ |
|---|---|---|---|---|---|---|
| AlphaFlow | 0.86 | 0.59 | 1.35 | 0.86 | 1.47 | 32.0 |
| AlphaFlow-T | 0.92 | 0.41 | 0.91 | 0.89 | 1.28 | 32.6 |
| AlphaFlow-T_dist | 0.92 | 0.68 | 1.41 | 0.88 | 1.43 | 3.3 |
| ConfDiff | 0.88 | 0.62 | 1.45 | 0.86 | 1.41 | 20.2 |
| BBFlow | 0.90 | 0.42 | 0.77 | 0.87 | 1.33 | 0.8 |
BBFlow achieves accuracy comparable to AlphaFlow-T while being approximately 40× faster at inference.
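For reference, the RMSF and RMSF-\(r\) columns follow standard definitions: per-residue root-mean-square fluctuation about the ensemble mean, compared to the MD ground truth via Pearson correlation. A standard-definition sketch (assuming pre-aligned conformations; not the paper's evaluation code):

```python
import numpy as np

def rmsf(ensemble: np.ndarray) -> np.ndarray:
    """Per-residue RMSF for an (M, N, 3) ensemble of M aligned conformations."""
    mean = ensemble.mean(axis=0)                          # (N, 3) mean structure
    return np.sqrt(((ensemble - mean) ** 2).sum(-1).mean(0))  # (N,) fluctuations

def pearson_r(a: np.ndarray, b: np.ndarray) -> float:
    """Pearson correlation between two per-residue profiles."""
    a, b = a - a.mean(), b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))
```

The MAE columns are then plain mean absolute errors between such predicted and reference profiles (RMSF in Å).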
Ablation Study
| Configuration | RMSF MAE↓ | Pw-RMSD MAE↓ | DCCM \(r\)↑ | Notes |
|---|---|---|---|---|
| BBFlow (full) | 0.42 | 0.77 | 0.87 | All components |
| w/o orientation encoding | 0.52 | 1.15 | 0.85 | Distance encoding only |
| w/o conditional prior | 0.48 | 0.90 | 0.86 | Unconditional prior |
| w/ residue index | 0.42 | 0.82 | 0.88 | Position indices added |
| w/o amino acid encoding | 0.54 | 0.93 | 0.85 | Pure geometry only |
| w/o distance encoding | 5.88 | 7.08 | 0.55 | Complete failure |
Key Findings
- De novo proteins: BBFlow performs robustly on de novo proteins (RMSF MAE = 0.26), whereas AlphaFlow (without templates), which relies on evolutionary information, fails severely (MAE = 4.76).
- Multi-chain proteins: BBFlow is the first conformational ensemble generation model applicable to multi-chain proteins; despite being trained exclusively on single-chain data, it successfully captures both inter-chain and intra-chain motion correlations.
- Speed–accuracy trade-off: BBFlow achieves the best speed–accuracy balance among all baselines, requiring only 0.8 seconds per conformation for a 300-residue protein.
Highlights & Insights
- Geometric information as a substitute for evolutionary information: This work demonstrates that protein conformational sampling does not necessarily require evolutionary sequence information—backbone geometry alone is sufficient to reach state-of-the-art performance, with far-reaching implications for the field of computational biology.
- Conditional prior as a key innovation: The generalization of partial denoising to the flow matching framework via a conditional prior constitutes a broadly applicable technical contribution.
- Residue index removal enables cross-chain transfer: By replacing positional indices with structural encodings, the method elegantly solves the single-chain-to-multi-chain transfer problem.
Limitations & Future Work
- As a surrogate for MD simulation, BBFlow cannot predict conformations far from equilibrium (e.g., alternative folded states) unless trained on correspondingly long MD simulation data.
- Transient contact prediction accuracy is lower than that of MSA-based methods, suggesting that evolutionary information still has value for predicting rare events.
- The current model generates backbone conformations only, without side-chain conformational ensembles or protein–ligand interactions.
Related Work & Insights
- The AlphaFlow series achieves conformational sampling by fine-tuning AlphaFold 2, but is constrained by the pretrained model.
- FrameFlow / GAFL employs SE(3) flow matching for protein design; BBFlow adapts this framework to conditional generation.
- The conditional prior concept is generalizable to other conditional generation tasks.
Rating
- Novelty: ⭐⭐⭐⭐⭐ Conditional prior + geometric encodings as a replacement for evolutionary information; the core idea is novel and thoroughly validated.
- Experimental Thoroughness: ⭐⭐⭐⭐⭐ Covers natural proteins, de novo proteins, and multi-chain proteins, with comprehensive ablation studies.
- Writing Quality: ⭐⭐⭐⭐ Clear structure, rigorous mathematics, and rich figures and tables.
- Value: ⭐⭐⭐⭐⭐ 40× speedup, no evolutionary information required, and multi-chain generalization—practical application value is extremely high.