Quantifying the Role of OpenFold Components in Protein Structure Prediction

Conference: NeurIPS 2025 (Workshop)
arXiv: 2511.14781
Code: None (based on the OpenFold open-source implementation)
Area: Protein Structure Prediction / Interpretability
Keywords: OpenFold, AlphaFold2, Evoformer, Component Ablation, Protein Length

TL;DR

This paper proposes a systematic methodology for evaluating the contribution of individual Evoformer components in OpenFold/AlphaFold2 to protein structure prediction accuracy. The study finds that MSA column attention and MLP Transition layers are the most critical components, and that the importance of multiple components is significantly correlated with protein sequence length.

Background & Motivation

AlphaFold2 and OpenFold have revolutionized the field of protein structure prediction, yet their internal mechanisms remain poorly understood. The core architecture, Evoformer, comprises diverse components—attention layers, Transition MLPs, triangular update operations, among others—but the relative contribution of each component to prediction accuracy has not been established.

Existing ablation studies have primarily focused on auxiliary losses, training strategies, or coarse-grained architectural changes (e.g., "removing all triangular operations"), lacking systematic evaluation of individual Evoformer components. This paper addresses that gap by conducting per-component skip/zeroing experiments to reveal which components are universally critical, which are dispensable, and how importance varies with protein properties.

Given that subsequent models such as AlphaFold3 and Boltz retain the same Transformer-plus-triangular-operations architecture, the findings of this work carry broad applicability.

Method

Overall Architecture

OpenFold's protein structure prediction proceeds in three stages:

  1. Preprocessing: generates the MSA (multiple sequence alignment) representation and the Pair (residue-pair) representation.
  2. Evoformer processing: iteratively refines both representations through 48 Evoformer blocks.
  3. Structure Module: outputs the 3D structure from the refined representations.

Each Evoformer block contains two pathways:

  • MSA pathway: MSA row attention → MSA column attention → MSA Transition (MLP)
  • Pair pathway: Outer product mean (connecting MSA → Pair) → Triangular multiplicative update → Triangular attention → Pair Transition (MLP)
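The two pathways can be sketched as a toy forward pass. This is a minimal illustration of the data flow only; the `ops` keys and function names below are placeholders, not OpenFold's actual module names or API.

```python
# Illustrative data flow of one Evoformer block. `ops` stands in for the
# block's submodules; all names here are hypothetical placeholders.
def evoformer_block(msa, pair, ops):
    # MSA pathway
    msa = ops["msa_row_attn"](msa, pair)   # row attention uses a pair bias
    msa = ops["msa_col_attn"](msa)
    msa = ops["msa_transition"](msa)
    # Bridge from the MSA pathway into the Pair pathway
    pair = pair + ops["outer_product_mean"](msa)
    # Pair pathway
    pair = ops["tri_mul_update"](pair)
    pair = ops["tri_attn"](pair)
    pair = ops["pair_transition"](pair)
    return msa, pair
```

Component ablation then amounts to swapping one of these `ops` entries for an identity pass-through while leaving the rest of the flow untouched.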

Key Designs

Three categories of ablation experiments

  1. Skip attention modules: Bypass the operations of specific attention layers across all 48 Evoformer blocks.
  2. Skip non-attention modules / zero representations: Skip MLP layers or zero out the final representations before the structure module.
  3. Length correlation analysis: Fit a linear regression of ΔTM-score against protein sequence length and compute Spearman correlation.
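The skip-style ablations (categories 1 and 2) can be sketched as replacing a named submodule with an identity function in every block. The attribute names below (`blocks`, `msa_col_attn`) are assumptions for illustration, not OpenFold's real attribute names.

```python
# Hedged sketch of the skip ablation: replace one component with an identity
# pass-through across all blocks. Attribute names are hypothetical.
def identity(x, *args, **kwargs):
    return x

def skip_component(model, component_name):
    for block in model.blocks:          # e.g. all 48 Evoformer blocks
        setattr(block, component_name, identity)
    return model
```

The "zero representation" ablation in category 2 is analogous: instead of bypassing a module, the final MSA or Pair representation is replaced with zeros just before the structure module.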

Data filtering strategy

A three-month CAMEO subset is used (proteins with sequence length < 700). Targets with missing structure files or baseline TM-score < 0.7 are excluded, yielding a final set of 154 proteins.
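The filtering criteria can be expressed as a simple predicate over the target set. The dict keys (`length`, `has_structure`, `baseline_tm`) are assumed field names for illustration.

```python
# Illustrative CAMEO target filtering per the paper's stated criteria;
# the dict keys are assumptions, not a real dataset schema.
def filter_targets(targets):
    return [
        t for t in targets
        if t["length"] < 700            # drop long proteins
        and t["has_structure"]          # drop targets missing structure files
        and t["baseline_tm"] >= 0.7     # drop weak baseline predictions
    ]
```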

Evaluation Setup

OpenFold model_1_ptm weights and the original AlphaFold2 JAX weights are used; no training is performed. Recycling is disabled (zero recycles), and unrelaxed structure predictions are evaluated. Each protein is run three times and the results are averaged.

Key Experimental Results

Main Results

Attention component ablation (Figure 2a)

| Ablation | Median ΔTM | Impact |
| --- | --- | --- |
| Skip MSA column attention | Largest deviation | Most critical |
| Skip MSA row attention | Minor | Small effect on most proteins |
| Skip triangular attention | Negligible | Ignorable for most proteins |
| Only MSA column attention | 0.089 | This single component preserves most performance |
| Only MSA row attention | Large drop | Insufficient alone for structure prediction |
| Only triangular attention | Large drop | Insufficient alone for structure prediction |

Non-attention component ablation (Figure 2b)

| Ablation | Median ΔTM | Impact |
| --- | --- | --- |
| Skip Pair Transition | 0.765 | Highly critical |
| Skip MSA Transition | 0.829 | Most critical |
| Zero MSA representation | Minimal | Small effect on most proteins |
| Zero Pair representation | Large drop | Highly critical |
| Skip triangular multiplicative update | High variance | Protein-dependent |

Ablation Study

Correlation analysis between component importance and protein length (Table 1)

| Ablation | \(R^2\) | Spearman \(\rho\) | \(p\)-value | Trend |
| --- | --- | --- | --- | --- |
| Skip MSA column attention | 0.13 | 0.40 | 1.9e-7 | Longer proteins more dependent |
| Only MSA column attention | 0.02 | -0.13 | 0.11 | No significant correlation |
| Skip MSA row attention | 0.01 | -0.07 | 0.42 | No significant correlation |
| Skip triangular attention | 0.02 | -0.19 | 0.018 | Shorter proteins more dependent |
| Skip MSA Transition | 0.09 | 0.34 | 1.2e-5 | Longer proteins more dependent |
| Zero MSA representation | 0.21 | 0.46 | 1.3e-9 | Longer proteins more dependent |
| Skip triangular multiplicative update | 0.06 | 0.08 | 0.31 | No significant correlation |
| Skip Pair Transition | 0.26 | 0.56 | 3.8e-14 | Longer proteins more dependent |
| Zero Pair representation | 0.11 | 0.38 | 1.1e-6 | Longer proteins more dependent |
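The statistics in Table 1 can be computed in outline with pure Python. This is a minimal sketch (ties in the ranking are ignored, and \(R^2\) of a simple linear fit is taken as the squared Pearson correlation); in practice one would use SciPy's `spearmanr` and `linregress`.

```python
from statistics import mean

def _ranks(xs):
    # Rank positions 1..n (ties ignored for brevity)
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    for r, i in enumerate(order):
        ranks[i] = r + 1.0
    return ranks

def pearson(x, y):
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def spearman(x, y):
    # Spearman's rho = Pearson correlation of the rank-transformed data
    return pearson(_ranks(x), _ranks(y))

def r_squared(x, y):
    # For a simple (one-variable) linear fit, R^2 = squared Pearson correlation
    return pearson(x, y) ** 2
```

Here `x` would be the protein sequence lengths and `y` the per-protein ΔTM-scores for a given ablation.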

Key Findings

  1. MSA column attention is the most critical attention component: Retaining it alone recovers the majority of baseline performance (median ΔTM of only 0.089), indicating that OpenFold heavily relies on evolutionary sequence information.
  2. MLP Transition layers are indispensable: Skipping MSA/Pair Transitions causes the largest performance drops (0.765–0.829), consistent with findings in the Transformer interpretability literature that MLP layers encode critical semantic information.
  3. Pair representations are more important than MSA representations: Zeroing Pair representations causes a substantial drop, whereas zeroing MSA representations has minimal effect on most proteins.
  4. Among triangular operations, multiplicative updates are more important than triangular attention: Triangular attention has negligible impact on most proteins, while multiplicative updates exhibit high variance across proteins.
  5. Length dependence: Longer proteins rely more heavily on MSA-related features, while shorter proteins rely more on triangular attention—indicating that different proteins depend on different Evoformer components.

Highlights & Insights

  • This work presents the first systematic per-component ablation of the Evoformer at a granularity far exceeding that of prior studies.
  • The finding that MSA column attention alone is sufficient highlights the central role of evolutionary sequence information in structure prediction.
  • The length-dependence findings provide a new perspective for understanding prediction mechanisms across different protein types.
  • The heterogeneous contributions of triangular operations (multiplicative updates are critical but highly variable, while triangular attention is nearly negligible) challenge the simplistic view that triangular operations are uniformly important.
  • The work transfers methodological tools from Transformer interpretability research to the protein structure prediction domain.

Limitations & Future Work

  1. Only 154 proteins from a CAMEO subset are used, limiting statistical power.
  2. The influence of fold type on component importance is not analyzed, which may be a key factor explaining the observed heterogeneity.
  3. Component ablations are applied globally (all 48 blocks skipped simultaneously), leaving per-block or per-layer importance differences unexplored.
  4. The use of zero recycles simplifies the experimental setup, and the contribution of the recycling mechanism itself is not fully examined.
  5. The learning dynamics of individual components during training are not analyzed.

Related Work

  • AlphaFold2 interpretability: ExplainableFold (via residue deletion/substitution), SHAP-based analyses, etc. The present work is complementary in its focus on architectural components.
  • Protein language model interpretability: Sparse autoencoder analyses of ESM-2, studies correlating attention maps with protein properties.
  • Transformer interpretability: The finding that MLP layers encode critical semantic information aligns with the observed criticality of Transition layers in this work.
  • This paper provides guidance for architectural optimization of subsequent models such as AlphaFold3 and Boltz.

Rating

  • Novelty: ★★★☆☆ (Methodology is relatively straightforward, but the research question is important and underexplored)
  • Experimental Design: ★★★★☆ (Systematic and comprehensive, covering attention, non-attention, representation, and length dimensions)
  • Practicality: ★★★★☆ (Directly informative for the optimization and simplification of protein structure prediction models)
  • Clarity: ★★★★★ (Well-structured paper with intuitive figures and tables)