χ: Symmetry Understanding of 3D Shapes via Chirality Disentanglement¶
Conference: ICCV 2025 · arXiv: 2508.05505 · Code: Project Page · Area: 3D Vision · Keywords: Chirality features, symmetry, shape matching, left-right disambiguation, 2D foundation model distillation
TL;DR¶
This paper proposes an unsupervised chirality feature extraction pipeline that distills left-right chirality information from 2D foundation model features to augment 3D shape vertex descriptors, effectively resolving left-right ambiguity in shape analysis.
Background & Motivation¶
Root Cause¶
Background: Symmetry and chirality are two sides of the same coin: symmetry captures the similarity between two parts, while chirality captures their difference. In shape analysis, many vertex descriptors (e.g., Diff3F) are semantically and geometrically robust but cannot distinguish left-right symmetric parts, leading to:
Left-right ambiguity in shape matching — the left eye may be matched to the right eye
Imprecise part segmentation — symmetric body parts cannot be differentiated
Degraded correspondence quality — especially on models with symmetric structures
Although visual chirality has been studied in 2D image domains, no method exists for extracting chirality-aware vertex descriptors in 3D shape analysis.
Method¶
Overall Architecture¶
- Render textured images \(\{I_j\}_{j=1}^N\) from \(N\) viewpoints for a 3D mesh
- Horizontally flip each image to obtain \(\{\bar{I}_j\}_{j=1}^N\)
- Extract features \(F_{img}\) and \(\bar{F}_{img}\) from the original and flipped images with a frozen SD+DINO encoder
- Project onto the mesh to obtain chirality feature pairs \((\mathcal{F}_v, \bar{\mathcal{F}}_v)\)
- Train a chirality network \(\tilde{g}_\Phi\) to extract chirality features \(\chi, \bar{\chi}\) from the feature pairs
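The five steps above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the frozen SD+DINO encoder and the mesh rasterization are stood in by random features and precomputed per-vertex pixel coordinates, and all helper names (`extract_features`, `project_to_vertices`) are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_features(image):
    # Stand-in for the frozen SD+DINO encoder: (H, W, 3) -> (H, W, D).
    h, w, _ = image.shape
    return rng.standard_normal((h, w, 8))

def project_to_vertices(feat_map, pixel_uv):
    # Sample the per-pixel feature map at each visible vertex's pixel.
    return feat_map[pixel_uv[:, 1], pixel_uv[:, 0]]

H, W, V = 16, 16, 10
image = rng.random((H, W, 3))            # one rendered view I_j
flipped = image[:, ::-1]                 # horizontal flip -> \bar{I}_j

F_img = extract_features(image)
F_bar = extract_features(flipped)

# Projected pixel coordinates of the V visible vertices in this view;
# after the horizontal flip the same vertex lands at the mirrored column.
uv = rng.integers(0, W, size=(V, 2))
uv_flipped = uv.copy()
uv_flipped[:, 0] = W - 1 - uv[:, 0]

F_v = project_to_vertices(F_img, uv)              # \mathcal{F}_v
F_v_bar = project_to_vertices(F_bar, uv_flipped)  # \bar{\mathcal{F}}_v
print(F_v.shape, F_v_bar.shape)  # (10, 8) (10, 8)
```

Each vertex thus receives a paired descriptor from the original and the mirrored rendering, which is what the chirality network \(\tilde{g}_\Phi\) is trained on.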
Key Designs¶
Chirality Feature Definition
From the network output, the first feature dimension is selected and normalized so that \(\chi_v \in [-1, 1]\), yielding a single scalar chirality value per vertex.
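A minimal sketch of that reduction, assuming normalization by the maximum magnitude (the exact normalization is not spelled out in this note, but division by \(\|\chi\|_\infty\) matches the scaling used in the fifty-fifty loss below):

```python
import numpy as np

def chirality_scalar(features):
    # Take the first output dimension and normalize by the max magnitude,
    # so chi_v lies in [-1, 1]. The normalization choice is an assumption.
    chi = features[:, 0]
    return chi / (np.max(np.abs(chi)) + 1e-12)

feats = np.array([[2.0, 0.3], [-1.0, 0.5], [0.5, 0.1]])
chi = chirality_scalar(feats)  # ≈ [1.0, -0.5, 0.25]
```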
Loss & Training¶
Dissimilarity Loss — maximizes the difference between original and flipped chirality features: \(\mathcal{L}_{dis} = -\frac{1}{\sqrt{|V|}}\|\chi - \bar{\chi}\|_2\)
Invertibility Loss — prevents the encoder from learning degenerate solutions: \(\mathcal{L}_{inv} = \frac{1}{\sqrt{|V|}}\|[\mathcal{F}^\top\;\bar{\mathcal{F}}^\top]^\top - h(g([\mathcal{F}^\top\;\bar{\mathcal{F}}^\top]^\top))\|_F\)
Total Variation Loss — enforces spatial smoothness: \(\mathcal{L}_{var} = \frac{1}{|E|}\sum_{(u,v) \in E} \|\chi_u - \chi_v\|_1 + \|\bar{\chi}_u - \bar{\chi}_v\|_1\)
Fifty-Fifty Loss — balances the number of vertices in the left and right halves: \(\mathcal{L}_{fif} = \frac{1}{|V|}(\frac{|\chi^\top\mathbf{1}_{|V|}|}{\|\chi\|_\infty} + \frac{|\bar{\chi}^\top\mathbf{1}_{|V|}|}{\|\bar{\chi}\|_\infty})\)
Total loss: \(\mathcal{L} = \mathcal{L}_{dis} + \lambda_1\mathcal{L}_{inv} + \lambda_2\mathcal{L}_{var} + \lambda_3\mathcal{L}_{fif}\)
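The objective can be sketched numerically as below. This is an illustration only: the invertibility term is omitted since it needs the encoder/decoder pair \(g, h\), and the function signature is hypothetical.

```python
import numpy as np

def chirality_losses(chi, chi_bar, edges, lam_var=1.0, lam_fif=1.0):
    """Sketch of the training objective (invertibility loss omitted)."""
    V = chi.shape[0]
    # Dissimilarity: push original and flipped chirality apart.
    L_dis = -np.linalg.norm(chi - chi_bar) / np.sqrt(V)
    # Total variation: encourage smoothness along mesh edges (u, v).
    u, v = edges[:, 0], edges[:, 1]
    L_var = (np.abs(chi[u] - chi[v]).sum()
             + np.abs(chi_bar[u] - chi_bar[v]).sum()) / len(edges)
    # Fifty-fifty: balance positive and negative chirality halves.
    L_fif = (np.abs(chi.sum()) / np.max(np.abs(chi))
             + np.abs(chi_bar.sum()) / np.max(np.abs(chi_bar))) / V
    return L_dis + lam_var * L_var + lam_fif * L_fif

chi = np.array([1.0, -1.0, 0.5, -0.5])
edges = np.array([[0, 2], [1, 3]])
loss = chirality_losses(chi, -chi, edges)  # ≈ -sqrt(10)/2 + 1
```

Note the signs: the dissimilarity term is negative (it is maximized), while the smoothness and balance terms are minimized, so the optimum is a smooth, sign-balanced field that flips under mirroring.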
Key Experimental Results¶
Left-Right Discrimination Accuracy¶
Main Results¶
| Method | BeCoS | FAUST | SCAPE | SMAL | TOSCA |
|---|---|---|---|---|---|
| Diff3F | 50.87 | 51.21 | 52.53 | 50.91 | 51.48 |
| DINO+SD | 51.16 | 51.05 | 52.55 | 50.80 | 51.42 |
| Liu et al. | 79.98 | 90.45 | 80.84 | 75.71 | 72.88 |
| χ (Ours) | 91.84 | 94.76 | 95.51 | 96.59 | 94.09 |
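One plausible reading of the discrimination metric, sketched here under an assumed protocol (the exact evaluation is not detailed in this note): given ground-truth symmetric vertex pairs, count the fraction assigned opposite chirality signs.

```python
import numpy as np

def lr_accuracy(chi, sym_pairs):
    # Hypothetical metric: fraction of ground-truth symmetric vertex
    # pairs (l, r) whose chirality values have opposite signs.
    l, r = sym_pairs[:, 0], sym_pairs[:, 1]
    return float(np.mean(np.sign(chi[l]) != np.sign(chi[r])))

chi = np.array([0.9, -0.8, 0.7, -0.6, 0.1])
pairs = np.array([[0, 1], [2, 3]])
acc = lr_accuracy(chi, pairs)  # both pairs disambiguated -> 1.0
```

Under this reading, the ~50% scores of raw Diff3F/DINO+SD correspond to chance-level sign agreement on mirrored pairs.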
Cross-Dataset Generalization & Ablation¶
| Training Set | BeCoS-h Test | BeCoS-a Test |
|---|---|---|
| BeCoS | 94.09 | 84.19 |
| BeCoS-h | 90.36 | 91.10 |
Key Findings¶
- Raw Diff3F/DINO+SD features are nearly incapable of distinguishing left from right (~50%, close to random chance)
- The proposed method achieves over 90% left-right discrimination accuracy across all datasets
- Strong cross-dataset and cross-category generalization, effective even on partial and anisotropic shapes
- Combining chirality features with Diff3F effectively alleviates left-right ambiguity in shape matching
Highlights & Insights¶
- Clever use of horizontal flipping — flipping images alters chirality information while preserving other semantic content, enabling the construction of chirality feature pairs
- Unsupervised approach — requires no left-right annotations; chirality is learned purely from geometric structure
- Plug-and-play enhancement — compatible with any existing vertex descriptor
- Knowledge distillation from 2D to 3D — effectively exploits chirality information implicitly encoded in 2D foundation models
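The plug-and-play claim amounts to appending the chirality channel to an existing per-vertex descriptor. A minimal sketch, where the blending `weight` is an assumed knob rather than something from the paper:

```python
import numpy as np

def augment_descriptors(desc, chi, weight=1.0):
    # Append the scalar chirality channel to any per-vertex descriptor
    # (e.g. Diff3F); `weight` trades chirality against semantic cues.
    return np.concatenate([desc, weight * chi[:, None]], axis=1)

desc = np.random.default_rng(1).standard_normal((5, 128))  # e.g. Diff3F
chi = np.array([1.0, -1.0, 0.5, -0.5, 0.0])
aug = augment_descriptors(desc, chi)
print(aug.shape)  # (5, 129)
```

Because the chirality values of mirrored vertices differ in sign, nearest-neighbor matching on the augmented descriptors can no longer confuse left with right even when the semantic channels are identical.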
Limitations & Future Work¶
- Relies on the rendering and texturing pipeline of Diff3F, incurring substantial computational overhead
- Chirality is ill-defined for perfectly symmetric objects (e.g., spheres)
- Careful hyperparameter tuning is required to balance the four loss terms
Related Work & Insights¶
- Visual Chirality: Lin et al.'s work on visual chirality; mirror detection
- Shape Descriptors: Diff3F, DINO-V2, Stable Diffusion features
- Shape Matching: functional maps, SE-ORNet, DPC
Rating¶
- Novelty: ⭐⭐⭐⭐⭐ (pioneering work on chirality extraction for 3D shapes)
- Technical Depth: ⭐⭐⭐⭐ (four carefully designed loss functions)
- Experimental Thoroughness: ⭐⭐⭐⭐ (validation across multiple datasets and tasks)
- Value: ⭐⭐⭐⭐ (directly improves shape matching quality)