OpenAnimals: Revisiting Person Re-Identification for Animals Towards Better Generalization
- Conference: ICCV 2025
- arXiv: 2410.00204
- Code: Submitted with the paper (OpenAnimals codebase)
- Area: Human Understanding
- Keywords: Animal Re-Identification, Person ReID Transfer, Open-Source Framework, Baseline Model, Cross-Species Generalization
TL;DR
This paper develops the OpenAnimals open-source framework, systematically revisiting the transferability of person re-identification methods to animal re-identification. It proposes ARBase, an animal-oriented strong baseline that substantially outperforms existing person ReID methods across multiple benchmarks.
Background & Motivation
Animal Re-Identification (Animal Re-ID) aims to identify individual animals within a given species, which is critical for wildlife conservation, population monitoring, and behavioral research. Although conceptually similar to Person Re-Identification (Person Re-ID), the two tasks differ fundamentally:
Species Diversity: Different species (hyenas, leopards, sea turtles, whale sharks) exhibit dramatically different visual appearances and behaviors.
Environmental Variability: Habitats range from savannas to oceans, introducing far greater variation than the relatively controlled urban settings of person ReID.
Pose Differences: Quadrupedal locomotion (hyenas/leopards) and aquatic movement (sea turtles/whale sharks) differ fundamentally from human bipedal walking.
Data Scarcity: Data collection and annotation in wild environments are difficult, resulting in far less available data than person datasets.
The central question is: Can the extensive techniques and methodologies accumulated in person re-identification be effectively transferred to animal re-identification? Existing research lacks a systematic analysis of this question.
Method
Overall Architecture
The work is organized into three parts:
- OpenAnimals Framework: A unified animal ReID platform built upon FastReID and WildLifeDatasets.
- Systematic Revisiting Experiments: Ablating key designs from person ReID methods (BoT, AGW, SBS, MGN) one by one on animal benchmarks.
- ARBase Model: A strong animal-oriented baseline constructed from insights gained in the revisiting experiments.
OpenAnimals Framework Design
Two core principles are followed:
- Person ReID Compatibility: Inherits the core layers of FastReID, enabling seamless integration of state-of-the-art person ReID methods.
- Multi-Species Support: Integrates the dataset organization strategy of WildLifeDatasets, supporting 30+ species within a unified framework.
The modular design encompasses five stages: Data, Backbone, Head, Loss, and Training & Testing.
ARBase Model Design
Drawing from key findings in the revisiting experiments, ARBase makes targeted, animal-oriented design choices across five modules:
Data Module:
- Key modification, input resolution: Person ReID uniformly uses portrait-aspect resolutions (e.g., \([256,128]\)) since humans are typically upright. Animals exhibit diverse poses, so ARBase adopts a square resolution of \([384,384]\), a simple change with a significant effect.
- Only random horizontal flipping (\(p=0.5\)) is used; Random Erasing and AutoAug are removed.
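The resolution argument can be made concrete with a little arithmetic. The sketch below measures how strongly a resize distorts the aspect ratio of a crop. The crop sizes are hypothetical values chosen only for illustration, and `stretch_factor` is a helper name introduced here, not something from the paper or the OpenAnimals codebase.

```python
def stretch_factor(src_h, src_w, dst_h, dst_w):
    """Return how much a plain resize changes the aspect ratio.

    1.0 means the aspect ratio is preserved; larger values mean the
    image is stretched more along one axis than the other.
    """
    scale_h = dst_h / src_h
    scale_w = dst_w / src_w
    ratio = scale_h / scale_w
    return max(ratio, 1.0 / ratio)


# Hypothetical crops as (height, width); not taken from any dataset.
crops = {
    "upright person": (400, 200),
    "side-view animal": (300, 600),
}

for name, (h, w) in crops.items():
    portrait = stretch_factor(h, w, 256, 128)  # person ReID default
    square = stretch_factor(h, w, 384, 384)    # ARBase choice
    print(f"{name}: portrait stretch={portrait:.2f}, square stretch={square:.2f}")
```

Under these assumed crop shapes, the portrait target suits the upright crop but distorts the wide crop far more than the square target does, which is consistent with the motivation given above. The ablation tables further down, not this toy calculation, are the actual evidence.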
Backbone Module:
- ResNet-50 (ImageNet pre-trained) with the last stride set to 1 for fine-grained features.
- Instance-Batch Normalization (IBN) replaces standard BN: IN learns appearance-invariant features (adapting to diverse environments), while BN retains content information.
- Multi-branch architecture: a global branch, a 2-part branch, and a 3-part branch (inspired by insights from MGN).
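To see what the last-stride change and the part branches mean in terms of tensor shapes, here is a small framework-free sketch. It assumes the standard ResNet-50 layout (overall stride 32, of which the last stage contributes a factor of 2) and MGN-style equal horizontal stripes. How ARBase wires its branches internally is not specified in this summary, so treat the function names and the splitting rule as illustrative.

```python
def feature_map_size(height, width, last_stride=2):
    """Spatial size of the final ResNet-50 feature map.

    The stem and earlier stages downsample by 16 in total; the last
    stage adds a further factor of `last_stride` (2 in the stock
    network, 1 when the last stride is removed).
    """
    total_stride = 16 * last_stride
    return height // total_stride, width // total_stride


def horizontal_parts(num_rows, num_parts):
    """Split feature-map row indices into equal horizontal stripes."""
    rows_per_part = num_rows // num_parts
    return [
        range(i * rows_per_part, (i + 1) * rows_per_part)
        for i in range(num_parts)
    ]


rows, cols = feature_map_size(384, 384, last_stride=1)
print("feature map with last stride 1:", (rows, cols))
print("feature map with last stride 2:", feature_map_size(384, 384))

# Global branch pools over all rows; part branches pool per stripe.
for num_parts in (2, 3):
    stripes = horizontal_parts(rows, num_parts)
    print(num_parts, "parts:", [(s.start, s.stop) for s in stripes])
```

Setting the last stride to 1 doubles the spatial resolution of the final feature map in each dimension, which gives each horizontal stripe more rows to pool over.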
Head Module: Global Average Pooling + Linear + BNNeck (decoupling the feature spaces for triplet and cross-entropy losses).
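A minimal sketch of the BNNeck idea follows, written with plain Python lists so that it runs without a deep learning framework. It simplifies the head described above: the Linear layer and the learnable scale and shift of batch normalization are omitted, and the batch statistics are computed in training mode. The function names are illustrative only.

```python
import math


def global_average_pool(feature_map):
    """Average a C x H x W nested list over H and W, giving C values."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]


def batch_norm(features, eps=1e-5):
    """Normalize each feature dimension over the batch (no affine terms)."""
    n, dim = len(features), len(features[0])
    out = [[0.0] * dim for _ in range(n)]
    for d in range(dim):
        column = [f[d] for f in features]
        mean = sum(column) / n
        var = sum((v - mean) ** 2 for v in column) / n
        for i in range(n):
            out[i][d] = (features[i][d] - mean) / math.sqrt(var + eps)
    return out


def bnneck_head(feature_maps):
    """Return (pre-BN features, post-BN features) for a batch."""
    pre_bn = [global_average_pool(fm) for fm in feature_maps]
    post_bn = batch_norm(pre_bn)
    return pre_bn, post_bn


# Toy batch: two samples, two channels, 2 x 2 spatial size.
batch = [
    [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]],
    [[[3.0, 3.0], [3.0, 3.0]], [[6.0, 6.0], [6.0, 6.0]]],
]
pre_bn, post_bn = bnneck_head(batch)
print("triplet loss would use:", pre_bn)
print("classifier would use:", post_bn)
```

The point of the split is that the metric loss sees the unnormalized embedding while the classification loss sees the normalized one, so the two objectives do not constrain the same feature space.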
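Loss Module:
- Triplet loss computed on features before BNNeck: \(L_{tp} = \frac{1}{N_b}\sum_{i=1}^{N_b}\max(0,\, m + d_{pos}^i - d_{neg}^i)\), where \(N_b\) is the batch size, \(m\) is the margin, and \(d_{pos}^i\) and \(d_{neg}^i\) are the distances from anchor \(i\) to its positive and negative samples.
- Cross-entropy loss with label smoothing computed on features after BNNeck.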
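The two loss terms can be sketched in plain Python as follows. Several details are assumptions rather than facts from the paper: batch-hard mining (farthest positive, closest negative) is used to pick \(d_{pos}^i\) and \(d_{neg}^i\), the distance is Euclidean, and the margin of 0.3 and smoothing factor of 0.1 are commonly used defaults, not values confirmed by the source.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def batch_hard_triplet_loss(features, labels, margin=0.3):
    """Mean over anchors of max(0, m + d_pos - d_neg).

    Assumption: d_pos is the farthest same-identity sample and d_neg
    the closest other-identity sample in the batch (batch-hard mining).
    """
    losses = []
    for i, (f_i, l_i) in enumerate(zip(features, labels)):
        pos = [euclidean(f_i, f_j)
               for j, (f_j, l_j) in enumerate(zip(features, labels))
               if l_j == l_i and j != i]
        neg = [euclidean(f_i, f_j)
               for f_j, l_j in zip(features, labels) if l_j != l_i]
        if pos and neg:  # skip anchors without a valid triplet
            losses.append(max(0.0, margin + max(pos) - min(neg)))
    return sum(losses) / len(losses) if losses else 0.0


def label_smoothing_cross_entropy(logits, target, epsilon=0.1):
    """Cross-entropy against a smoothed one-hot target distribution."""
    num_classes = len(logits)
    max_logit = max(logits)  # subtract the max for numerical stability
    log_sum = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    loss = 0.0
    for k, z in enumerate(logits):
        # True class gets 1 - eps + eps/K, every other class gets eps/K.
        q = epsilon / num_classes + ((1.0 - epsilon) if k == target else 0.0)
        loss -= q * (z - log_sum)
    return loss


features = [[0.0, 0.0], [3.0, 0.0], [1.0, 0.0], [4.0, 0.0]]
labels = [0, 0, 1, 1]
print("triplet loss:", batch_hard_triplet_loss(features, labels))
print("smoothed CE:", label_smoothing_cross_entropy([2.0, 0.5, 0.1], target=0))
```

In a BNNeck setup the first function would receive the pre-BN features and the second the classifier logits computed from the post-BN features.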
Training & Testing: Adam optimizer + Cosine Annealing learning rate schedule.
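For reference, the cosine annealing schedule follows \(\eta_t = \eta_{min} + \tfrac{1}{2}(\eta_{max} - \eta_{min})(1 + \cos(\pi t / T))\). The sketch below implements that formula directly. The base learning rate and epoch count in the usage lines are placeholder values, not the settings used in the paper, and any warmup phase is left out.

```python
import math


def cosine_annealing_lr(step, total_steps, base_lr, min_lr=0.0):
    """Learning rate at `step` under a single cosine annealing cycle."""
    progress = step / total_steps
    return min_lr + 0.5 * (base_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


# Placeholder hyperparameters, chosen only to illustrate the curve.
total_epochs = 60
for epoch in (0, 15, 30, 45, 60):
    lr = cosine_annealing_lr(epoch, total_epochs, base_lr=3.5e-4)
    print(f"epoch {epoch:2d}: lr = {lr:.6f}")
```

The rate starts at the base value, decays slowly at first, fastest around the midpoint, and flattens out again as it approaches the minimum.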
Experiments
Main Results: ARBase vs. Person ReID Methods
| Method | HyenaID R1/mAP | LeopardID R1/mAP | SeaTurtleID R1/mAP | WhaleSharkID R1/mAP |
|---|---|---|---|---|
| BoT | 58.64/34.96 | 54.92/27.65 | 84.01/41.92 | 52.54/20.86 |
| AGW | 56.36/32.72 | 54.10/28.67 | 85.17/46.18 | 50.76/21.11 |
| SBS | 51.82/30.56 | 51.23/26.54 | 84.01/44.63 | 47.46/18.84 |
| MGN | 55.91/31.08 | 53.69/28.21 | 86.05/46.67 | 50.25/21.47 |
| ARBase | 73.18/44.87 | 64.34/37.08 | 86.92/55.99 | 62.44/29.45 |
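Relative to the strongest person ReID method on each dataset (BoT in all three cases), ARBase improves Rank-1 by 14.54 percentage points on HyenaID, 9.90 on WhaleSharkID, and 9.42 on LeopardID. On SeaTurtleID the Rank-1 gain over MGN is small (86.92 vs. 86.05), but mAP rises from 46.67 to 55.99.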
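The R1/mAP columns refer to the standard retrieval metrics, Rank-1 accuracy and mean average precision. A minimal sketch of how they are typically computed from ranked gallery lists is given below. It assumes the common convention of skipping queries that have no true match in the gallery; the exact evaluation protocol of OpenAnimals may differ in such details.

```python
def rank1_and_map(ranked_matches):
    """Compute Rank-1 accuracy and mAP.

    ranked_matches: one list per query, holding booleans for the gallery
    items sorted by increasing distance (True = same identity).
    """
    rank1_hits = 0
    average_precisions = []
    for matches in ranked_matches:
        if not any(matches):
            continue  # assumption: queries without a gallery match are skipped
        rank1_hits += int(matches[0])
        hits = 0
        precision_sum = 0.0
        for rank, is_match in enumerate(matches, start=1):
            if is_match:
                hits += 1
                precision_sum += hits / rank  # precision at this recall point
        average_precisions.append(precision_sum / hits)
    if not average_precisions:
        raise ValueError("no query has a matching gallery item")
    num_valid = len(average_precisions)
    return rank1_hits / num_valid, sum(average_precisions) / num_valid


# Two toy queries, each with three ranked gallery items.
toy = [
    [True, False, True],
    [False, True, False],
]
rank1, mean_ap = rank1_and_map(toy)
print(f"Rank-1 = {rank1:.2%}, mAP = {mean_ap:.2%}")
```

Rank-1 only checks the top result, while mAP rewards placing every correct match near the top, which is why the two columns can move quite differently (as on SeaTurtleID).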
Key Findings from Revisiting Experiments
| Technique | Effective for Persons | Generalization to Animals |
|---|---|---|
| Random Erasing | ✓ | ✗ (negative effect on 3/4 datasets; disrupts subtle individual details) |
| Label Smoothing | ✓ | ✓ (beneficial on ≥3 datasets) |
| Last Stride=1 | ✓ | ✓ (consistently beneficial) |
| BNNeck | ✓ | ✓ (consistently beneficial) |
| Non-local Attention | ✓ | ✗ (inconsistent effects) |
| Generalized-mean (GeM) Pooling | ✓ | ✗ (inconsistent effects) |
| Weighted Triplet | ✓ | ✗ (inconsistent effects) |
| Freeze Training | ✓ | ✗ (removal improves performance) |
| AutoAug | ✓ | ✗ (removal improves performance) |
| Cosine Annealing | ✓ | ✓ (consistently beneficial) |
| Multi-Branch (MGN) | ✓ | ✓ (multi-granularity features also effective for animals) |
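Of the techniques in the table, generalized-mean pooling is the easiest to state in one line: \(f = \left(\frac{1}{|X|}\sum_{x \in X} x^{p}\right)^{1/p}\). The sketch below applies it to a flat list of activations from one channel. The default \(p = 3\) and the clamping constant are common choices assumed here for illustration, not values reported in this summary.

```python
def gem_pool(values, p=3.0, eps=1e-6):
    """Generalized-mean pooling over the activations of one channel.

    p = 1 reduces to average pooling; large p approaches max pooling.
    Values are clamped to at least eps so fractional powers are defined.
    """
    clamped = [max(v, eps) for v in values]
    return (sum(v ** p for v in clamped) / len(clamped)) ** (1.0 / p)


activations = [1.0, 2.0, 3.0]
for p in (1.0, 3.0, 50.0):
    print(f"p = {p:>4}: pooled = {gem_pool(activations, p):.4f}")
```

Because the exponent interpolates between average and max pooling, its benefit depends on how localized the discriminative cues are, which may be one reason its effect varies across species.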
Ablation Study (Data & Backbone)
| Configuration | HyenaID R1/mAP | WhaleSharkID R1/mAP |
|---|---|---|
| BoT [256,128] | 58.64/34.96 | 52.54/20.86 |
| BoT [384,384] | 60.45/36.43 | 58.12/24.39 |
| ARBase w/o IBN | 69.09/43.58 | 61.93/29.28 |
| ARBase w/o MB | 71.36/42.87 | 61.42/27.78 |
| ARBase (Full) | 73.18/44.87 | 62.44/29.45 |
Simply adjusting the resolution to square raises BoT's Rank-1 on WhaleSharkID from 52.54% to 58.12% (+5.58 percentage points); the gain on HyenaID is smaller (58.64% to 60.45%).
Ablation Study (Head, Loss, Training)
| Configuration | HyenaID R1/mAP | WhaleSharkID R1/mAP |
|---|---|---|
| w/o BNNeck | 64.55/39.23 | 44.42/22.29 |
| w/o Label Smoothing | 68.18/44.72 | 61.42/27.88 |
| w/o Cosine Annealing | 71.82/43.40 | 62.44/28.69 |
| ARBase (Full) | 73.18/44.87 | 62.44/29.45 |
Removing BNNeck has a dramatic impact on WhaleSharkID (Rank-1 drops from 62.44% to 44.42%), indicating that decoupling the feature spaces of the two losses matters for animal ReID as much as it does for person ReID.
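To compare the ablations side by side, the drop of each variant relative to the full model can be computed directly from the two ablation tables. The numbers below are copied from those tables; the dictionary layout and the `drops` helper are just one convenient way to organize them.

```python
# (Rank-1, mAP) of the full model, copied from the ablation tables.
FULL = {"HyenaID": (73.18, 44.87), "WhaleSharkID": (62.44, 29.45)}

# (Rank-1, mAP) of each ablated variant, copied from the same tables.
ABLATIONS = {
    "w/o IBN": {"HyenaID": (69.09, 43.58), "WhaleSharkID": (61.93, 29.28)},
    "w/o MB": {"HyenaID": (71.36, 42.87), "WhaleSharkID": (61.42, 27.78)},
    "w/o BNNeck": {"HyenaID": (64.55, 39.23), "WhaleSharkID": (44.42, 22.29)},
    "w/o Label Smoothing": {"HyenaID": (68.18, 44.72), "WhaleSharkID": (61.42, 27.88)},
    "w/o Cosine Annealing": {"HyenaID": (71.82, 43.40), "WhaleSharkID": (62.44, 28.69)},
}


def drops(dataset):
    """Return {variant: (Rank-1 drop, mAP drop)} in percentage points."""
    full_r1, full_map = FULL[dataset]
    return {
        name: (round(full_r1 - scores[dataset][0], 2),
               round(full_map - scores[dataset][1], 2))
        for name, scores in ABLATIONS.items()
    }


for dataset in FULL:
    print(dataset)
    ranked = sorted(drops(dataset).items(), key=lambda kv: kv[1][0], reverse=True)
    for name, (d_r1, d_map) in ranked:
        print(f"  {name:<22} Rank-1 -{d_r1:.2f}  mAP -{d_map:.2f}")
```

Sorting by Rank-1 drop puts BNNeck first on both datasets, while cosine annealing changes only mAP on WhaleSharkID.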
Highlights & Insights
- The systematic revisiting experiments reveal that several techniques widely accepted as effective in person ReID, such as Random Erasing and AutoAug, do not transfer to animal ReID. A plausible reason is that inter-individual differences in animals are more subtle, so random erasing may destroy discriminative details.
- The input resolution insight carries broad implications: the portrait-aspect resolutions long used in person ReID are poorly suited to the diverse poses of animals, and switching to square resolution alone yields clear gains (e.g., +5.58 Rank-1 points for BoT on WhaleSharkID).
- IBN targets the high environmental variability inherent in animal ReID: removing it lowers HyenaID Rank-1 from 73.18% to 69.09%, although the effect on WhaleSharkID is small (62.44% to 61.93%).
- ARBase's design philosophy, simple yet targeted, achieves significant improvements without introducing complex modules.
Limitations & Future Work
- Only four animal species are evaluated; generalization to a broader range of species (e.g., birds, insects) remains unexplored.
- A fixed ResNet-50 backbone is used; stronger recent pre-trained models (e.g., DINOv2, CLIP) are not investigated.
- The horizontal partitioning assumption of the multi-branch architecture may be inappropriate for certain animals (e.g., snakes, fish).
- Spatiotemporal information from video sequences is not considered.
- Dataset scales are relatively small, and in-depth analysis of overfitting and generalization is insufficient.
Related Work & Insights
- Person ReID: BoT (bag of tricks), AGW (non-local attention + GeM pooling), SBS (AutoAug + cosine annealing), MGN (multi-granularity network).
- Animal ReID: HotSpotter (hand-crafted features), MegaDescriptor (multi-species pre-training), CLIP/DINOv2-based methods.
- Open-Source Frameworks: FastReID (person), WildLifeDatasets (animal dataset management).
Rating
| Dimension | Score |
|---|---|
| Novelty | ⭐⭐⭐ |
| Effectiveness | ⭐⭐⭐⭐⭐ |
| Clarity | ⭐⭐⭐⭐⭐ |
| Practical Value | ⭐⭐⭐⭐⭐ |
| Overall | 8.0/10 |