Novel Architecture of RPA In Oral Cancer Lesion Detection¶

Conference: CVPR2025
arXiv: 2603.10928
Code: To be confirmed
Area: Medical Imaging
Keywords: Oral cancer detection, RPA, EfficientNetV2B1, Design patterns, Batch processing

TL;DR¶

This paper integrates the Singleton and Batch Processing design patterns into a Python-based RPA pipeline built around an EfficientNetV2B1 classifier for oral cancer lesion detection. It claims a 60-100× inference speedup over traditional RPA platforms such as UiPath and Automation Anywhere, although the 31-image benchmark reported in the results corresponds to roughly 40×.

Background & Motivation¶

Early and accurate detection of oral cancer is critical for improving patient survival rates, but clinical workflows are still plagued by subjective human judgment, delays, and inconsistent decision-making.
Robotic Process Automation (RPA) has been utilized in healthcare to automate tasks such as image processing, laboratory data management, and patient data analysis.
Although existing RPA platforms (e.g., UiPath, Automation Anywhere) are user-friendly, they are highly inefficient for computationally intensive tasks: approximately 78% of processing time is spent on overhead (model reloading, activity transitions, data serialization), while only 22% is allocated to actual inference.
There is a need for a hybrid solution that combines the workflow orchestration benefits of RPA with the high-performance computing capabilities of Python.

Method¶

Dataset and Preprocessing¶

The dataset contains approximately 3000 clinical oral images, labeled with 4 major categories (Healthy, Benign, OPMD, Oral Cancer) and 16 subcategories.
Data split: 70% training, 15% validation, and 15% testing (stratified sampling).
Preprocessing: Pixel normalization to \([0, 1]\) and ImageNet mean/std standardization.
Data augmentation: Using the Albumentations library, 5 transformations are applied per training sample (flipping, rotation, brightness/contrast adjustment, random cropping), and random duplication is performed for classes with fewer than 200 samples.
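The split and the minority-class duplication described above can be sketched in plain Python. This is a minimal illustration, not the authors' code: the function names, the fixed seed, applying the duplication to the training split only, and topping each small class up to exactly 200 samples are all assumptions (the summary only says that classes with fewer than 200 samples are randomly duplicated).

```python
import random
from collections import defaultdict


def stratified_split(labels, fractions=(0.70, 0.15, 0.15), seed=0):
    """Return (train, val, test) index lists, splitting each class separately."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(fractions[0] * len(indices))
        n_val = round(fractions[1] * len(indices))
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test split
    return train, val, test


def oversample_minority(train_indices, labels, min_count=200, seed=0):
    """Randomly duplicate training samples of classes below min_count.

    Topping up to exactly min_count is an assumption of this sketch.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx in train_indices:
        by_class[labels[idx]].append(idx)

    balanced = list(train_indices)
    for indices in by_class.values():
        deficit = min_count - len(indices)
        if deficit > 0:
            balanced += rng.choices(indices, k=deficit)  # sample with replacement
    return balanced
```

Splitting per class keeps the 70/15/15 proportions inside every category, which matters when some subcategories are rare. Duplicating only after the split avoids leaking copies of a training image into validation or test data.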

Model Architecture¶

ImageNet pre-trained EfficientNetV2B1 is adopted as the feature extractor.
The input size is \(224 \times 224 \times 3\), with a fully connected Dense + Softmax layer appended at the top.
Two-stage training:
- Feature extraction phase: base layers frozen, 15 epochs, learning rate \(10^{-3}\).
- Fine-tuning phase: deep layers partially unfrozen, 10 epochs, learning rate \(10^{-5}\).
Adam optimizer + categorical cross-entropy, with a batch size of 32.
Early stopping + ReduceLROnPlateau + checkpointing based on the best validation accuracy.
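The two-stage schedule above might look roughly like the following Keras sketch. The epochs, learning rates, optimizer, loss and monitored metric come from the summary. Everything else is an assumption made for illustration: the number of unfrozen layers (`unfreeze_last_n=30`), the callback patience and factor values, the checkpoint path, the `num_classes` argument, and `include_preprocessing=False` (chosen here so that inputs already normalized as described are not rescaled again). TensorFlow is imported inside the function so the configuration can be inspected without it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str
    epochs: int
    learning_rate: float
    unfreeze_last_n: int  # 0 keeps the whole backbone frozen


# Epochs and learning rates follow the summary; unfreeze_last_n=30 is assumed.
STAGES = (
    Stage("feature_extraction", epochs=15, learning_rate=1e-3, unfreeze_last_n=0),
    Stage("fine_tuning", epochs=10, learning_rate=1e-5, unfreeze_last_n=30),
)


def build_and_train(train_ds, val_ds, num_classes, checkpoint_path="best.keras"):
    """train_ds / val_ds: batched datasets (batch size 32) with one-hot labels."""
    import tensorflow as tf  # lazy import; requires TensorFlow to be installed

    backbone = tf.keras.applications.EfficientNetV2B1(
        include_top=False,
        weights="imagenet",
        input_shape=(224, 224, 3),
        pooling="avg",
        include_preprocessing=False,  # assumption: inputs are normalized upstream
    )
    outputs = tf.keras.layers.Dense(num_classes, activation="softmax")(backbone.output)
    model = tf.keras.Model(backbone.input, outputs)

    callbacks = [  # patience and factor values are illustrative
        tf.keras.callbacks.EarlyStopping(
            monitor="val_accuracy", patience=5, restore_best_weights=True),
        tf.keras.callbacks.ReduceLROnPlateau(monitor="val_loss", factor=0.5, patience=2),
        tf.keras.callbacks.ModelCheckpoint(
            checkpoint_path, monitor="val_accuracy", save_best_only=True),
    ]

    for stage in STAGES:
        backbone.trainable = stage.unfreeze_last_n > 0
        if stage.unfreeze_last_n:
            # Keep everything except the last N backbone layers frozen.
            for layer in backbone.layers[:-stage.unfreeze_last_n]:
                layer.trainable = False
        # Recompile so the new trainable flags and learning rate take effect.
        model.compile(
            optimizer=tf.keras.optimizers.Adam(stage.learning_rate),
            loss="categorical_crossentropy",
            metrics=["accuracy"],
        )
        model.fit(train_ds, validation_data=val_ds,
                  epochs=stage.epochs, callbacks=callbacks)
    return model
```

The model is recompiled at the start of each stage because Keras only picks up changes to `trainable` flags at compile time.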

RPA Implementation Comparison¶

OC-RPAv1: Python-based sequential RPA-style processing, loading the model to predict one image at a time.
OC-RPAv2: Integrates Singleton and Batch Processing design patterns.
- Singleton: The model is loaded only once and retained in memory, avoiding repeated reloading overhead.
- Batch Processing: Batches images to leverage GPU parallel inference.
- UiPath manages the automation pipeline and invokes Python functions to execute inference.
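A minimal sketch of how the two patterns in OC-RPAv2 fit together is shown below. It is not the paper's implementation: `ModelHolder`, `classify_all`, the `loader` callable and the default batch size are hypothetical names and values, and the model is represented by any callable that maps a list of inputs to a list of predictions.

```python
import threading


class ModelHolder:
    """Singleton-style holder: the model is loaded once and kept in memory."""

    _model = None
    _lock = threading.Lock()

    @classmethod
    def get(cls, loader):
        # Double-checked locking so concurrent callers trigger only one load.
        if cls._model is None:
            with cls._lock:
                if cls._model is None:
                    cls._model = loader()  # expensive step, runs at most once
        return cls._model


def batched(items, batch_size):
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def classify_all(images, loader, batch_size=32):
    """Classify all images, one model call per batch instead of per image."""
    model = ModelHolder.get(loader)
    predictions = []
    for batch in batched(images, batch_size):
        predictions.extend(model(batch))
    return predictions
```

An orchestrator such as UiPath would call `classify_all` repeatedly within one long-lived Python process. The first call pays the loading cost and later calls reuse the cached model, which is the overhead OC-RPAv1 incurs on every image.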

Workflow Synchronization and Security¶

Image batches are processed sequentially: the next batch starts only after every file in the current batch has been classified and logged, which avoids data collisions.
Try-Catch exception handling ensures workflow continuity.
Processing is conducted on local secure workstations with anonymized file paths and restricted access controls.
Processed files are moved to a separate directory to ensure data integrity.
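The file-handling rules above can be sketched as a small loop. This is an illustration under assumptions rather than the described system: `process_folder`, the `classify` callable, the logger name and the choice to leave failed files in the input folder are all hypothetical.

```python
import logging
import shutil
from pathlib import Path

logger = logging.getLogger("oc_rpa")  # hypothetical logger name


def process_folder(inbox, processed, classify, batch_size=8):
    """Classify files batch by batch, log each result, then move the file."""
    inbox, processed = Path(inbox), Path(processed)
    processed.mkdir(parents=True, exist_ok=True)
    files = sorted(p for p in inbox.iterdir() if p.is_file())

    results, failures = {}, []
    for start in range(0, len(files), batch_size):
        for path in files[start:start + batch_size]:
            try:
                label = classify(path)
                logger.info("%s -> %s", path.name, label)
                # Moving the file marks it as done and prevents reprocessing.
                shutil.move(str(path), str(processed / path.name))
                results[path.name] = label
            except Exception:
                # A bad file is recorded and skipped; the workflow continues.
                logger.exception("failed on %s", path.name)
                failures.append(path.name)
        # The next batch starts only after every file above has been handled.
    return results, failures
```

Files that raise an error stay in the input folder in this sketch, so they can be inspected or retried without being mixed into the processed set.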

The work builds on the CLASEG framework by Al-Ali et al., which integrates multi-class classification and segmentation for differential diagnosis of oral lesions.
It follows the LMV-RPA approach of Abdellaif et al., which supplements standard RPA with enhanced Python automation.
Kim et al. previously demonstrated the speedup of a hybrid RPA+Python architecture for computer-aided cancer detection on pathological images.
This paper goes further by introducing design patterns (Singleton + Batch Processing) into the hybrid architecture and quantifying the resulting speedup.

Key Experimental Results¶

Inference Speed Comparison (31 Test Images)¶

| Platform | Total Time (s) | Average Time per Image (s) |
| --- | --- | --- |
| UiPath | 80 | 2.58 |
| Automation Anywhere | 75 | 2.42 |
| OC-RPAv1 (Python) | 8.65 | 0.28 |
| OC-RPAv2 (Python+DP) | 1.96 | 0.06 |

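OC-RPAv2 is reported as ~43× faster than UiPath and ~40× faster than Automation Anywhere; these ratios follow from the rounded per-image averages (2.58 / 0.06 and 2.42 / 0.06). Computed from the total times, the ratios are ~41× (80 / 1.96) and ~38× (75 / 1.96).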
OC-RPAv2 is ~4.4× faster than OC-RPAv1.
The introduction of design patterns compresses the execution time of the Python pipeline from 8.65 s to 1.96 s.
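The ratios can be recomputed directly from the total times in the table. The short script below is only a check of the arithmetic; the dictionary keys and the `speedup` helper are names chosen for this sketch.

```python
N_IMAGES = 31

# Total wall-clock seconds for the 31-image benchmark, as listed in the table.
total_seconds = {
    "UiPath": 80.0,
    "Automation Anywhere": 75.0,
    "OC-RPAv1": 8.65,
    "OC-RPAv2": 1.96,
}


def speedup(baseline, candidate="OC-RPAv2"):
    """How many times faster `candidate` is than `baseline` (ratio of totals)."""
    return total_seconds[baseline] / total_seconds[candidate]


for name in ("UiPath", "Automation Anywhere", "OC-RPAv1"):
    per_image = total_seconds[name] / N_IMAGES
    print(f"{name}: {per_image:.2f} s/image, OC-RPAv2 is {speedup(name):.1f}x faster")
```

Working from the totals avoids the rounding error introduced by the two-decimal per-image averages, which is why the results come out slightly below 43× and 40×.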

Scalability Estimation¶

Extrapolating the per-image averages to 2500 images: UiPath would require about 1.8 hours, while OC-RPAv2 would take less than 3 minutes.
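These figures are consistent with a linear extrapolation of the 31-image totals, assuming per-image cost stays constant at larger scale (an assumption the benchmark itself does not test):

\[
2500 \times \frac{80\ \text{s}}{31} \approx 6452\ \text{s} \approx 1.8\ \text{h},
\qquad
2500 \times \frac{1.96\ \text{s}}{31} \approx 158\ \text{s} \approx 2.6\ \text{min}.
\]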

Highlights & Insights¶

High Engineering Practicality: The combination of Singleton and Batch Processing design patterns is simple yet effective, lowering the barrier to deployment.
Substantial Acceleration: The paper claims a 60-100× acceleration over standard RPA platforms (about 40× on the reported 31-image benchmark), which would be of clear value for clinical deployment.
Cost Reduction: Reducing hardware idle time and RPA licensing costs, with the paper claiming a 40× cost reduction.
Hybrid Architecture Approach: Leverages the strengths of both worlds, with RPA handling workflow orchestration and Python taking charge of computationally intensive inference.
16-Class Oral Lesion Classification: Covers 4 major categories (Healthy, Benign, OPMD, Oral Cancer) and 16 subcategories, providing fine-grained classification.

Limitations & Future Work¶

Extremely Small Test Set: The speed benchmark uses only 31 test images, too few to support statistically robust conclusions about inference speed.
Lack of Classification Accuracy Metrics: The paper focuses disproportionately on speed comparison, failing to report critical classification metrics such as accuracy, precision, and recall on the test set.
Limited Technical Contribution: Singleton and Batch Processing are fundamental software engineering design patterns; the core "innovation" is closer to engineering optimization than to a novel academic contribution.
Unfair Comparison: The overhead of RPA platforms mainly stems from GUI automation and activity transitions; hence, comparing them directly to raw Python pipelines is inherently a comparison of different paradigms.
Poor Writing Quality: Contains redundant paragraphs, non-standard formatting, and incomplete references.
Absence of Clinical Validation: The system has not been deployed or validated in a real-world clinical setting, leaving scalability claims unsupported.
Significant Gap with CVPR Standard: The overall work resembles an engineering report rather than a top-tier conference paper, lacking academic depth.
No Comparison with State-of-the-Art Methods: Lacks comparative analysis of detection accuracy against mainstream CNN or ViT-based oral cancer detection methods.
Insufficient Dataset Details: Fails to report key dataset characteristics such as sample distribution among subcategories and precise sample counts post-augmentation.
No Ablation Study: Fails to isolate and validate the individual contributions of Singleton and Batch Processing.

Rating¶

Novelty: ⭐⭐ — Singleton and Batch Processing are foundational design patterns, lacking methodological innovation.
Experimental Thoroughness: ⭐ — Only 31 test images and no classification performance metrics, showing highly insufficient experimental design.
Writing Quality: ⭐⭐ — Redundant and repetitive text, chaotic formatting, with multiple duplicated paragraphs.
Value: ⭐⭐ — While the engineering workflow has practical reference value, the academic contribution is insufficient for top-tier publication.
Overall: ⭐⭐ — More suitable as an engineering technical report than as an academic paper reference.