Novel Architecture of RPA In Oral Cancer Lesion Detection

Conference: CVPR 2026
arXiv: 2603.10928
Code: None
Area: Medical Imaging / Oral Cancer Detection
Keywords: Oral cancer detection, RPA automation, EfficientNetV2, design patterns, CNN classification

TL;DR

This work integrates software design patterns (Singleton + Batch Processing) into an EfficientNetV2B1-based oral cancer lesion detection Python pipeline. The paper claims a 60–100× inference speedup over conventional RPA platforms (UiPath / Automation Anywhere) while maintaining diagnostic accuracy; the reported measurements, 0.06 s per image vs. 2.58 s, correspond to roughly 43×.

Background & Motivation

Background: Early detection of oral cancer is critical to patient survival. Robotic Process Automation (RPA) has been introduced into healthcare to automate repetitive workflows such as image processing, laboratory data management, and patient data analysis. Low-code RPA platforms such as UiPath and Automation Anywhere provide accessible workflow orchestration capabilities.

Limitations of Prior Work: (1) Conventional RPA platforms are highly inefficient for computationally intensive AI inference — approximately 78% of processing time is consumed by repeated model loading, activity switching, and data serialization, with only 22% devoted to actual inference. (2) Low-code environments inherently lack support for GPU batch processing and model caching, and serial image processing creates severe throughput bottlenecks. (3) Poor computational resource utilization renders such systems unacceptable in terms of cost and latency for high-throughput clinical scenarios.

Key Challenge: A fundamental tension exists between the workflow orchestration strengths of RPA platforms and their computational inefficiency — automated process management must be preserved while substantially improving inference throughput.

Goal: To optimize a Python inference pipeline through software engineering design patterns, achieving high-efficiency inference while retaining the workflow orchestration advantages of RPA.

Key Insight: Two standard design patterns, Singleton (single model load) and Batch Processing (batched inference), target the dominant overheads when introduced into clinical AI deployment pipelines.

Core Idea: Singleton eliminates repeated model-loading overhead, Batch Processing exploits GPU parallelism, and together they yield the claimed 60–100× speedup (about 43× on the reported per-image times of 2.58 s vs. 0.06 s).

Method

Overall Architecture

The system comprises two parallel pipelines: OC-RPAv1 (a basic Python pipeline performing per-image processing) and OC-RPAv2 (an optimized pipeline incorporating Singleton + Batch Processing). UiPath manages the automation workflow and invokes Python functions for inference. Both pipelines share the same CNN model.

Key Designs

  1. Singleton Design Pattern (Eliminating Repeated Model Loading):

    • Function: Ensures the CNN model is loaded only once and remains resident in memory throughout the entire lifecycle.
    • Mechanism: In conventional RPA pipelines, the model is re-instantiated on every prediction call. The Singleton pattern decouples model loading from inference and centralizes model lifecycle management.
    • Design Motivation: Model loading and data serialization account for approximately 78% of total processing time in traditional RPA pipelines, representing the dominant performance bottleneck. Eliminating redundant loading is the single most impactful optimization.
  2. Batch Processing Design Pattern (GPU Parallel Inference):

    • Function: Groups multiple images into a batch and performs inference in a single forward pass.
    • Mechanism: Leverages GPU parallel computation to reduce per-image kernel launch and memory transfer overhead. Upon completion, results for each image are automatically logged and the image is moved to a dedicated directory to ensure data integrity.
    • Design Motivation: Further compresses inference time beyond the Singleton baseline (from 0.28 s/image in OC-RPAv1 to 0.06 s/image in OC-RPAv2, an additional 4.7× speedup).
  3. CNN Classification Model (EfficientNetV2B1):

    • Function: 16-class classification of oral lesions.
    • Mechanism: An ImageNet-pretrained EfficientNetV2B1 serves as the backbone with 224×224×3 input. The final layer is replaced with a softmax fully connected layer. Two-stage training is employed: the backbone is frozen for 15 epochs (lr = 1e-3), followed by partial unfreezing for fine-tuning over 10 epochs (lr = 1e-5).
    • Dataset: Approximately 3,000 clinical oral images spanning 4 macro-categories (Healthy / Benign / OPMD / Oral Cancer) with 16 sub-classes. Five augmentation strategies are applied via Albumentations; classes with fewer than 200 samples undergo random oversampling.
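
The first two designs above can be illustrated with a minimal, standard-library sketch. It is not the authors' code: `ModelRegistry`, `fake_loader`, `batched`, and `classify_all` are hypothetical names, the "model" is a stand-in function rather than EfficientNetV2B1, and the inference batch size is an assumption (the source only states a training batch size of 32).

```python
import threading
from typing import Callable, Dict, Iterator, List, Sequence


class ModelRegistry:
    """Holds one shared model instance so it is loaded at most once per process."""

    _model = None
    _lock = threading.Lock()
    load_count = 0  # instrumentation for this demo only

    @classmethod
    def get(cls, loader: Callable[[], Callable]) -> Callable:
        # Double-checked locking: skip the lock once the model is resident.
        if cls._model is None:
            with cls._lock:
                if cls._model is None:
                    cls._model = loader()
                    cls.load_count += 1
        return cls._model


def fake_loader() -> Callable[[Sequence[str]], List[int]]:
    """Stand-in for an expensive load (e.g. reading CNN weights from disk)."""

    def predict(batch: Sequence[str]) -> List[int]:
        # Placeholder classifier: maps each file name to one of 16 class ids.
        return [len(name) % 16 for name in batch]

    return predict


def batched(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def classify_all(paths: Sequence[str], batch_size: int = 32) -> Dict[str, int]:
    """Classify every path with one call to the model per batch."""
    model = ModelRegistry.get(fake_loader)  # reuses the resident model after the first call
    results: Dict[str, int] = {}
    for batch in batched(paths, batch_size):
        for path, label in zip(batch, model(batch)):
            results[path] = label
    return results


paths = [f"img_{i}.png" for i in range(31)]
first = classify_all(paths, batch_size=8)
second = classify_all(paths, batch_size=8)
print(ModelRegistry.load_count, len(first))
```

In a real deployment the loader would return the trained network and `predict` would run a single forward pass over a stacked image tensor; the control flow (load once, then iterate over batches) would stay the same.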

Loss & Training

  • Loss function: Categorical cross-entropy
  • Optimizer: Adam, batch size = 32
  • Data split: Stratified sampling at 70% / 15% / 15%
  • Training techniques: Early stopping, model checkpointing (best validation accuracy), and ReduceLROnPlateau (halving the learning rate upon loss plateau)
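
The stratified 70% / 15% / 15% split listed above can be sketched in plain Python. The summary does not say which tool the authors used, so `stratified_split` is a hypothetical helper and the seed and rounding rule are assumptions made only for illustration.

```python
import random
from collections import defaultdict
from typing import Hashable, List, Sequence, Tuple


def stratified_split(
    labels: Sequence[Hashable],
    fractions: Tuple[float, float, float] = (0.70, 0.15, 0.15),
    seed: int = 0,
) -> Tuple[List[int], List[int], List[int]]:
    """Return train/val/test index lists that keep per-class proportions."""
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train: List[int] = []
    val: List[int] = []
    test: List[int] = []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # remainder goes to the test split
    return train, val, test


# Toy labels: a large class and a small class.
labels = ["benign"] * 100 + ["opmd"] * 20
train_idx, val_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(val_idx), len(test_idx))
```

Splitting per class rather than over the pooled data keeps rare sub-classes represented in every split, which matters when some of the 16 classes have few samples.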

Key Experimental Results

Main Results (Inference Efficiency Comparison on 31 Test Images)

| Platform / Method | Total Time (31 images) | Avg. Time per Image | Speedup vs. UiPath |
|---|---|---|---|
| UiPath | 80 s | 2.58 s | 1× (baseline) |
| Automation Anywhere | 75 s | 2.42 s | 1.07× |
| OC-RPAv1 (Basic Python) | 8.65 s | 0.28 s | 9.2× |
| OC-RPAv2 (Python + Design Patterns) | 1.96 s | 0.06 s | 43× |
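
The speedup column can be rechecked from the total times. The short script below uses only the numbers in the table; `speedup_table` is a hypothetical helper name. It shows that the 43× figure follows from the rounded per-image times (2.58 s / 0.06 s), while the unrounded totals (80 s / 1.96 s) give a slightly lower ratio of about 40.8×.

```python
# Total wall-clock times for the 31-image test, copied from the table above.
N_IMAGES = 31
total_seconds = {
    "UiPath": 80.0,
    "Automation Anywhere": 75.0,
    "OC-RPAv1": 8.65,
    "OC-RPAv2": 1.96,
}


def speedup_table(totals, n_images, baseline="UiPath"):
    """Per-image time and speedup relative to the baseline platform."""
    rows = {}
    for name, total in totals.items():
        rows[name] = {
            "per_image_s": total / n_images,
            "speedup": totals[baseline] / total,
        }
    return rows


rows = speedup_table(total_seconds, N_IMAGES)
for name, row in rows.items():
    print(f"{name:20s} {row['per_image_s']:.3f} s/image  {row['speedup']:.1f}x")

# Ratio of the rounded per-image figures quoted in the text.
rounded_ratio = 2.58 / 0.06
print(f"rounded per-image ratio: {rounded_ratio:.0f}x")
```

Either way, the measured ratio on this test set is in the low 40s, below the 60–100× range quoted elsewhere in the paper.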

Ablation Study / Efficiency Analysis

| Analysis Dimension | Key Data | Remarks |
|---|---|---|
| RPA platform overhead analysis | ~78% spent on non-inference operations | Model loading / data serialization are the primary bottleneck |
| Singleton contribution | v1 (0.28 s) vs. RPA (2.58 s) | Eliminating repeated loading yields 9.2× speedup |
| Batch Processing contribution | v2 (0.06 s) vs. v1 (0.28 s) | GPU parallelism delivers a further 4.7× speedup |
| Scalability estimate | 2,500 images: UiPath ≈ 1.8 h, v2 < 3 min | ~40× operational efficiency improvement |
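
The scalability estimate appears to be a linear extrapolation of the per-image times. The sketch below reproduces it under that assumption (constant cost per image, no warm-up, I/O, or memory effects, which the source does not discuss); `projected_seconds` is a hypothetical helper.

```python
def projected_seconds(n_images: int, seconds_per_image: float) -> float:
    """Linear projection: total time grows in proportion to image count."""
    return n_images * seconds_per_image


N = 2500
uipath_s = projected_seconds(N, 2.58)  # reported UiPath per-image time
v2_s = projected_seconds(N, 0.06)      # reported OC-RPAv2 per-image time

print(f"UiPath:   {uipath_s / 3600:.2f} h")
print(f"OC-RPAv2: {v2_s / 60:.1f} min")
print(f"ratio:    {uipath_s / v2_s:.0f}x")
```

With these inputs the projection gives about 1.79 h for UiPath and 2.5 min for OC-RPAv2, consistent with the "≈ 1.8 h" and "< 3 min" figures in the table.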

Key Findings

  • RPA platforms are severely inefficient for computationally intensive tasks, with the majority of time consumed by non-inference overhead.
  • The Singleton pattern's elimination of repeated model loading is the largest single source of performance gain (~9×).
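  • The authors state that introducing design patterns does not affect diagnostic accuracy, since both pipelines share the same CNN model and only execution efficiency changes; no accuracy metrics are reported to confirm this.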
  • A hybrid approach combining Python computation with RPA workflow orchestration represents the recommended best practice.

Highlights & Insights

  • This work provides the first systematic quantification of the efficiency bottleneck of conventional RPA platforms in AI inference scenarios (78% overhead attributable to non-inference operations).
  • Singleton and Batch Processing design patterns are introduced into RPA-based medical image analysis pipelines.
  • A reusable hybrid RPA + Python automation pattern is demonstrated.
  • The conclusion, while straightforward, carries practical significance: in clinical AI deployment, engineering optimization may contribute as much value as algorithmic improvement.

Limitations & Future Work

  • Extremely small test set: Only 31 test images are used, severely undermining statistical credibility.
  • Absent accuracy comparison: No classification accuracy, precision, or recall metrics are reported, and no diagnostic performance comparison across methods is provided.
  • No model-level contribution: EfficientNetV2B1 is adopted directly without architectural modification or domain-specific adaptation for oral lesions.
  • Writing quality: The paper is loosely structured, contains repeated passages, and some citations are insufficiently rigorous.
  • Limited clinical depth: The work focuses exclusively on inference speed without addressing interpretability, uncertainty quantification, or other clinically critical requirements.
  • Future work could explore the integration of additional design patterns such as Factory, Adapter, and Observer.

Related Work & Insights

  • vs. Abdellaif et al. (LMV-RPA): LMV-RPA similarly explored augmenting RPA with Python; this paper additionally quantifies the acceleration attributable to design patterns (claimed 60–100×; about 43× in the reported measurements).
  • vs. CLASEG Framework: CLASEG provides a deep learning baseline for multi-class oral lesion classification and segmentation; this paper directly reuses its model architecture.
  • Essential Positioning: This work constitutes applied software engineering research (design patterns in AI deployment), rather than algorithmic innovation.
  • Insight: The practical adoption of clinical AI systems demands not only model accuracy but also engineering-level deployment efficiency — Singleton and Batch Processing represent the most fundamental yet most impactful optimizations available.

Rating

⭐⭐ (2/5)

  • Novelty ⭐⭐: Application of established design patterns to an RPA pipeline; no algorithmic innovation.
  • Experimental Thoroughness ⭐⭐: Test scale is extremely small (31 images); accuracy metric comparisons are absent.
  • Writing Quality ⭐⭐: Loosely structured with repeated passages and inconsistent citations.
  • Value ⭐⭐⭐: Offers practical reference for engineering-level clinical AI deployment, but academic contribution is limited.