Single Pixel Image Classification using an Ultrafast Digital Light Projector
Conference: ICLR 2026 arXiv: 2603.12036 Code: None Area: Autonomous Driving Keywords: Single-pixel imaging, image classification, microLED, Hadamard patterns, extreme learning machine
TL;DR
This paper presents an experimental single-pixel imaging (SPI) system based on a microLED-on-CMOS ultrafast digital light projector, combined with low-complexity machine learning models (ELM and DNN) to achieve sub-millisecond image encoding and kHz-rate image classification. The system attains >90% accuracy on the MNIST dataset and >99% AUC in binary classification scenarios.
Background & Motivation
- Background: Machine vision is a mature technology embedded in autonomous agents such as self-driving vehicles; however, the operational bandwidth of conventional digital cameras is becoming a bottleneck. Event cameras reduce data volume in dynamic scenes but are constrained to the visible and near-infrared spectrum.
- Limitations of Prior Work:
- Conventional SPI systems employ DMDs (digital micromirror devices) for pattern generation, which are limited by mechanical switching speeds (~\(10^4\) fps), keeping overall imaging rates comparable to standard CMOS cameras (\(\lesssim 10^2\) Hz).
- Most existing single-pixel image classification (SPIC) work relies on simulation or low-speed experiments, lacking true ultrafast optical experimental validation.
- The image reconstruction step introduces additional latency and computational overhead.
- Key Challenge: SPI requires projecting long sequences of patterns to acquire sufficient information, and projection speed constitutes the bandwidth bottleneck; compressed sensing can reduce the number of patterns but at the cost of classification accuracy.
- Goal: To experimentally validate a single-pixel image classification system based on ultrafast microLED projection that performs direct classification on photodetector time series without image reconstruction.
- Key Insight: Leveraging the ~100× faster switching speed of microLED arrays compared to DMDs to bypass image reconstruction entirely and classify directly on spatiotemporally transformed data.
- Core Idea: Reformulating image classification from the spatial domain to the spatiotemporal domain—each image is encoded as a light-intensity time series and classified directly by a low-complexity ML model.
Method
Overall Architecture
Hadamard pattern sequence → high-speed microLED projector → target image displayed on DMD → single-pixel detector captures superimposed light-intensity signal → real-time oscilloscope records time series → ML model performs direct classification (no reconstruction required).
Key Designs
- Ultrafast Single-Pixel Imaging System:
- Core hardware: 128×128 microLED-on-CMOS array, 30×30 μm² pixels, 50 μm pitch, supporting MHz-rate frame refresh.
- Projects a 12×12 Hadamard pattern set (Had12) at 330,000 fps in global shutter mode.
- Image reconstruction formula: \(I_{(x,y),M} = \frac{1}{M}\sum_{m=1}^{M} S_m P_{(x,y),m}\), where \(S_m\) is the differential signal between each pair of complementary Hadamard patterns.
- Binarized MNIST images are displayed on a DMD (1024×768 resolution).
- An Onsemi SiPM single-pixel detector captures light intensity, recorded by a 1 GHz bandwidth oscilloscope.
- Design Motivation: The MHz-level switching speed of microLEDs overcomes the mechanical limitations of DMDs, enabling true kHz-rate SPI.
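The reconstruction formula above can be sketched in a few lines of NumPy. This is a minimal simulation, not the authors' code: it assumes a power-of-two pattern size so the Sylvester construction applies (the paper's Had12 basis is order 12), and it models the differential signal \(S_m\) as the difference between the single-pixel measurements of each complementary (+/−) pattern pair.

```python
import numpy as np

def hadamard(n):
    # Sylvester construction: n must be a power of two.
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

# Simulated 4x4 target image (illustrative stand-in for the DMD-displayed digit).
rng = np.random.default_rng(0)
n = 4
image = rng.random((n, n))

# 2D Hadamard patterns: outer products of rows of the 1D basis.
H = hadamard(n)
patterns = np.array([np.outer(H[i], H[j]) for i in range(n) for j in range(n)])
M = len(patterns)  # M = n*n patterns

# Each +/-1 pattern is projected as a complementary pair of non-negative
# patterns; the differential signal S_m is the difference of the two
# single-pixel measurements, i.e. the inner product <image, P_m>.
pos = (patterns > 0).astype(float)
neg = (patterns < 0).astype(float)
S = np.array([(image * p).sum() - (image * q).sum() for p, q in zip(pos, neg)])

# Reconstruction: I = (1/M) * sum_m S_m * P_m
recon = (S[:, None, None] * patterns).sum(axis=0) / M
assert np.allclose(recon, image)
```

Because the 2D Hadamard patterns form an orthogonal basis with squared norm \(n^2 = M\), the \(1/M\) factor recovers the image exactly in this noiseless setting; the classifiers below skip this step and operate on \(S\) directly.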
- Extreme Learning Machine (ELM) Classifier:
- Single hidden-layer neural network with randomly initialized and fixed input weights.
- Hidden layer output: \(H = f(XW_{\text{in}} + b)\), using ReLU activation.
- Output weights solved in closed form via Ridge regression: \(\beta = (H^\top H + \alpha I)^{-1} H^\top T\).
- Multi-class prediction uses \(\hat{y} = \arg\max(Y)\); binary classification uses a threshold of 0.5.
- Regularization parameter \(\alpha = 1.0\).
- Design Motivation: ELM training is extremely fast (no iterative optimization); inference requires only 31 μs/image, making it well-suited for ultrafast scenarios.
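The ELM described above admits a very short implementation, since training reduces to one ridge-regression solve. The sketch below uses toy data with illustrative dimensions (the real input is the recorded light-intensity time series); \(\alpha = 1.0\) matches the paper's regularization setting.

```python
import numpy as np

rng = np.random.default_rng(42)

def elm_fit(X, T, hidden=64, alpha=1.0):
    """Fit an ELM: random fixed input weights, ReLU hidden layer,
    closed-form ridge-regression output weights."""
    d = X.shape[1]
    W_in = rng.standard_normal((d, hidden))
    b = rng.standard_normal(hidden)
    H = np.maximum(X @ W_in + b, 0.0)  # H = ReLU(X W_in + b)
    # beta = (H^T H + alpha I)^{-1} H^T T, solved without explicit inversion.
    beta = np.linalg.solve(H.T @ H + alpha * np.eye(hidden), H.T @ T)
    return W_in, b, beta

def elm_predict(X, W_in, b, beta):
    H = np.maximum(X @ W_in + b, 0.0)
    return (H @ beta).argmax(axis=1)  # multi-class: argmax over output scores

# Toy 3-class problem standing in for the measured time-series inputs.
X = rng.standard_normal((300, 20))
labels = X[:, :3].argmax(axis=1)
T = np.eye(3)[labels]  # one-hot targets
W_in, b, beta = elm_fit(X, T)
acc = (elm_predict(X, W_in, b, beta) == labels).mean()
print(f"training accuracy: {acc:.2f}")
```

Only `beta` is learned; `W_in` and `b` stay at their random initialization, which is why there is no iterative optimization at all.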
- Deep Neural Network (DNN) Classifier:
- Feedforward DNN: input layer (286-dimensional) → three hidden layers with decreasing width + ReLU → softmax output.
- Adam optimizer, sparse categorical cross-entropy loss, trained for 300 epochs.
- Inference time: 73 μs/image.
- Design Motivation: Serves as a higher-complexity baseline for comparison with ELM, exploring the accuracy–speed trade-off.
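A forward pass of the DNN baseline can be sketched as below. The 286-dimensional input matches the paper; the specific hidden widths (128/64/32) are assumptions for illustration, since the summary only states that they decrease.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def softmax(x):
    z = x - x.max(axis=1, keepdims=True)  # numerically stabilized
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

# 286-dim input -> three hidden layers of decreasing width -> 10-way softmax.
widths = [286, 128, 64, 32, 10]
params = [(rng.standard_normal((i, o)) * np.sqrt(2.0 / i), np.zeros(o))
          for i, o in zip(widths[:-1], widths[1:])]

def forward(x):
    for k, (W, b) in enumerate(params):
        x = x @ W + b
        if k < len(params) - 1:
            x = relu(x)  # ReLU on hidden layers only
    return softmax(x)

probs = forward(rng.standard_normal((5, 286)))
print(probs.shape)  # one 10-class distribution per input time series
```

Training (Adam, sparse categorical cross-entropy, 300 epochs) would sit on top of this forward pass in any standard framework; only the architecture is sketched here.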
- Hadamard Pattern Subset Optimization:
- Low-index (low spatial frequency) Hadamard patterns are found to carry more classification-relevant information.
- Using only the first 1/4 of the patterns still yields ≈78% classification accuracy.
- Cat1 (first 44 patterns) varies along a single spatial axis and captures coarse features; Cat2 (patterns 45–288) varies along two spatial directions and captures finer features.
- Design Motivation: Reducing the number of projected patterns proportionally increases the effective imaging bandwidth.
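The subset strategy can be illustrated as follows: order the 2D Hadamard patterns so that Cat1-like patterns (varying along at most one spatial axis) come first, then keep only the lowest-index quarter for a 4× bandwidth gain. An order-8 Sylvester basis stands in for the paper's order-12 Had12 set, and this particular ordering rule is an assumption for illustration.

```python
import numpy as np

def hadamard(n):
    # Sylvester construction (n a power of two).
    H = np.array([[1.0]])
    while H.shape[0] < n:
        H = np.block([[H, H], [H, -H]])
    return H

n = 8
H = hadamard(n)

# Index pairs (i, j) sorted so that patterns varying along a single axis
# (i == 0 or j == 0, Cat1-like) precede those varying along both axes
# (Cat2-like); ties broken by increasing spatial frequency i + j.
order = sorted(((i, j) for i in range(n) for j in range(n)),
               key=lambda ij: (min(ij) > 0, ij[0] + ij[1]))
patterns = [np.outer(H[i], H[j]) for i, j in order]

full = len(patterns)           # 64 patterns for n = 8
kept = patterns[: full // 4]   # first quarter -> 4x effective imaging bandwidth
print(len(kept))
```

Projecting `kept` instead of `patterns` shortens each acquisition proportionally, which is exactly the accuracy-for-bandwidth trade examined in the ablations below.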
Loss & Training
- ELM: closed-form Ridge regression solution, \(\alpha = 1.0\), no iterative training.
- DNN: Adam optimizer, sparse categorical cross-entropy loss, 300 epochs.
- Data: MNIST dataset (60K training, 10K testing), binarized and rescaled to fill the full DMD surface.
Key Experimental Results
Main Results
| Method / Configuration | Accuracy (%) | Inference Speed | Notes |
|---|---|---|---|
| DNN + full Had12 (experimental) | >90 | 73 μs/image | 1.2 kfps frame rate |
| ELM + full Had12 (experimental) | 87.37 | 31 μs/image | 2× faster than DNN |
| DNN + binarized MNIST (simulation) | 97.50 | — | Theoretical upper bound |
| ELM + binarized MNIST (simulation) | 93.32 | — | ELM upper bound |
| ELM binary classification (one-vs-all) | AUC >99% | — | Anomaly detection |
Ablation Study
| Configuration | Accuracy (%) | Notes |
|---|---|---|
| Full Had12 (DNN) | >90 | All 144 patterns |
| First 1/2 Had12 | ~86 | Minor accuracy drop |
| First 1/4 Had12 | ~78 | Acceptable accuracy, ×4 bandwidth |
| First 1/8 Had12 | ~68 | Significant accuracy drop |
| Last 1/2 Had12 | ~75 | High-frequency patterns less informative |
| Random 1/2 Had12 | ~82 | Between first/last halves |
| Gaussian noise σ=0.1 | >95 | Minimal noise impact |
| Gaussian noise σ=0.5 | >95 | Convergence maintained |
| Gaussian noise σ=1.0 | ~85 | Notable drop and variance |
Key Findings
- The primary cause of accuracy degradation is not reduced effective SNR but spatial information loss due to compressed sensing.
- Low spatial frequency Hadamard patterns contribute most to classification; high-frequency patterns are less informative.
- Although ELM achieves lower accuracy than DNN, its inference speed is 2× faster, making it suitable for extreme real-time scenarios.
- When the number of patterns is reduced, the DNN's training exhibits longer vanishing-gradient plateaus, consistent with the compressed nature of the inputs.
- In binary classification, AUC approaches 1.0, indicating suitability for anomaly detection in rapidly changing scenes.
Highlights & Insights
- This work provides the first experimental validation of single-pixel image classification at kHz frame rates, surpassing the speed limitations of conventional imaging systems.
- Bypassing image reconstruction entirely greatly simplifies the system pipeline and reduces latency.
- The minimalist design of the ELM model aligns well with the demands of ultrafast scenarios, offering fast training, fast inference, and low overhead.
- The frequency-domain analysis of Hadamard pattern subsets offers practical guidance for compression strategy design.
- The comparative experiments on noise versus compressed sensing provide valuable theoretical insights.
Limitations & Future Work
- Validation is limited to the MNIST dataset, which is relatively simple and far from real-world machine vision scenarios.
- The 12×12 Hadamard pattern resolution is low, limiting the system's ability to resolve complex images.
- The storage depth of the current FPGA board constrains the size of the pattern set.
- The SiPM detector and oscilloscope on the sensing side are difficult to miniaturize and integrate.
- More complex ML models (e.g., CNNs) and larger-scale datasets have not been explored.
- Substantial work remains to transfer the approach from MNIST to practical autonomous driving scenarios.
Related Work & Insights
- Compressed sensing theory provides the mathematical foundation for reducing the number of projected patterns.
- The application of microLED arrays in analog optical computing underscores their central role in next-generation optical computing architectures.
- Reconstruction-free SPIC methods have advanced rapidly in recent years; this paper represents the fastest experimentally validated instance among them.
- The combination of low-complexity models such as ELM and reservoir computing with optical hardware is a promising research direction.
- Single-pixel imaging technology holds unique advantages in non-visible spectral bands (terahertz, ultraviolet).
Rating
- Novelty: ⭐⭐⭐⭐
- Experimental Thoroughness: ⭐⭐⭐
- Writing Quality: ⭐⭐⭐⭐
- Value: ⭐⭐⭐