Towards on-device learning and reconfigurable hardware implementation for encoded single-photon signal processing
Zang, Zhenya and Li, Xingda and Li, David (2026) Towards on-device learning and reconfigurable hardware implementation for encoded single-photon signal processing. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems. ISSN 1937-4151 (https://doi.org/10.1109/TCAD.2026.3685523)
Text (Accepted Author Manuscript), 12MB
Filename: Zang-etal-IEEE-TCADICS-2026-Towards-on-device-learning-and-reconfigurable-hardware-implementation.pdf
Abstract
Deep neural networks (DNNs) enhance the accuracy and efficiency of reconstructing key parameters from time-resolved photon arrival signals recorded by single-photon detectors. However, the performance of conventional backpropagation-based DNNs depends on the optical setup and the biological samples, necessitating frequent retraining of the network, either via transfer learning or from scratch. Newly collected data must be stored and transferred to a high-performance GPU server for retraining, introducing latency and storage overhead. To address these challenges, we propose an online training algorithm based on a one-sided Jacobi rotation-based online sequential extreme learning machine (OSOS-ELM). We fully exploit parallelism and partition-independent functions to execute OSOS-ELM on a heterogeneous FPGA with integrated ARM cores. Extensive evaluations of OSOS-ELM and OS-ELM demonstrate that both achieve comparable accuracy across different network dimensions (i.e., input, hidden, and output layers), while OSOS-ELM is more hardware-efficient. By leveraging the parallelism of OSOS-ELM, we implement a holistic computing prototype on a Xilinx ZCU104 FPGA. We validate our approach through three typical case studies involving single-photon signal analysis: fog sensing with a commercial single-photon LiDAR, fluorescence lifetime estimation in fluorescence lifetime imaging, and blood flow index reconstruction in diffuse correlation spectroscopy, all of which use one-dimensional data encoded from photonic signals. From a hardware perspective, we optimize the OSOS-ELM workload by employing multi-tasked processing on ARM CPU cores and pipelined execution on the FPGA’s logic fabric. We also implement our OSOS-ELM on the NVIDIA Jetson Xavier NX GPU to comprehensively evaluate its computational performance on another heterogeneous computing platform.
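For readers unfamiliar with the OS-ELM baseline named in the abstract, the following is a minimal sketch of a conventional online sequential ELM update. It is not the OSOS-ELM proposed in the paper: the one-sided Jacobi rotation step and the FPGA mapping are not reproduced here. The sketch assumes a single output, one training sample per update, a tanh hidden layer, and a regularised start (P = I / ridge). The class name `TinyOSELM`, the function `demo`, and all parameters are hypothetical and chosen only for illustration.

```python
import math
import random


class TinyOSELM:
    """Single-output OS-ELM with chunk size 1 (illustrative only)."""

    def __init__(self, n_in, n_hidden, ridge=1e-3, seed=0):
        rng = random.Random(seed)
        # Random input weights and biases are fixed; they are never trained.
        self.w = [[rng.uniform(-1, 1) for _ in range(n_in)]
                  for _ in range(n_hidden)]
        self.b = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        # Output weights start at zero; P starts as I / ridge (assumption).
        self.beta = [0.0] * n_hidden
        self.p = [[(1.0 / ridge if i == j else 0.0) for j in range(n_hidden)]
                  for i in range(n_hidden)]

    def hidden(self, x):
        # Hidden-layer activations h = tanh(W x + b).
        return [math.tanh(sum(wi * xi for wi, xi in zip(row, x)) + bi)
                for row, bi in zip(self.w, self.b)]

    def predict(self, x):
        return sum(hi * bi for hi, bi in zip(self.hidden(x), self.beta))

    def update(self, x, t):
        """Fold one (input, target) pair into the output weights."""
        h = self.hidden(x)
        n = len(h)
        ph = [sum(self.p[i][j] * h[j] for j in range(n)) for i in range(n)]
        denom = 1.0 + sum(hi * phi for hi, phi in zip(h, ph))
        # Sherman-Morrison rank-one update: P <- P - (P h)(P h)^T / denom.
        # P is symmetric, so h^T P equals (P h)^T.
        for i in range(n):
            for j in range(n):
                self.p[i][j] -= ph[i] * ph[j] / denom
        err = t - sum(hi * bi for hi, bi in zip(h, self.beta))
        # The gain vector P_new h simplifies to (P h) / denom.
        for i in range(n):
            self.beta[i] += ph[i] / denom * err


def demo():
    """Return (error before training, error after training) on sin(3x)."""
    model = TinyOSELM(n_in=1, n_hidden=20, seed=1)
    grid = [i / 10 - 1 for i in range(21)]

    def mse():
        return sum((model.predict([x]) - math.sin(3 * x)) ** 2
                   for x in grid) / len(grid)

    before = mse()
    rng = random.Random(2)
    for _ in range(300):
        x = rng.uniform(-1, 1)
        model.update([x], math.sin(3 * x))  # samples arrive one at a time
    return before, mse()
```

Because each update touches only the current sample and the matrix P, no past data needs to be stored or shipped to a server, which is the property that makes this family of methods attractive for on-device learning.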
ORCID iDs
Zang, Zhenya, Li, Xingda and Li, David
ORCID: https://orcid.org/0000-0002-6401-4263;
Item type: Article
ID code: 96043
Dates: 20 April 2026 (Published); 20 April 2026 (Published Online); 16 April 2026 (Accepted)
Subjects: Medicine > Biomedical engineering. Electronics. Instrumentation
Department: Faculty of Engineering > Biomedical Engineering; Strategic Research Themes > Health and Wellbeing
Depositing user: Pure Administrator
Date deposited: 20 Apr 2026 14:07
Last modified: 02 Jun 2026 07:12
URI: https://strathprints.strath.ac.uk/id/eprint/96043