Publications
Google Scholar
* denotes students under my supervision
Preprints:
CLLoRA: An Approach to Measure the Effects of the Context Length for LLM Fine-Tuning, 2025
Ping Zhang*, Zhaorui Zhang, Sheng Di, Yao Xin, Benben Liu
HLoRA: Efficient Federated Learning System for LLM Heterogeneous Fine-Tuning, 2025
Qianli Liu*, Zhaorui Zhang, Xin Yao, Benben Liu
ZCCL: Significantly Improving Collective Communication With Error-Bounded Lossy Compression, 2025
Jiajun Huang, Sheng Di, Xiaodong Yu, Yujia Zhai, Zhaorui Zhang, Jinyang Liu, Xiaoyi Lu, Ken Raffenetti, Hui Zhou, Kai Zhao, Khalid Alharthi, Zizhong Chen, Franck Cappello, Yanfei Guo, Rajeev Thakur
2025:
GlobaZip: An Interactive, Efficient Distributed Compression-as-a-Service Platform with Optimized Data Compression Techniques
Yuanjian Liu, Sheng Di, Zhaorui Zhang, Jiajun Huang, Kyle Chard, Ian Foster
IEEE Transactions on Parallel and Distributed Systems, Special Section on New Tools and Techniques for the Distributed Computing Continuum (DCC) (TPDS-SS), 2025
(CCF-A Journal for Distributed and Parallel Systems)
FedCSpc: A Cross-Silo Federated Learning System with Error-Bounded Lossy Parameter Compression
Zhaorui Zhang, Sheng Di, Kai Zhao, Sian Jin, Dingwen Tao, Zhuoran Ji, Benben Liu, Khalid Ayed Alharthi, Jiannong Cao, Franck Cappello
IEEE Transactions on Parallel and Distributed Systems (TPDS), 2025
(CCF-A Journal for Distributed and Parallel Systems)
A Survey on Error-Bounded Lossy Compression for Scientific Datasets
Sheng Di, Jinyang Liu, Kai Zhao, Xin Liang, Robert Underwood, Zhaorui Zhang, Milan Shan, Yafan Huang, Jiajun Huang, Xiaodong Yu, Congrong Ren, Hanqi Guo, Grant Wilkins, Dingwen Tao, Jiannan Tian, Sian Jin, Zizhe Jian, Daoce Wang, MD Hasanur Rahman, Boyuan Zhang, Shihui Song, Jon C. Calhoun, Guanpeng Li, Kazutomo Yoshii, Khalid Ayed Alharthi, Franck Cappello
ACM Computing Surveys, 2025
(Impact Factor: 23.8, Ranked 1/143 in Computer Science Theory & Methods)
Recursive Confidence Training for Pseudo-Labeling Calibration in Semi-Supervised Few-Shot Learning
Kunlei Jing, Hebo Ma, Lei Wen, Chen Zhang, Zhaorui Zhang, Lionel Ni
IEEE Transactions on Image Processing (TIP), 2025
(CCF-A Journal for Image Processing)
StoreLLM: Energy Efficient Large Language Model Inference with Permanently Pre-stored Attention Matrices
Dan Wang, Boan Liu, Rui Lu, Zhaorui Zhang, Shuntao Zhu
The 16th ACM International Conference on Future and Sustainable Energy Systems, (ACM e-Energy), 2025
REFS: A Novel Framework for Accelerated Receive Encrypted Flow Steering
Zengxie Ma, Yao Xin, Ning Hu, Tong Li, Zhaorui Zhang, Feng Zhang
The Computer Journal, 2025
BLIP-Eye: Vision-Language Pre-training Classification Model for Eye Diseases Using OCT Scans
Khalid Ayed Alharthi, Saja A Alshahrani, Rasha O Alshahrani, Rahaf A Alshahrani, Ali Alshahrani, Zhaorui Zhang
International Conference on AI in Medicine and Healthcare, 2025
2024:
A Compiler-Like Framework for Optimizing Cryptographic Big Integer Multiplication on GPUs
Zhuoran Ji, Jianyu Zhao, Zhaorui Zhang, Jiming Xu, Shoumeng Yan, Lei Ju
57th IEEE/ACM International Symposium on Microarchitecture, (MICRO’24), 2024
(CCF-A Conference for Computer Architecture)
Versatile Datapath Soft Error Detection on the Cheap for HPC Applications
Yafan Huang, Sheng Di, Zhaorui Zhang, Xiaoyi Lu, Guanpeng Li
The International Conference for High-Performance Computing, Networking, Storage and Analysis, (SC’24), 2024
(CCF-A Conference for High-Performance Computing)
FedFa: A Fully Asynchronous Training Paradigm for Federated Learning
Haotian Xu*, Zhaorui Zhang, Sheng Di, Benben Liu, Khalid Alharthi, Jiannong Cao
International Joint Conference on Artificial Intelligence, (IJCAI’24), 2024
(CCF-A Conference for AI Systems)
An Optimized Error-controlled MPI Collective Framework Integrated with Lossy Compression
Jiajun Huang, Sheng Di, Xiaodong Yu, Yujia Zhai, Zhaorui Zhang, Jinyang Liu, Ken Raffenetti, Hui Zhou, Kai Zhao, Zizhong Chen, Franck Cappello, Yanfei Guo, Rajeev Thakur
38th IEEE International Parallel & Distributed Processing Symposium, (IPDPS’24), 2024
(CCF-B Conference for Distributed and Parallel Systems)
Accelerating High-Precision Integer Multiplication used in Cryptosystems with GPUs
Zhuoran Ji, Zhaorui Zhang, Jiming Xu, Lei Ju
ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, (PPoPP’24), 2024
(CCF-A Conference for Distributed and Parallel Systems)
2023 and Before:
MIPD: An Adaptive Gradient Sparsification Framework for Distributed DNNs Training
Zhaorui Zhang, Cho-Li Wang
IEEE Transactions on Parallel and Distributed Systems (TPDS), 2022
(CCF-A Journal for Distributed and Parallel Systems)
Momentum-Driven Adaptive Synchronization Model for Distributed DNN Training on HPC Clusters
Zhaorui Zhang, Zhuoran Ji, Cho-Li Wang
Journal of Parallel and Distributed Computing (JPDC), 2022
(CCF-B Journal for Distributed and Parallel Systems)
Efficient Parameter Update Strategy for Distributed Deep Learning Systems
Zhaorui Zhang
HKU Theses Online (HKUTO), 2021
SaPus: Self-Adaptive Parameter Update Strategy for DNN Training on Multi-GPU Clusters
Zhaorui Zhang, Cho-Li Wang
IEEE Transactions on Parallel and Distributed Systems (TPDS), 2021
(CCF-A Journal for Distributed and Parallel Systems)
Development Report on National High-Performance Computing Environment
Xuebin Chi, Liping Liu, Yangang Wang, Zhaorui Zhang, et al.
Book, published by Science Press
FPGA-based High-Performance Collision Detection: An Enabling Technique for Image-Guided Robotic Surgery
Zhaorui Zhang, Xin Y, Liu B, Li WXY, Lee K.H., Ng C.F., Stoyanov D, Cheung RCC, Kwok KW
Frontiers in Robotics and AI
An Application Specific Instruction Set Processor (ASIP) for Adaptive Filters in Neural Prosthetics
Yao Xin, Will X. Y. Li, Zhaorui Zhang, Ray C. C. Cheung, Dong Song, Theodore W. Berger
IEEE/ACM Transactions on Computational Biology and Bioinformatics, (TCBB)