About me

I am currently a Research Assistant Professor at the Department of Computing, The Hong Kong Polytechnic University. I am also a member of the Internet and Mobile Computing Laboratory (IMCL), led by Prof. Jiannong Cao.

I received my Ph.D. degree from the Department of Computer Science, The University of Hong Kong, where I was very fortunate to work with my supervisor, Prof. Cho-Li Wang. Before studying at HKU, I worked with Prof. Ray C. C. Cheung in the CALAS Lab at City University of Hong Kong. I received my Bachelor’s degree in Computer Science from Xi’an Jiaotong University.

Email: zhaorui.zhang@polyu.edu.hk

Address: PQ748, Mong Man Wai Building, Hong Kong Polytechnic University

Openings!

We are actively looking for motivated colleagues and students at all levels (Postdoc, PhD, MSc, Undergraduate, etc.). If you are interested in LLM fine-tuning, inference optimization, or Agentic AI, you are welcome to reach out and join us!

  • PhD: We have PhD positions open for 25/26 now. If you are interested in LLMs/MLSys, Agentic AI, HPC, or Distributed Systems, you are welcome to reach out and join us!

  • Postdoc Positions: We currently have postdoc positions related to large-scale machine learning systems and to checkpointing design and optimization. You are welcome to reach out and join us!

  • Capstone Project/URIS Project for Undergraduate Students at PolyU (FYP): We have openings every year for Capstone and URIS projects related to LLMs, LLM inference, and Agentic AI, carried out in collaboration with industry. You are welcome to reach out and join us! We will provide hands-on guidance for your work.

  • MSc Dissertation/Thesis at PolyU: We have openings every year for MSc dissertations and theses related to LLMs, LLM inference, and Agentic AI, carried out in collaboration with industry. You are welcome to reach out and join us! We will provide hands-on guidance for your work.

Research Interests:

LLMs, Agentic AI, MLSys, AI Infrastructure:

  • LLM Fine-Tuning, Checkpointing, Inference Optimization, and Agentic AI: I am broadly interested in building and optimizing AI systems (MLSys) from both the systems side and the machine learning algorithm side, across a wide range of computing platforms (e.g., distributed, cloud, HPC, IoT, AIoT, and even quantum and photonic platforms), for emerging big data and AI applications. Topics include distributed communication optimization, data compression, and fault tolerance.

HPC, Distributed Systems, Cloud Computing, FPGAs:

  • I am also interested in high-performance computing (HPC), distributed systems, data reduction, cloud computing, error-bounded lossy compression, fault tolerance, and FPGAs.

Recent Highlight:

  • 6 May 2025, Our paper “GlobaZip: An Interactive, Efficient Distributed Compression-as-a-Service Platform with Optimized Data Compression Techniques” was accepted by IEEE Transactions on Parallel and Distributed Systems (TPDS), Special Section on New Tools and Techniques for the Distributed Computing Continuum (DCC).

  • 5 May 2025, Our paper “Recursive Confidence Training for Pseudo-Labeling Calibration in Semi-Supervised Few-Shot Learning” was accepted by IEEE Transactions on Image Processing (TIP).

  • 13 April 2025, Our paper “FedCSpc: A Cross-Silo Federated Learning System with Error-Bounded Lossy Parameter Compression” was accepted by IEEE Transactions on Parallel and Distributed Systems (TPDS).

  • 7 April 2025, Our survey paper on lossy compression, “A Survey on Error-Bounded Lossy Compression for Scientific Datasets”, has been accepted by ACM Computing Surveys. This work offers a comprehensive survey and analysis of lossy compression techniques for scientific data across various domains, including climate, seismic, medical, time series, point cloud, and AI data. You are welcome to read and cite our survey paper!

  • 29 Jan. 2025, Invited to serve on the Technical Program Committee of SC’25, the International Conference for High-Performance Computing, Networking, Storage and Analysis.

  • 26 Nov. 2024, Our paper “StoreLLM: Energy Efficient Large Language Model Inference with Permanently Pre-stored Attention Matrices” has been accepted by ACM e-Energy 2025, the 16th ACM International Conference on Future and Sustainable Energy Systems, June 17–20, 2025, Rotterdam, Netherlands.

  • 18 July 2024, Our paper “A Compiler-Like Framework for Optimizing Cryptographic Big Integer Multiplication on GPUs” has been accepted by MICRO’2024, the 57th IEEE/ACM International Symposium on Microarchitecture, November 2–6, 2024, Austin, Texas, USA.

  • 14 June 2024, Our paper “Versatile Datapath Soft Error Detection on the Cheap for HPC Applications” has been accepted by SC’2024, the International Conference for High-Performance Computing, Networking, Storage and Analysis, November 17–22, Atlanta, GA, USA.

  • 17 April 2024, Our paper “FedFa: A Fully Asynchronous Training Paradigm for Federated Learning” has been accepted by IJCAI’2024, the International Joint Conference on Artificial Intelligence, August 3–9, 2024, Jeju.

  • 31 Jan. 2024, Our paper “An Optimized Error-controlled MPI Collective Framework Integrated with Lossy Compression” has been accepted by IPDPS’2024, the 38th IEEE International Parallel & Distributed Processing Symposium, May 27–31, 2024, San Francisco, California, USA.

  • 29 Nov. 2023, Our paper “Accelerating High-Precision Integer Multiplication used in Cryptosystems with GPUs” has been accepted by PPoPP’2024, the ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, March 2–6, 2024, Edinburgh, UK.

Academic Employment Experience:

  • The Hong Kong Polytechnic University
    Research Assistant Professor in the Department of Computing