Qing Qu Receives 2025 Google Research Scholar Award for Developing Efficient Generative AI Models

Prof. Qu’s research aims to reduce the cost and complexity of fine-tuning and training large-scale foundation models, while advancing the accessibility and sustainability of high-performance generative AI.
Qing Qu. Photo: Jero Lopera

Qing Qu, Assistant Professor in the Department of Electrical Engineering and Computer Science at the University of Michigan, has been named a recipient of the 2025 Google Research Scholar Award in machine learning. The award recognizes his pioneering project, “Harnessing Compressible Dynamics for Adapting & Training Foundation Models,” which addresses key efficiency challenges in training and fine-tuning foundation models while offering rigorous mathematical guarantees.

“Training and deploying modern machine learning models demands significant computational resources, raising concerns about GPU shortages and rising energy consumption—especially with the empirical scaling law suggesting that exponential growth in model size and training data yields only marginal performance gains,” explained Qu. “This project aims to address the challenges by harnessing the intrinsic low-dimensional structures in data and models to greatly improve the efficiency, adaptability, and scalability of large-scale foundation models for generative AI.”

An illustration of invariant low-dimensional subspaces across learning dynamics in deep low-rank adaptation (Deep LoRA) of language models [1].

Foundation models such as LLaMA and Gemma, which form the backbone of modern natural language processing and vision applications, require massive computational resources. Qu’s project aims to tackle these challenges head-on by leveraging mathematical insights into low-dimensional structures hidden in model weights and learning dynamics to develop novel low-rank fine-tuning and compressive training methodologies.

Qu will build upon his recent innovation, “Deep LoRA,” which received an oral presentation at ICML’24 [1] and won the Best Student Poster Award at MMLS’24, to develop rank-agnostic, task-adaptive low-rank adaptation (LoRA) methods for fine-tuning large language models. These techniques aim to simplify hyperparameter tuning and boost performance in low-data scenarios. This thrust also explores training at the “Edge of Stability” (EoS), a regime that uses large learning rates to foster better generalization and faster convergence [3]. The second thrust will develop a novel training paradigm using BLAST (Block-Level Adaptive Structured) matrices [4], which go beyond conventional low-rank approximations to provide faster computation and greater expressivity.
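For readers unfamiliar with low-rank adaptation, the sketch below illustrates the basic idea behind LoRA-style fine-tuning: the pretrained weight matrix is frozen, and only a small low-rank correction is trained. This is a minimal, illustrative example rather than the Deep LoRA method of [1]; the LoRALinear class, the rank and scaling hyperparameters, and the layer dimensions are all assumptions chosen for illustration.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer augmented with a trainable low-rank update W + (alpha/r) * B @ A.
    Hypothetical example for illustration; not the Deep LoRA implementation."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # freeze the pretrained weights
            p.requires_grad = False
        d_out, d_in = base.weight.shape
        self.A = nn.Parameter(torch.randn(rank, d_in) * 0.01)  # small random init
        self.B = nn.Parameter(torch.zeros(d_out, rank))        # zero init: update starts inactive
        self.scale = alpha / rank

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Frozen layer output plus the low-rank correction scale * (B @ A) x
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)


# Example: wrap a single (assumed 768-dimensional) projection layer
layer = LoRALinear(nn.Linear(768, 768), rank=8)
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(f"trainable parameters: {trainable}")  # 2 * 8 * 768 = 12,288 instead of 768 * 768 = 589,824
```

Because only the factors A and B are trained, the number of trainable parameters grows linearly with the rank rather than quadratically with the layer width, which is what makes fine-tuning large models on modest hardware feasible.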

“I am deeply grateful to Google Research for their generous support. Together, we aim to reduce the cost and complexity of training large-scale models, while making high-performance generative AI more accessible and sustainable,” said Qu.

Qu has presented related research at several leading machine learning conferences, including ICML [1], AISTATS [2], ICLR [3], NeurIPS [4], and the Google Theory and Practice of Foundation Models Workshop. Together with his students and Prof. Laura Balzano, he recently wrote a survey paper [5] on this topic. His work has previously been recognized with an NSF CAREER Award in 2022, an Amazon AWS AI Award in 2023, and the U-M Chinese Heritage & Scholarship Junior Faculty Award in 2025.


[1] Can Yaras, Peng Wang, Laura Balzano, and Qing Qu. Compressible dynamics in deep overparameterized low-rank learning & adaptation. In Forty-first International Conference on Machine Learning, 2024. (Oral, top 1.5%, MMLS Best Poster Award) 

[2] Soo Min Kwon, Zekai Zhang, Dogyoon Song, Laura Balzano, and Qing Qu. Efficient low-dimensional compression of overparameterized models. In Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, pages 1009–1017, 2024. 

[3] Avrajit Ghosh, Soo Min Kwon, Rongrong Wang, Saiprasad Ravishankar, and Qing Qu. Learning dynamics of deep matrix factorization beyond the edge of stability. In The Thirteenth International Conference on Learning Representations, 2025. 

[4] Changwoo Lee, Soo Min Kwon, Qing Qu, and Hun-Seok Kim. BLAST: Block-level adaptive structured matrices for efficient deep neural network inference. In The Thirty-eighth Annual Conference on Neural Information Processing Systems, 2024. 

[5] Laura Balzano, Tianjiao Ding, Benjamin D. Haeffele, Soo Min Kwon, Qing Qu, Peng Wang, Zhangyang Wang, and Can Yaras. An overview of low-rank structures in the training and adaptation of large models. arXiv preprint arXiv:2503.19859, 2025. (Authors are listed alphabetically)