Shuai Kyle Zheng

I like training deep neural nets that compress large-scale datasets and perform like generalists.

I enjoy building AI systems, particularly in Computer Vision and Language. I believe in a full-stack view of AI systems: a working AI application is not just the model, but also the data and the whole pipeline.

I did my PhD in deep learning and Computer Vision with Prof. Philip H. S. Torr at Oxford. During my PhD I also interned at Baidu IDL and visited Prof. Carsten Rother's group. I started my research in Computer Vision at the National Laboratory of Pattern Recognition. I was a Member of Technical Staff and Research Scientist at eBay, where I played a role in the development of the shopping chatbot, shop-the-look, sneaker-timeline, and camera-as-commerce-platform. I was also a founding member of Dawnlight, working on human activity recognition systems for low-powered devices. I am currently a research scientist at GM Cruise, working on multi-modal multi-task learning and foundational vision-language models for autonomous driving.

Timeline

2021 - Present

Staff Research Scientist. Currently at GM Cruise, focusing on Vision-Language Models (VLMs) and Multi-Task Learning for Autonomous Vehicles. 🚘🤖➡️🛣️

2021

Senior Research Engineer at Verkada. Led the efforts on computer vision problems such as tracking and license plate recognition. 🧠📹🔍

2019 - 2021

Founding Member of Dawnlight. Led the efforts in human activity recognition on low-powered devices. 🛏️🧠👀📈

2015 - 2017

Member of Technical Staff & Research Scientist at eBay. 📸🧠👗

2011 - 2016

PhD at Oxford University under the supervision of Prof. Philip H. S. Torr, focusing on deep learning and computer vision.

2014

Internship at Baidu Institute of Deep Learning, working on recurrent neural networks, semantic segmentation and depth estimation.

2013

Visiting Prof. Carsten Rother's group at TU Dresden.

2011

Researcher and Engineer at VisionHacker / Megvii.

2008 - 2011

Master's at the National Laboratory of Pattern Recognition, Chinese Academy of Sciences, under the supervision of Kaiqi Huang and Tieniu Tan, focusing on object detection, image classification, and person re-identification.

Projects

Here are some of my projects and research:

CRFASRNN - A semantic image segmentation system using neural networks and end-to-end trainable conditional random fields.
GP-GAN - An image inpainting system using Gaussian-Poisson Generative Adversarial Network.
DeepGuidedFilter - An end-to-end trainable guided filter with neural networks for dense image prediction.
ModaNet - A Large-Scale Street Fashion Dataset with Polygon Annotations.
PROFIT - A drop-in optimizer for temporal multi-task learning.
ZSD-YOLO - Empowering YOLO-v5 object detection with zero-shot learning capability.

Recent Peer-Reviewed Publications

Anirudh S. Chakravarthy, Shuai Kyle Zheng, Xin Huang, Sachithra Hemachandra, Xiao Zhang, Yuning Chai, Zhao Chen. PROFIT: A Specialized Optimizer for Deep Fine Tuning. NeurIPS 2025 (poster). Preprint: arXiv:2412.01930.
