
Shuai Kyle Zheng
I like training deep neural nets that compress large-scale datasets and perform like generalists.
I enjoy building AI systems, particularly in computer vision and language. I believe in a full-stack view of AI systems: a working AI application is not just the model, but also the data and the whole pipeline.
I did my PhD in deep learning and computer vision with Prof. Philip H. S. Torr at Oxford. During my PhD I also interned at Baidu IDL and visited Prof. Carsten Rother's group. I started my research in computer vision at the National Laboratory of Pattern Recognition. I was a Member of Technical Staff and Research Scientist at eBay, where I played a role in the development of the shopping chatbot, shop-the-look, sneaker-timeline, and camera-as-commerce-platform. I was also a founding member of Dawnlight, where I worked on a human activity recognition system for low-powered devices. I am currently a research scientist at GM Cruise, working on multi-modal multi-task learning and foundational Vision-Language Models for autonomous driving.
Timeline
Senior Research Scientist II. Currently at Cruise, focusing on Vision-Language Models (VLMs) and multi-task learning for autonomous vehicles.
Senior Research Engineer at Verkada. Led the efforts on computer vision problems such as tracking and license plate recognition.
Founding Member of Dawnlight. Led the efforts in human activity recognition on low-powered devices.
Member of Technical Staff & Research Scientist at eBay.
PhD at Oxford University under the supervision of Prof. Philip H. S. Torr, focusing on deep learning and computer vision.
Internship at Baidu Institute of Deep Learning, working on recurrent neural networks, semantic segmentation and depth estimation.
Visiting Prof. Carsten Rother's group at TU Dresden.
Researcher and Engineer at VisionHacker / Megvii.
Master's at the National Laboratory of Pattern Recognition, Chinese Academy of Sciences, under the supervision of Kaiqi Huang and Tieniu Tan, focusing on object detection, image classification, and person re-identification.
Projects
Here are some of my projects and research:
- CRFASRNN - A semantic image segmentation system using neural networks and end-to-end trainable conditional random fields.
- GP-GAN - An image inpainting system using a Gaussian-Poisson Generative Adversarial Network.
- DeepGuidedFilter - An end-to-end trainable guided filter with neural networks for dense image prediction.
- ModaNet - A large-scale street fashion dataset with polygon annotations.
- PROFIT - A drop-in optimizer for temporal multi-task learning.
- ZSD-YOLO - Empowering YOLO-v5 object detection with zero-shot learning capability.