Kaiyang Zhou is an Assistant Professor in the Department of Computer Science at Hong Kong Baptist University. His research interests include machine learning, computer vision, and multimodality. He has published an edited book on Large Vision-Language Models and more than 50 journal and conference papers in top-tier venues, including TPAMI, TIP, IJCV, CVPR, ICCV, ECCV, NeurIPS, ICLR, ICML, and AAAI. His work has been cited over 19,000 times. He is an associate editor of the International Journal of Computer Vision and regularly serves as an area chair for prestigious conferences such as CVPR, ECCV, NeurIPS, ICML, and ICLR. Before joining HKBU, he was a postdoc at Nanyang Technological University, working with Prof. Ziwei Liu and Prof. Chen Change Loy. He received his PhD in Computer Science from the University of Surrey, under the supervision of Prof. Tao Xiang. During his PhD, he was fortunate to intern at Samsung AI Center Cambridge.
News
- Dec 2025 Invited to serve as area chair of ECCV 2026.
- Nov 2025 Invited to serve as area chair of ICML 2026.
- Sep 2025 Our edited book Large Vision-Language Models is online.
- Aug 2025 Invited to serve as area chair of ICLR 2026.
- Aug 2025 Invited to serve as area chair of CVPR 2026.
- Jul 2025 Invited to serve as area chair of AAAI 2026.
Research
Generally interested in machine learning and computer vision, with the goal of building general-purpose intelligence that can see, reason, and act safely and reliably in an unpredictable world. Currently focusing on vision-language models, multimodality, agents, and embodied AI.
Recent Papers
The papers shown below give an overview of the topics I am working on.
- Fine-tuning Quantized Neural Networks with Zeroth-order Optimization ICLR, 2026 [paper] [code]
- Streaming Video Instruction Tuning arXiv, 2025 [paper] [code] [dataset]
- Learning to Think Fast and Slow for Visual Language Models arXiv, 2025 [paper] [code] [model]
- Visionary-R1: Mitigating Shortcuts in Visual Reasoning with Reinforcement Learning arXiv, 2025 [paper] [code] [model]
- Measuring Epistemic Humility in Multimodal Large Language Models arXiv, 2025 [paper] [code] [dataset]
- Mitigating Hallucination in Multimodal LLMs with Layer Contrastive Decoding NeurIPS Workshop on Multimodal Algorithmic Reasoning, 2025 [paper] [code]
- Bootstrapping Grounded Chain-of-Thought in Multimodal LLMs for Data-Efficient Model Adaptation ICCV (Highlight), 2025 [paper] [code]
Selected Publications
- Conditional Prompt Learning for Vision-Language Models CVPR, 2022 [paper] [code]
- Learning to Prompt for Vision-Language Models IJCV, 2022 [paper] [code]
- Domain Generalization with MixStyle ICLR, 2021 [paper] [code]
- Omni-Scale Feature Learning for Person Re-Identification ICCV, 2019 [paper] [code] [model]
- Deep Reinforcement Learning for Unsupervised Video Summarization with Diversity-Representativeness Reward AAAI, 2018 [paper] [code]
Team
I am always recruiting self-motivated PhD students and research assistants interested in LLMs, VLMs, agents, and robotics. Ideal candidates should have a strong background in ML/CV/NLP, solid coding skills, and prior research experience. If you are passionate about doing cutting-edge AI research with us, please send me an email with your CV, transcripts, relevant publications or projects, and research statement (if any). For postdoc applications, you are also welcome to reach out to me, but funding depends on availability.
PhD Students
- Jiaer Xia (2024 - Present)
- Sifeng Shang (2024 - Present)
- Jiayi Zhou (2025 - Present)
- Chenyu Lin (2025 - Present)
Research Assistants
- Linchao Pan (2025 - Present)
- Haichen He (2025 - Present)
Alumni
- Yu Tong (RA 2025)
- Bingkui Tong (RA 2024-25, now PhD at MBZUAI)
Teaching
Services
- Associate Editor, International Journal of Computer Vision (IJCV) (2023 - Present)
- Guest Editor, IJCV Special Issue on Visual Domain Generalization in Real-World Applications (2024)
- Guest Editor, IJCV Special Issue on The Promises and Dangers of Large Vision Models (2023)
- Area Chair, International Conference on Machine Learning (ICML) (2025, 2026)
- Area Chair, International Conference on Learning Representations (ICLR) (2025, 2026)
- Area Chair, Neural Information Processing Systems (NeurIPS) (2024, 2025)
- Area Chair, Computer Vision and Pattern Recognition (CVPR) (2024, 2026)
- Area Chair, European Conference on Computer Vision (ECCV) (2024, 2026)
- Area Chair, AAAI Conference on Artificial Intelligence (AAAI) (2023 - 2026)
- Area Chair, British Machine Vision Conference (BMVC) (2022, 2024)
- Organizer, CVPR 2025 Workshop on Domain Generalization
- Organizer, ECCV 2024 Workshop on Green Foundation Models
- Organizer, CVPR 2024 Workshop on Prompting in Vision
- Organizer, CVPR 2023 Tutorial on Prompting in Vision
- Organizer, ICLR 2023 Workshop on What Do We Need for Successful Domain Generalization
- Organizer, The AI Talks
Awards
- 2025 CoOp received WAIC Youth Outstanding Paper Nomination Award
- 2024 HKBU Research Excellence Paper Award
- 2023 World’s Top 2% Scientists
- 2022 CoCoOp named a Top-100 Most Cited AI Paper in 2022
- 2022 ECCV 2022 Outstanding Reviewer
- 2021 ICCV 2021 Outstanding Reviewer
- 2021 AAAI 2021 Top 25% of Program Committee Members
Talks
- 2026 IAPR/IEEE Winter School on Biometrics 2026
- 2025 University of Macau
- 2025 HKBU-NVIDIA Joint Symposium 2025
- 2025 Southern University of Science and Technology
- 2025 IAPR/IEEE Winter School on Biometrics 2025
- 2024 HKBU-RIKEN AIP Joint Workshop on AI and ML
- 2024 Huawei Noah’s Ark Lab
- 2023 Vision and Learning Seminar (VALSE)
- 2023 University of Sydney
- 2023 AIGC-2023 Workshop on Trustworthy Foundation Models under Imperfect Data
- 2023 IJCAI-2023 Symposium Session on Medical Large Models
- 2023 University of Tokyo
- 2023 CVPR 2023 Tutorial on Prompting in Vision
- 2023 Chinese University of Hong Kong
- 2023 Hong Kong Baptist University
- 2023 Nanyang Technological University
- 2023 National University of Singapore
- 2021 NTU IET CV Workshop
- 2020 University of Surrey CVSSP
- 2020 NTU MMLab
- 2018 QMUL Intelligent Sensing Summer School