👀 About me
I am Zhenyu Yang (杨振宇), a fourth-year Ph.D. student (2022-2027) at the State Key Laboratory of Multimodal Artificial Intelligence Systems (MAIS), Institute of Automation, Chinese Academy of Sciences, advised by Prof. Changsheng Xu. Previously, I earned my Bachelor’s degree from Beijing University of Posts and Telecommunications in 2022.
My research interests include (1) streaming video understanding, (2) multimodal large language models, and (3) multimodal retrieval. I have previously worked as a research intern with the Tencent Hunyuan team, the Kuaishou Keye team, and the 360 AI Department. I welcome collaboration and am always open to discussing research opportunities, so feel free to reach out via email!
🔥 News
- 2025.11: 🏆🏆 Our paper “Multi-View Captioning with Semantic Delta Re-Ranking for Zero-Shot Composed Video Retrieval” won the Best Paper Award at ICIG 2025!
- 2025.09: 🎉🎉 Our paper “LiveStar: Live Streaming Assistant for Real-World Online Video Understanding” about streaming Video-LLMs has been accepted to NeurIPS 2025!
- 2025.08: 🎉🎉 Our paper “StreamingCoT: A Dataset for Temporal Dynamics and Multimodal Chain-of-Thought Reasoning in Streaming VideoQA” has been accepted to ACM MM 2025 Datasets!
- 2025.07: 🎉🎉 I was selected for the CIE-Tencent Doctoral Research Incentive Project, a competitive grant awarded to only 23 recipients nationwide, which comes with a research fund of 100,000 RMB.
📝 Publications
🔥 [NeurIPS’2025] Poster
LiveStar: Live Streaming Assistant for Real-World Online Video Understanding
Zhenyu Yang, Kairui Zhang, Yuhang Hu, Bing Wang, Shengsheng Qian, Bin Wen, Fan Yang, Tingting Gao, Weiming Dong, Changsheng Xu
🚀 [ICLR’2025] Spotlight
SVBench: A Benchmark with Temporal Multi-Turn Dialogues for Streaming Video Understanding
Zhenyu Yang, Yuhang Hu, Zemin Du, Dizhan Xue, Shengsheng Qian, Jiahong Wu, Fan Yang, Weiming Dong, Changsheng Xu
[Code] [Project] [Paper] [Dataset] [Model] [Leaderboard] [Submission] [中文解读]
🏆 [SIGIR’2024] Best Paper Honorable Mention
LDRE: LLM-based Divergent Reasoning and Ensemble for Zero-Shot Composed Image Retrieval
Zhenyu Yang, Dizhan Xue, Shengsheng Qian, Weiming Dong, Changsheng Xu
🎉 [ACM MM’2024] Poster
Semantic Editing Increment Benefits Zero-Shot Composed Image Retrieval
Zhenyu Yang, Shengsheng Qian, Dizhan Xue, Jiahong Wu, Fan Yang, Weiming Dong, Changsheng Xu
🎉 [ACM MM’2025] Poster
StreamingCoT: A Dataset for Temporal Dynamics and Multimodal Chain-of-Thought Reasoning in Streaming VideoQA
Yuhang Hu, Zhenyu Yang, Shihan Wang, Shengsheng Qian, Bin Wen, Fan Yang, Tingting Gao, Changsheng Xu
🏆 [ICIG’2025] Best Paper Award
Multi-View Captioning with Semantic Delta Re-Ranking for Zero-Shot Composed Video Retrieval
Zhixiang Ding, Lilong Liu, Zhenyu Yang, Shengsheng Qian
🎖 Honors and Awards
- Best Paper Honorable Mention (5/791), SIGIR, 2024
- Best Paper Award, ICIG, 2025
- Spotlight Paper (~3.27%), ICLR, 2025
- CIE-Tencent Doctoral Research Incentive Project / 混元学者 (中国电子学会-腾讯博士生科研激励计划), 2025
- National Scholarship, Ministry of Education, China, 2024
- Outstanding Graduate, Beijing, 2022
- Outstanding Graduate, Beijing University of Posts and Telecommunications, 2022
- First-Class Scholarship, Beijing University of Posts and Telecommunications, 2020/2021
- First Prize in American Mathematical Contest in Modeling (MCM), Top 6.7% Globally, 2020
📖 Education
- 2022.09 - 2027.06: Ph.D., State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing. Major: Computer Applied Technology.
- 2018.09 - 2022.06: Undergraduate, School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing. Major: Intelligent Science and Technology.
🙋 Services
- Conference Reviewer: CVPR 2026, ICLR 2026, AAAI 2026, NeurIPS 2025, ICCV 2025, ACML 2025, etc.
- Journal Reviewer: IEEE Transactions on Image Processing (TIP), ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Neurocomputing, Pattern Recognition.