👀 About me

I am Zhenyu Yang (杨振宇), a fourth-year Ph.D. student (2022-2027) at the State Key Laboratory of Multimodal Artificial Intelligence Systems (MAIS), Institute of Automation, Chinese Academy of Sciences, advised by Prof. Changsheng Xu. Previously, I earned my Bachelor’s degree from Beijing University of Posts and Telecommunications in 2022.

My research interests include (1) Streaming Video Understanding, (2) Multimodal Large Language Models, and (3) Multimodal Retrieval. I have worked as a research intern with the Tencent Hunyuan team, the Kuaishou Keye team, and the 360 AI Department. I welcome collaboration and am always open to discussing research opportunities. Feel free to reach out via email!

🔥 News

📝 Publications

NeurIPS 2025

🔥 [NeurIPS’2025] Poster

LiveStar: Live Streaming Assistant for Real-World Online Video Understanding

Zhenyu Yang, Kairui Zhang, Yuhang Hu, Bing Wang, Shengsheng Qian, Bin Wen, Fan Yang, Tingting Gao, Weiming Dong, Changsheng Xu

[Code] [Project] [Paper] [Chinese Explainer]

ICLR 2025

🚀 [ICLR’2025] Spotlight

SVBench: A Benchmark with Temporal Multi-Turn Dialogues for Streaming Video Understanding

Zhenyu Yang, Yuhang Hu, Zemin Du, Dizhan Xue, Shengsheng Qian, Jiahong Wu, Fan Yang, Weiming Dong, Changsheng Xu

[Code] [Project] [Paper] [Dataset] [Model] [Leaderboard] [Submission] [Chinese Explainer]

SIGIR 2024

🏆 [SIGIR’2024] Best Paper Honorable Mention

LDRE: LLM-based Divergent Reasoning and Ensemble for Zero-Shot Composed Image Retrieval

Zhenyu Yang, Dizhan Xue, Shengsheng Qian, Weiming Dong, Changsheng Xu

[Code] [Paper] [Video]

ACM MM 2024

🎉 [ACM MM’2024] Poster

Semantic Editing Increment Benefits Zero-Shot Composed Image Retrieval

Zhenyu Yang, Shengsheng Qian, Dizhan Xue, Jiahong Wu, Fan Yang, Weiming Dong, Changsheng Xu

[Code] [Paper]

ACM MM 2025

🎉 [ACM MM’2025] Poster

StreamingCoT: A Dataset for Temporal Dynamics and Multimodal Chain-of-Thought Reasoning in Streaming VideoQA

Yuhang Hu, Zhenyu Yang, Shihan Wang, Shengsheng Qian, Bin Wen, Fan Yang, Tingting Gao, Changsheng Xu

[Code] [Paper]

ICIG 2025

🏆 [ICIG’2025] Best Paper Award

Multi-View Captioning with Semantic Delta Re-Ranking for Zero-Shot Composed Video Retrieval

Zhixiang Ding, Lilong Liu, Zhenyu Yang, Shengsheng Qian

[Code] [Project] [Paper]

🎖 Honors and Awards

  • Best Paper Honorable Mention (5/791), SIGIR, 2024
  • Best Paper Award, ICIG, 2025
  • Spotlight Paper (~3.27%), ICLR, 2025
  • CIE-Tencent Doctoral Research Incentive Project (Hunyuan Scholar; 中国电子学会-腾讯博士生科研激励计划), 2025
  • National Scholarship, Ministry of Education, China, 2024
  • Outstanding Graduate, Beijing, 2022
  • Outstanding Graduate, Beijing University of Posts and Telecommunications, 2022
  • First-Class Scholarship, Beijing University of Posts and Telecommunications, 2020/2021
  • First Prize in American Mathematical Contest in Modeling (MCM), Top 6.7% Globally, 2020

📖 Education

  • 2022.09 - 2027.06: Ph.D., State Key Laboratory of Multimodal Artificial Intelligence Systems, Institute of Automation, Chinese Academy of Sciences, Beijing. Major: Computer Applied Technology.
  • 2018.09 - 2022.06: Undergraduate, School of Computer Science (National Pilot Software Engineering School), Beijing University of Posts and Telecommunications, Beijing. Major: Intelligent Science and Technology.

🙋 Services

  • Conference Reviewer: CVPR 2026, ICLR 2026, AAAI 2026, NeurIPS 2025, ICCV 2025, ACML 2025, etc.
  • Journal Reviewer: IEEE Transactions on Image Processing (TIP), ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), Neurocomputing, Pattern Recognition.