[NeurIPS 2025 Spotlight] Think or Not Think: A Study of Explicit Thinking in Rule-Based Visual Reinforcement Fine-Tuning
			Ming Li, Jike Zhong, Shitian Zhao, Yuxiang Lai, Haoquan Zhang, Wang Bill Zhu, Kaipeng Zhang† 
			
 
			
			
            [NeurIPS 2025]  Sekai: A Video Dataset towards World Exploration 
            Zhen Li, Chuanhao Li, Xiaofeng Mao, Shaoheng Lin, Ming Li, Shitian Zhao, Zhaopan Xu, Xinyue Li, Yukang Feng, Jianwen Sun, Zizhen Li, Fanrui Zhang, Jiaxin Ai, Zhixiang Wang, Yuwei Wu†, Tong He, Yunde Jia, Kaipeng Zhang†
			 
			
			
			
            [NeurIPS 2025]  Neural-Driven Image Editing 
            Pengfei Zhou, Jie Xia, Xiaopeng Peng, Wangbo Zhao, Zilong Ye, Zekai Li, Suorong Yang, Jiadong Pan, Yuanxiang Chen, Ziqiao Wang, Kai Wang, Qian Zheng, Xiaojun Chang, Gang Pan, Shurong Dong†, Kaipeng Zhang†, Yang You
			 
			
			
            [NeurIPS 2025]  REPA Works Until It Doesn’t: Early-Stopped, Holistic Alignment Supercharges Diffusion Training 
            Ziqiao Wang, Wangbo Zhao, Yuhao Zhou, Zekai Li, Zhiyuan Liang, Mingjia Shi, Xuanlei Zhao, Pengfei Zhou, Kaipeng Zhang†, Zhangyang Wang, Kai Wang†, Yang You
			 
			
			
			
			
            [EMNLP 2025]  InMind: Evaluating LLMs in Capturing and Applying Individual Human Reasoning Styles 
            Zizhen Li, Chuanhao Li, Yibin Wang, Qi Chen, Diping Song, Yukang Feng, Jianwen Sun, Jiaxin Ai, Fanrui Zhang, Mingzhu Sun, Kaipeng Zhang† 
 
			 
			
			
            [ICCV 2025]  ProJudge: A Multi-Modal Multi-Discipline Benchmark and Instruction-Tuning Dataset for MLLM-based Process Judges 
            Jiaxin Ai, Pengfei Zhou, Zhaopan Xu, Ming Li, Fanrui Zhang, Zizhen Li, Jianwen Sun, Yukang Feng, Baojin Huang, Zhongyuan Wang†, Kaipeng Zhang†
 
			 
			
			
			
			
			
			
            [ICCV 2025]  LiT: Delving into a Simple Linear Diffusion Transformer for Image Generation 
            Jiahao Wang, Ning Kang, Lewei Yao, Mengzhao Chen, Chengyue Wu, Songyang Zhang, Shuchen Xue, Yong Liu, Taiqiang Wu, Xihui Liu, Kaipeng Zhang, Shifeng Zhang, Wenqi Shao†, Zhenguo Li†, Ping Luo
 
			 
			
			
			
			
			
			
			
			
			
			
			
			
			
			
            [CVPR 2025 Oral]  OpenING: A Comprehensive Benchmark for Judging Open-ended Interleaved Image-Text Generation 
            Pengfei Zhou*, Xiaopeng Peng*, Jiajun Song, Chuanhao Li, Zhaopan Xu, Yue Yang, Ziyao Guo, Hao Zhang, Yuqi Lin, Yefei He, Lirui Zhao, Shuo Liu, Tianhua Li, Yuxuan Xie, Xiaojun Chang, Yu Qiao, Wenqi Shao, Ping Luo, Kaipeng Zhang†
			 
			
			
			
			
			
            [ICLR 2025]  MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models 
            Fanqing Meng*, Chuanhao Li*, Jin Wang*, Quanfeng Lu, Hao Tian, Tianshuo Yang, Jiaqi Liao, Xizhou Zhu, Jifeng Dai, Yu Qiao, Ping Luo, Kaipeng Zhang†, Wenqi Shao†
 
			 
		
			
            [NeurIPS 2024 Spotlight]  ConvBench: A Multi-Turn Conversation Evaluation Benchmark with Hierarchical Capability for Large Vision-Language Models 
            Shuo Liu, Kaining Ying, Hao Zhang, Yue Yang, Yuqi Lin, Tianle Zhang, Chuanhao Li, Yu Qiao, Ping Luo, Wenqi Shao†, Kaipeng Zhang†  
			 
			
			
			
			
            [NeurIPS 2024]  Rethinking Human Evaluation Protocol for Text-to-Video Models: Enhancing Reliability, Reproducibility, and Practicality
            Tianle Zhang, Langtian Ma, Yuchen Yan, Yuchen Zhang, Kai Wang, Yue Yang, Ziyao Guo, Wenqi Shao, Yang You, Yu Qiao, Ping Luo, Kaipeng Zhang†  
			 
			
			
            [NeurIPS 2024]  Needle In A Multimodal Haystack 
            Weiyun Wang, Shuibo Zhang, Yiming Ren, Yuchen Duan, Tiantong Li, Shuo Liu, Mengkang Hu, Zhe Chen, Kaipeng Zhang, Lewei Lu, Xizhou Zhu, Ping Luo, Yu Qiao, Jifeng Dai, Wenqi Shao†, Wenhai Wang†
			 
			
			
            [NeurIPS 2024]  Lumina-Next: Making Lumina-T2X Stronger and Faster with Next-DiT 
            Le Zhuo*, Ruoyi Du*, Han Xiao*, Yangguang Li*, Dongyang Liu*, Rongjie Huang*, Wenze Liu*, Lirui Zhao, Fu-Yun Wang, Zhanyu Ma, Xu Luo, Zehan Wang, Kaipeng Zhang, Xiangyang Zhu, Si Liu, Xiangyu Yue, Dingning Liu, Wanli Ouyang, Ziwei Liu, Yu Qiao†, Hongsheng Li†, Peng Gao†
			 
		  
          
			
			
			
            [ICML 2024]  MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI 
            Kaining Ying*, Fanqing Meng*, Jin Wang*, Zhiqian Li, Han Lin, Yue Yang, Hao Zhang, Wenbo Zhang, Yuqi Lin, Shuo Liu, Jiayi Lei, Quanfeng Lu, Cunjian Chen, Peng Xu, Renrui Zhang, Haozhe Zhang, Peng Gao, Yali Wang, Yu Qiao, Ping Luo, Kaipeng Zhang† and Wenqi Shao†
			 
			
			
            [ICML 2024]  SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models 
            Peng Gao*†, Renrui Zhang*, Chris Liu*, Longtian Qiu*, Siyuan Huang*, Weifeng Lin*, Shitian Zhao, Shijie Geng, Ziyi Lin, Peng Jin, Kaipeng Zhang, Wenqi Shao, Chao Xu, Conghui He, Junjun He, Hao Shao, Pan Lu, Hongsheng Li† and Yu Qiao
			 
		
			
			
			
	
			
			
			
			
			
      
	  
			
			
			
			
			
			
			
			
			
			
			[Journal Papers]
			
			
			
			
			
			
			
			
			
			
			
            [TBigData 2024] Tiny LVLM-eHub: Early Multimodal Experiments with Bard 
			Wenqi Shao*, Yutao Hu*, Peng Gao*, Meng Lei*, Kaipeng Zhang, Fanqing Meng, Peng Xu, Siyuan Huang, Hongsheng Li, Yu Qiao†, Ping Luo†
			 
			
		  
			
	
			
			
			
			[Tutorial]
			 
    
    
      
        Education 
        
		
		
          
          
            
            
              
                Ph.D. in CS, The University of Tokyo, Tokyo, Japan
                Apr. 2019 - Mar. 2022
               
             
    
           
         
		
        
          
          
            
            
              
                M.S. in CS, National Taiwan University, Taipei, Taiwan
                Sep. 2016 - Aug. 2018
               
             
    
           
         
		
          
          
            
            
              
                B.Eng. in CS, Donghua University, Shanghai, China
                Sep. 2012 - Jul. 2016
               
             
    
           
         
        
	  
	  
      
        Selected Awards and Competitions 
        
        WAIC Young Outstanding Paper Award, 2022
        Emotion Recognition in the Wild: Engagement Prediction (ICMI 2019 Grand Challenge), 3rd place
        Emotion Recognition in the Wild: Group-based Cohesion Prediction (ICMI 2019 Grand Challenge), 2nd place
        Disguised Faces in the Wild Challenge (in conjunction with CVPR 2018), 1st place
        Emotion Recognition in the Wild: Group-level emotion recognition (ICMI 2018 Grand Challenge), 2nd place
        Emotion Recognition in the Wild: Group-level emotion recognition (ICMI 2017 Grand Challenge), 1st place
        ChaLearn Looking at People Challenge: Accessories Classification (in conjunction with CVPR 2016), 1st place
        ChaLearn Looking at People Challenge: Smile and Gender Classification (in conjunction with CVPR 2016), 1st place
     
	
	
       
        Academic Service 
        
			 
     
	
    
        
        
            Work Experience 
            
				
            
                
                Researcher 
                Shanghai AI Lab
				OpenGVLab
				Shanghai, China
                May 2022 - Present
             
            
            
                
                Researcher 
                SenseTime
				Research Institute
				Shenzhen, China
                Sep. 2018 - Mar. 2019
             
			 
			
			
			
			
                
                Intern 
                MSRA
                Visual Computing Group
				Beijing, China
                Jan. 2018 - Jul. 2018
             
          
			
                
                Consultant 
                ULSee
                Face Team
				Hangzhou, China
                Oct. 2016 - Mar. 2018
				
             
             
			
			
			
			
                
                Intern 
                Tencent
                AI Lab & AI Advertisement Department
				Shenzhen, China
                Jul. 2017 - Aug. 2017
                Sep. 2020 - Feb. 2021
				
             
		    
            
            
                    
                    Visiting Student 
                    Shenzhen Institutes of Advanced Technology
                    Multimedia Research Center
					Shenzhen, China
                    Jul. 2015 - Aug. 2016