Hi, this is Xiaojie Xu (徐啸捷). I am an incoming Ph.D. student in Information Science and Technology at The University of Tokyo. My current research focuses on generative AI, including image, video, and multimodal generation. Representative works include:
- Multimodal Generation: POSTA (visually appealing movie poster generation from text, CVPR 25), PreGenie (MLLM agents for text-image document understanding and presentation generation, EMNLP 25), Orchestrating Audio (MLLM agents for long-video understanding and audio generation, EMNLP 25)
- Image/Video Generation: VBench++ (benchmarking video generative models, T-PAMI 25), BEV to Street View (street-view image generation from bird’s-eye-view maps, ICRA 24)
Previously, I did research at Shanda AI Research Tokyo, Tencent AI Lab, and NTU MMLab. Feel free to contact me about collaboration 🤠.
📝 Recent Publications
* indicates equal contribution. For a complete list of publications, please refer to my Google Scholar profile.

VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models
Ziqi Huang*, Fan Zhang*, Xiaojie Xu, Yinan He, Jiashuo Yu, Ziyue Dong, Qianli Ma, Nattapol Chanpaisit, Chenyang Si, Yuming Jiang, Yaohui Wang, Xinyuan Chen, Ying-Cong Chen, Limin Wang, Dahua Lin, Yu Qiao, Ziwei Liu
IEEE Transactions on Pattern Analysis and Machine Intelligence (T-PAMI); GitHub stars > 1k

PreGenie: An Agentic Framework for High-quality Visual Presentation Generation
Xiaojie Xu, Xinli Xu, Sirui Chen, Haoyu Chen, Fan Zhang, Ying-Cong Chen
Conference on Empirical Methods in Natural Language Processing (EMNLP), Findings

Long-Video Audio Synthesis with Multi-Agent Collaboration
Yehang Zhang, Xinli Xu, Xiaojie Xu, Doudou Zhang, Li Liu, Ying-Cong Chen
Conference on Empirical Methods in Natural Language Processing (EMNLP), Main

POSTA: A Go-to Framework for Customized Artistic Poster Generation
Haoyu Chen*, Xiaojie Xu*, Wenbo Li, Jingjing Ren, Tian Ye, Songhua Liu, Ying-Cong Chen, Lei Zhu, Xinchao Wang
Conference on Computer Vision and Pattern Recognition (CVPR)

Xiaojie Xu, Tianshuo Xu, Fulong Ma, Ying-Cong Chen
International Conference on Robotics and Automation (ICRA)
📖 Education
Doctor of Philosophy in Information and Communication Engineering (Incoming), The University of Tokyo
Master of Philosophy in Artificial Intelligence, The Hong Kong University of Science and Technology, with Prof. Ying-Cong Chen
Bachelor of Engineering in Automation, University of Science and Technology of China, with Prof. Ligang Liu