Hi, this is Xiaojie Xu (徐啸捷). I am currently an M.Phil. student in Artificial Intelligence at The Hong Kong University of Science and Technology, Guangzhou, advised by Prof. Ying-Cong Chen. Previously, I received a Bachelor’s degree in Automation from the University of Science and Technology of China, where I was advised by Prof. Ligang Liu.

My research focuses on multimodal understanding and generation across text, images, videos, and 3D data.

I am always open to interesting research topics. Please feel free to contact me if you would like to collaborate 🤠.

📝 Publications

* indicates equal contribution. For a complete list of publications, please refer to my Google Scholar profile.

arXiv

VBench++: Comprehensive and Versatile Benchmark Suite for Video Generative Models

Ziqi Huang*, Fan Zhang*, Xiaojie Xu, Yinan He, Jiashuo Yu, Ziyue Dong, Qianli Ma, Nattapol Chanpaisit, Chenyang Si, Yuming Jiang, Yaohui Wang, Xinyuan Chen, Ying-Cong Chen, Limin Wang, Dahua Lin, Yu Qiao, Ziwei Liu

Submitted to a journal; more than 1k GitHub stars

EMNLP 2025, Findings

PreGenie: An Agentic Framework for High-quality Visual Presentation Generation

Xiaojie Xu, Xinli Xu, Sirui Chen, Haoyu Chen, Fan Zhang, Ying-Cong Chen

Conference on Empirical Methods in Natural Language Processing (EMNLP), Findings

EMNLP 2025, Main

Long-Video Audio Synthesis with Multi-Agent Collaboration

Yehang Zhang*, Xinli Xu*, Xiaojie Xu*, Doudou Zhang, Li Liu, Ying-Cong Chen

Conference on Empirical Methods in Natural Language Processing (EMNLP), Main

CVPR 2025

POSTA: A Go-to Framework for Customized Artistic Poster Generation

Haoyu Chen*, Xiaojie Xu*, Wenbo Li, Jingjing Ren, Tian Ye, Songhua Liu, Ying-Cong Chen, Lei Zhu, Xinchao Wang

Conference on Computer Vision and Pattern Recognition (CVPR)

ECCV 2024, Oral

Momentum Auxiliary Network for Supervised Local Learning

Junhao Su, Changpeng Cai, Feiyu Zhu, Chenghao He, Xiaojie Xu, Dongzhi Guan, Chenyang Si

European Conference on Computer Vision (ECCV), Oral Presentation, Top 2.3%

ICRA 2023

From Bird’s-Eye to Street View: Crafting Diverse and Condition-Aligned Images with Latent Diffusion Model

Xiaojie Xu, Tianshuo Xu, Fulong Ma, Ying-Cong Chen

International Conference on Robotics and Automation (ICRA)

CVPR 2021

3DCaricShop: A Dataset and A Baseline Method for Single-view 3D Caricature Face Reconstruction

Yuda Qiu, Xiaojie Xu, Lingteng Qiu, Yan Pan, Yushuang Wu, Weikai Chen, Xiaoguang Han

Conference on Computer Vision and Pattern Recognition (CVPR)

📖 Education

💻 Research Experiences

🎖 Honors and Awards

  • Postgraduate Scholarship Award at HKUST
  • Outstanding Undergraduate Scholarship Award at USTC
  • First Prize in the Chinese Physics Olympiad (CPhO), Jiangxi Province