Hi, I’m Run Luo. I am currently a third-year Master’s student in Computer Technology at the University of Chinese Academy of Sciences (UCAS), supervised by Prof. Min Yang. I received my Bachelor’s degree in Software Engineering from Huazhong University of Science and Technology (HUST) in 2023. My research interests include Visual Tracking, Diffusion Models, Multi-modal Learning, and Large Language Models. I am currently exploring a unified omnimodal foundation model spanning vision, audio, and text. I hope such a model can benefit from both generation and understanding, with each capability reinforcing the other, and thereby extend the intelligence boundaries of existing models. I believe this approach can unify the paradigms of world models and vision-language-action models, and in turn improve interaction across physical devices and the real world.

🔥 News

2026.04: 🎉🎉 Two papers are accepted by ACL 2026.
2026.01: 🎉🎉 Two papers are accepted by ICLR 2026.
2025.09: 🎉🎉 Two papers are accepted by NeurIPS 2025.
2025.05: 🎉🎉 Two papers are accepted by ACL 2025.
2025.01: 🎉🎉 One paper is accepted by ICLR 2025.
2024.11: 🎉🎉 Two papers are accepted by AAAI 2024.
2024.07: 🎉🎉 One paper is accepted by EMNLP 2024.
2024.04: 🎉🎉 Joined Tongyi Lab@Beijing for an internship.
2024.04: 🎉🎉 Two papers are accepted by ACL 2024.
2023.11: 🎉🎉 One paper is accepted by AAAI 2023.
2022.11: 🎉🎉 One paper is accepted by AAAI 2022.

📝 Selected Publications

* indicates equal contribution

NExT-OMNI: Towards Any-to-Any Omnimodal Foundation Models with Discrete Flow Matching. ICLR 2026.
Run Luo, Xiaobo Xia, Lu Wang, Longze Chen, Renke Shan, Jing Luo, Min Yang, Tat-Seng Chua
[Paper] [Code (coming soon)]
GUI-R1: A Generalist R1-Style Vision-Language Action Model For GUI Agents. arXiv.
Run Luo, Lu Wang, Wanwei He, Longze Chen, Jiaming Li, Min Yang, Xiaobo Xia
[Paper] [Code]
OpenOmni: Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Alignment and Real-Time Self-Aware Emotional Speech Synthesis. NeurIPS 2025.
Run Luo, Ting-En Lin, Haonan Zhang, Yuchuan Wu, Xiong Liu, Min Yang, Yongbin Li, Longze Chen, Jiaming Li, Lei Zhang, Xiaobo Xia, Hamid Alinejad-Rokny, Fei Huang
[Paper] [Code]
VCM: Vision Concept Modeling with Adaptive Vision Token Compression. NeurIPS 2025.
Run Luo, Renke Shan, Longze Chen, Ziqiang Liu, Lu Wang, Min Yang, Xiaobo Xia
[Paper]
OmniCharacter: Towards Immersive Role-Playing Agents with Seamless Speech-Language Personality Interaction. ACL 2025.
Haonan Zhang*, Run Luo*, Xiong Liu, Yuchuan Wu, Ting-En Lin, Pengpeng Zeng, Qiang Qu, Feiteng Fang, Min Yang, Lianli Gao, Jingkuan Song, Fei Huang, Yongbin Li
[Paper] [Code]
MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct. ACL 2025 (Findings).
Run Luo, Haonan Zhang, Longze Chen, Ting-En Lin, Xiong Liu, Yuchuan Wu, Min Yang, Minzheng Wang, Pengpeng Zeng, Lianli Gao, Heng Tao Shen, Yunshui Li, Xiaobo Xia, Fei Huang, Jingkuan Song, Yongbin Li
[Project Page] [Paper] [Code]
DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception. ICLR 2025 Spotlight.
Run Luo, Yunshui Li, Longze Chen, Wanwei He, Ting-En Lin, Ziqiang Liu, Lei Zhang, Zikai Song, Xiaobo Xia, Tongliang Liu, Min Yang, Binyuan Hui
[Paper] [Code]
DiffusionTrack: Diffusion Model For Multi-Object Tracking. AAAI 2023.
Run Luo, Zikai Song, Lintao Ma, Jinlin Wei, Wei Yang, Min Yang
[Paper] [Code]
Compact Transformer Tracker with Correlative Masked Modeling. AAAI 2022.
Zikai Song*, Run Luo*, Junqing Yu, Chen Yi-Ping Phoebe, Wei Yang
[Paper] [Code]

🎖 Honors and Scholarships

2025.10 National Scholarship.

📖 Education

2023.09 - 2026.06, University of Chinese Academy of Sciences, Master of Computer Technology.
2019.09 - 2023.06, Huazhong University of Science and Technology, Bachelor of Software Engineering.

💻 Internships

2024.04 - 2024.11, Tongyi Lab, Alibaba Group, China.

💬 Services

Reviewer for CVPR23-25, ICCV25, ICLR25-26, ICML25, NeurIPS24-25, AAAI22-26, ACL24-25, EMNLP24-25, ACM MM23-25, etc.
