Hi, I'm Run Luo. I am currently a third-year Master's student in Computer Technology at the University of Chinese Academy of Sciences (UCAS), supervised by Prof. Min Yang. I received my Bachelor's degree in Software Engineering from Huazhong University of Science and Technology (HUST) in 2023. My research interests focus on Visual Tracking, Diffusion Models, Multi-modal Learning, and Large Language Models. I am currently exploring a unified omnimodal foundation model spanning vision, audio, and text, in the hope that generation and understanding can reinforce each other and thereby extend the intelligence boundaries of existing models. I firmly believe that such a model can unify the paradigms of world models and vision-language-action models and, through this, benefit interaction across different physical devices and the real world.
News
- 2025.09: Two papers are accepted by NeurIPS 2025.
- 2025.05: Two papers are accepted by ACL 2025.
- 2025.01: One paper is accepted by ICLR 2025.
- 2024.11: Two papers are accepted by AAAI 2024.
- 2024.07: One paper is accepted by EMNLP 2024.
- 2024.04: Joined Tongyi Lab@Beijing for an internship.
- 2024.04: Two papers are accepted by ACL 2024.
- 2023.11: One paper is accepted by AAAI 2023.
- 2022.11: One paper is accepted by AAAI 2022.
Publications
* indicates equal contribution
- GUI-R1: A Generalist R1-Style Vision-Language Action Model For GUI Agents. arXiv.
Run Luo, Lu Wang, Wanwei He, Longze Chen, Jiaming Li, Min Yang, Xiaobo Xia
[Paper] [Code]
- OpenOmni: Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Alignment and Real-Time Self-Aware Emotional Speech Synthesis. NeurIPS 2025.
Run Luo, Ting-En Lin, Haonan Zhang, Yuchuan Wu, Xiong Liu, Min Yang, Yongbin Li, Longze Chen, Jiaming Li, Lei Zhang, Xiaobo Xia, Hamid Alinejad-Rokny, Fei Huang
[Paper] [Code]
- VCM: Vision Concept Modeling with Adaptive Vision Token Compression. NeurIPS 2025.
Run Luo, Renke Shan, Longze Chen, Ziqiang Liu, Lu Wang, Min Yang, Xiaobo Xia
[Paper]
- OmniCharacter: Towards Immersive Role-Playing Agents with Seamless Speech-Language Personality Interaction. ACL 2025.
Haonan Zhang*, Run Luo*, Xiong Liu, Yuchuan Wu, Ting-En Lin, Pengpeng Zeng, Qiang Qu, Feiteng Fang, Min Yang, Lianli Gao, Jingkuan Song, Fei Huang, Yongbin Li
[Paper] [Code]
- MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct. ACL 2025 (Findings).
Run Luo, Haonan Zhang, Longze Chen, Ting-En Lin, Xiong Liu, Yuchuan Wu, Min Yang, Minzheng Wang, Pengpeng Zeng, Lianli Gao, Heng Tao Shen, Yunshui Li, Xiaobo Xia, Fei Huang, Jingkuan Song, Yongbin Li
[Project Page] [Paper] [Code]
- DEEM: Diffusion models serve as the eyes of large language models for image perception. ICLR 2025 Spotlight.
Run Luo, Yunshui Li, Longze Chen, Wanwei He, Ting-En Lin, Ziqiang Liu, Lei Zhang, Zikai Song, Xiaobo Xia, Tongliang Liu, Min Yang, Binyuan Hui
[Paper] [Code]
- DiffusionTrack: Diffusion Model For Multi-Object Tracking. AAAI 2023.
Run Luo, Zikai Song, Lintao Ma, Jinlin Wei, Wei Yang, Min Yang
[Paper] [Code]
- Compact Transformer Tracker with Correlative Masked Modeling. AAAI 2022.
Zikai Song*, Run Luo*, Junqing Yu, Chen Yi-Ping Phoebe, Wei Yang
[Paper] [Code]
Honors and Scholarships
- 2025.10 National Scholarship.
Education
- 2023.09 - 2026.06, University of Chinese Academy of Sciences, Master of Computer Technology.
- 2019.09 - 2023.06, Huazhong University of Science and Technology, Bachelor of Software Engineering.
Internships
- 2024.04 - 2024.11, Tongyi Lab, Alibaba Group, China.
Services
Reviewer for CVPR 23-25, ICCV 25, ICLR 25-26, ICML 25, NeurIPS 24-25, AAAI 22-26, ACL 24-25, EMNLP 24-25, ACM MM 23-25, etc.