Hi, I’m Run Luo. I am a third-year Master’s student in Computer Technology at the University of Chinese Academy of Sciences (UCAS), supervised by Prof. Min Yang. I received my Bachelor’s degree in Software Engineering from Huazhong University of Science and Technology (HUST) in 2023. My research interests include Visual Tracking, Diffusion Models, Multi-modal Learning, and Large Language Models. I am currently exploring a unified omnimodal foundation model spanning vision, audio, and text. I hope such a model can create synergy between generation and understanding, with each capability benefiting the other, and thereby extend the intelligence boundaries of existing models. I firmly believe that it can unify the paradigms of world models and vision-language-action models, and in doing so improve interaction across physical devices and with the real world.

🔥 News

  • 2025.09: 🎉🎉 Two papers are accepted by NeurIPS 2025.
  • 2025.05: 🎉🎉 Two papers are accepted by ACL 2025.
  • 2025.01: 🎉🎉 One paper is accepted by ICLR 2025.
  • 2024.11: 🎉🎉 Two papers are accepted by AAAI 2024.
  • 2024.07: 🎉🎉 One paper is accepted by EMNLP 2024.
  • 2024.04: 🎉🎉 Joined Tongyi Lab@Beijing for an internship.
  • 2024.04: 🎉🎉 Two papers are accepted by ACL 2024.
  • 2023.11: 🎉🎉 One paper is accepted by AAAI 2023.
  • 2022.11: 🎉🎉 One paper is accepted by AAAI 2022.

📝 Publications

* indicates equal contribution

  • GUI-R1: A Generalist R1-Style Vision-Language Action Model For GUI Agents. arXiv.
    Run Luo, Lu Wang, Wanwei He, Longze Chen, Jiaming Li, Min Yang, Xiaobo Xia
    [Paper] [Code]
  • OpenOmni: Advancing Open-Source Omnimodal Large Language Models with Progressive Multimodal Alignment and Real-Time Self-Aware Emotional Speech Synthesis. NeurIPS 2025.
    Run Luo, Ting-En Lin, Haonan Zhang, Yuchuan Wu, Xiong Liu, Min Yang, Yongbin Li, Longze Chen, Jiaming Li, Lei Zhang, Xiaobo Xia, Hamid Alinejad-Rokny, Fei Huang
    [Paper] [Code]
  • VCM: Vision Concept Modeling with Adaptive Vision Token Compression. NeurIPS 2025.
    Run Luo, Renke Shan, Longze Chen, Ziqiang Liu, Lu Wang, Min Yang, Xiaobo Xia
    [Paper]
  • OmniCharacter: Towards Immersive Role-Playing Agents with Seamless Speech-Language Personality Interaction. ACL 2025.
    Haonan Zhang*, Run Luo*, Xiong Liu, Yuchuan Wu, Ting-En Lin, Pengpeng Zeng, Qiang Qu, Feiteng Fang, Min Yang, Lianli Gao, Jingkuan Song, Fei Huang, Yongbin Li
    [Paper] [Code]
  • MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct. ACL 2025 (Findings).
    Run Luo, Haonan Zhang, Longze Chen, Ting-En Lin, Xiong Liu, Yuchuan Wu, Min Yang, Minzheng Wang, Pengpeng Zeng, Lianli Gao, Heng Tao Shen, Yunshui Li, Xiaobo Xia, Fei Huang, Jingkuan Song, Yongbin Li
    [Project Page] [Paper] [Code]
  • DEEM: Diffusion Models Serve as the Eyes of Large Language Models for Image Perception. ICLR 2025 Spotlight.
    Run Luo, Yunshui Li, Longze Chen, Wanwei He, Ting-En Lin, Ziqiang Liu, Lei Zhang, Zikai Song, Xiaobo Xia, Tongliang Liu, Min Yang, Binyuan Hui
    [Paper] [Code]
  • DiffusionTrack: Diffusion Model For Multi-Object Tracking. AAAI 2023.
    Run Luo, Zikai Song, Lintao Ma, Jinlin Wei, Wei Yang, Min Yang
    [Paper] [Code]
  • Compact Transformer Tracker with Correlative Masked Modeling. AAAI 2022.
    Zikai Song*, Run Luo*, Junqing Yu, Chen Yi-Ping Phoebe, Wei Yang
    [Paper] [Code]

🎖 Honors and Scholarships

  • 2025.10 National Scholarship.

📖 Education

  • 2023.09 - 2026.06, University of Chinese Academy of Sciences, Master of Computer Technology.
  • 2019.09 - 2023.06, Huazhong University of Science and Technology, Bachelor of Software Engineering.

💻 Internships

  • 2024.04 - 2024.11, Tongyi Lab, Alibaba Group, China.

💬 Services

Reviewer for CVPR23-25, ICCV25, ICLR25-26, ICML25, NeurIPS24-25, AAAI22-26, ACL24-25, EMNLP24-25, ACM MM23-25, etc.
