Email: wangchenglong@cse.neu.edu.cn
Homepage: https://wangclnlp.github.io/wangchenglong.github.io/
Biography:
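Chenglong Wang received his Ph.D. from Northeastern University in 2026, advised by Prof. Tong Xiao and Prof. Jingbo Zhu. He is currently an associate professor at the School of Computer Science and Engineering, Northeastern University. His research is in natural language processing, with a focus on alignment techniques for large language models, including reward model optimization, efficient reinforcement learning, reasoning enhancement, and large language model alignment in multilingual and multimodal settings. In recent years he has published more than 20 papers in top international conferences and journals, 13 of them as first author, including 7 CCF-A papers and 1 SCI Zone 1 journal paper; this work has appeared at top international conferences such as NeurIPS, ICML, ACL, CVPR, and AAAI. He was selected for the Doctoral Student Special Program of the China Association for Science and Technology (CAST) Young Elite Scientists Sponsorship Program, has served as an Area Chair for ACL and EMNLP, and is a long-standing reviewer for top international conferences including NeurIPS, AAAI, ACL, and COLING. His honors include EMNLP 2024 Outstanding Reviewer, CCL 2025 Highlight Poster, and CCMT 2025 Highlight Poster. As a core team member in the WMT2021 international machine translation evaluation, he took first place in English-Russian translation and first place in the efficiency task.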
Selected Publications:
2026:
1. SERM: Self-Evolving Relevance Model with Agent-Driven Learning from Massive Query Streams ([https://arxiv.org/pdf/2601.09515](https://arxiv.org/pdf/2601.09515)). Chenglong Wang, Canjia Li, Xingzhao Zhu, Yifu Huo, Huiyu Wang, Weixiong Lin, Yun Yang, Qiaozhi He, Tianhua Zhou, Xiaojia Chang, Jingbo Zhu, Tong Xiao. ACL Findings, 2026
2. MSRL: Scaling Generative Multimodal Reward Modeling via Multi-Stage Reinforcement Learning ([https://arxiv.org/pdf/2603.25108](https://arxiv.org/pdf/2603.25108)). Chenglong Wang, Yifu Huo, Yang Gan, Qiaozhi He, Qi Meng, Bei Li, Yan Wang, Junfu Liu, Tianhua Zhou, Jingbo Zhu, Tong Xiao. CVPR, 2026
3. Probing Preference Representations: A Multi-Dimensional Evaluation and Analysis Method for Reward Models ([https://arxiv.org/pdf/2511.12464](https://arxiv.org/pdf/2511.12464)). Chenglong Wang, Yifu Huo, Yang Gan, Yongyu Mu, Qiaozhi He, Murun Yang, Bei Li, Chunliang Zhang, Tongran Liu, Anxiang Ma, Zhengtao Yu, Jingbo Zhu, Tong Xiao. AAAI, 2026
4. GRAM-R$^2$: Self-Training Generative Foundation Reward Models for Reward Reasoning ([https://arxiv.org/pdf/2509.02492](https://arxiv.org/pdf/2509.02492)). Chenglong Wang, Yongyu Mu, Hang Zhou, Yifu Huo, Ziming Zhu, Jiali Zeng, Murun Yang, Bei Li, Tong Xiao, Xiaoyang Hao, Chunliang Zhang, Fandong Meng, Jingbo Zhu. AAAI, 2026
2025:
1. GRAM: A Generative Foundation Reward Model for Reward Generalization ([https://www.arxiv.org/abs/2506.14175](https://www.arxiv.org/abs/2506.14175)). Chenglong Wang, Yang Gan, Yifu Huo, Yongyu Mu, Qiaozhi He, Murun Yang, Bei Li, Tong Xiao, Chunliang Zhang, Tongran Liu, Jingbo Zhu. ICML, 2025
2. RoVRM: A Robust Visual Reward Model Optimized via Auxiliary Textual Preference Data ([https://www.arxiv.org/abs/2408.12109](https://www.arxiv.org/abs/2408.12109)). Chenglong Wang, Yang Gan, Yifu Huo, Yongyu Mu, Murun Yang, Qiaozhi He, Tong Xiao, Chunliang Zhang, Tongran Liu, Quan Du, Di Yang, Jingbo Zhu. AAAI, 2025
3. Learning Evaluation Models from Large Language Models for Sequence Generation ([https://arxiv.org/abs/2308.04386](https://arxiv.org/abs/2308.04386)). Chenglong Wang, Hang Zhou, Kaiyan Chang, Tongran Liu, Chunliang Zhang, Quan Du, Tong Xiao, Jingbo Zhu. TASLP, 2025
2024:
1. Revealing the Parallel Multilingual Learning within Large Language Models ([https://arxiv.org/abs/2403.09073](https://arxiv.org/abs/2403.09073)). Yongyu Mu, Peinan Feng, Zhiquan Cao, Yuzhang Wu, Bei Li, Chenglong Wang, Tong Xiao, Kai Song, Tongran Liu, Chunliang Zhang, Jingbo Zhu. EMNLP, 2024
2. Prior Constraints-based Reward Model Training for Aligning Large Language Models ([https://arxiv.org/abs/2404.00978](https://arxiv.org/abs/2404.00978)). Hang Zhou, Chenglong Wang, Yimin Hu, Tong Xiao, Chunliang Zhang, Jingbo Zhu. CCL, 2024
3. Hybrid Alignment Training for Large Language Models ([https://arxiv.org/abs/2406.15178](https://arxiv.org/abs/2406.15178)). Chenglong Wang, Hang Zhou, Kaiyan Chang, Bei Li, Yongyu Mu, Tong Xiao, Tongran Liu, Jingbo Zhu. ACL Findings, 2024
4. Efficient Prompting Methods for Large Language Models: A Survey ([https://arxiv.org/abs/2404.01077](https://arxiv.org/abs/2404.01077)). Kaiyan Chang, Songcheng Xu, Chenglong Wang, Yingfeng Luo, Tong Xiao, Jingbo Zhu. arXiv, 2024
5. ESRL: Efficient Sampling-based Reinforcement Learning for Sequence Generation ([https://arxiv.org/pdf/2308.02223.pdf](https://arxiv.org/pdf/2308.02223.pdf)). Chenglong Wang, Hang Zhou, Yimin Hu, Yifu Huo, Bei Li, Tongran Liu, Tong Xiao, Jingbo Zhu. AAAI, 2024
2022:
1. Improved Knowledge Distillation for Pre-trained Language Models via Knowledge Selection ([https://arxiv.org/abs/2302.00444](https://arxiv.org/abs/2302.00444)). Chenglong Wang, Yi Lu, Yongyu Mu, Yimin Hu, Tong Xiao, Jingbo Zhu. EMNLP Findings, 2022
2021:
1. The NiuTrans System for the WMT21 Efficiency Task ([https://arxiv.org/abs/2109.08003](https://arxiv.org/abs/2109.08003)). Chenglong Wang, Chi Hu, Yongyu Mu, Zhongxiang Yan, Siming Wu, Minyi Hu, Hang Cao, Bei Li, Ye Lin, Tong Xiao, Jingbo Zhu. EMNLP Workshop, 2021
2. RankNAS: Efficient Neural Architecture Search by Pairwise Ranking ([https://arxiv.org/abs/2109.07383](https://arxiv.org/abs/2109.07383)). Chi Hu, Chenglong Wang, Xiangnan Ma, Xia Meng, Yinqiao Li, Tong Xiao, Jingbo Zhu, Changliang Li. EMNLP, 2021
2020:
1. The NiuTrans System for WNGT 2020 Efficiency Task ([https://arxiv.org/abs/2109.08008](https://arxiv.org/abs/2109.08008)). Chi Hu, Bei Li, Yinqiao Li, Ye Lin, Yanyang Li, Chenglong Wang, Tong Xiao, Jingbo Zhu. ACL Workshop, 2020
