Zhenghao Liu (刘正皓)

Personal Information

Associate Professor

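Name (pinyin): liuzhenghao

Date of birth: 1994-10-22

Email:

Date of employment: 2021-07-12

Affiliation: Dept. of Computer Science and Technology

Position: Associate Professor

Education: Doctoral graduate

Office: Room B233, Information Science Building, Hunnan Campus.

Degree: Doctor of Engineering

Employment status: Active

Primary appointment: Visiting Researcher, Natural Language Processing Lab, Tsinghua University

Other appointment: Deputy Director (temporary post), Planning and Finance Office, Northeastern University

Alma mater: Tsinghua University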

Publications

Publications [Google Scholar]

* indicates equal contribution.

# indicates corresponding author.

    2024

    • Zhipeng Xu, Zhenghao Liu#, Yukun Yan, Zhiyuan Liu, Ge Yu, Chenyan Xiong. Cleaner Pretraining Corpus Curation with Neural Web Scraping. ACL 2024. [pdf][codes].

    • Tianshuo Zhou, Sen Mei, Xinze Li, Zhenghao Liu#, Chenyan Xiong, Zhiyuan Liu, Yu Gu, Ge Yu. MARVEL: Unlocking the Multi-Modal Capability of Dense Retrieval via Visual Module Plugin. ACL 2024. [pdf][codes].

    • Hanbin Wang, Zhenghao Liu#, Shuo Wang, Ganqu Cui, Ning Ding, Zhiyuan Liu, Ge Yu. INTERVENOR: Prompting the Coding Ability of Large Language Models with the Interactive Chain of Repair. ACL 2024: Findings. [pdf][codes].

    • Haoyu Wang, Shuo Wang, Yukun Yan, Xujia Wang, Zhiyu Yang, Yuzhuang Xu, Zhenghao Liu, Liner Yang, Ning Ding, Xu Han, Zhiyuan Liu, Maosong Sun. UltraLink: An Open-Source Knowledge-Enhanced Multilingual Supervised Fine-tuning Dataset. ACL 2024. [pdf][codes].

    • Zhiyu Yang, Zihan Zhou, Shuo Wang, Xin Cong, Xu Han, Yukun Yan, Zhenghao Liu, Zhixing Tan, Pengyuan Liu, Dong Yu, Zhiyuan Liu, Xiaodong Shi, Maosong Sun. MatPlotAgent: Method and Evaluation for LLM-Based Agentic Scientific Data Visualization. ACL 2024: Findings. [pdf][codes].

    • Zhenghao Liu, Zulong Chen*, Moufeng Zhang*, Shaoyang Duan, Hong Wen, Liangyue Li, Nan Li, Yu Gu, Ge Yu. Modeling User Viewing Flow Using Large Language Models for Article Recommendation. WebConf 2024. [pdf].

    • Shi Yu, Chenghao Fan, Chenyan Xiong, David Jin, Zhiyuan Liu, Zhenghao Liu#. Fusion-in-T5: Unifying Variant Signals for Simple and Effective Document Ranking with Attention Fusion. The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (COLING 2024). [pdf][codes].

    • Ruining Chong, Luming Lu, Liner Yang, Jinran Nie, Zhenghao Liu, Shuo Wang, Shuhan Zhou, Yaoxin Li, Erhong Yang. MCTS: A Multi-Reference Chinese Text Simplification Dataset. The 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (COLING 2024). [pdf][codes].

    • Cheng Qian, Chenyan Xiong, Zhenghao Liu, Zhiyuan Liu. Toolink: Linking Toolkit Creation and Using through Chain-of-Solving on Open-Source Model. 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024). [pdf][codes].

    • Yumeng Song, Yu Gu, Tianyi Li, Jianzhong Qi, Zhenghao Liu, Christian S Jensen, Ge Yu. CHGNN: A Semi-Supervised Contrastive Hypergraph Learning Network. IEEE Transactions on Knowledge and Data Engineering (TKDE). [pdf][code].

    • Yuqing Lan, Zhenghao Liu#, Yu Gu, Xiaoyuan Yi, Xiaohua Li, Liner Yang, Ge Yu. Multi-Evidence based Fact Verification via A Confidential Graph Neural Network. IEEE Transactions on Big Data (TBD). [pdf][code].

    2023

  • Zhenghao Liu, Chenyan Xiong, Yuanhuiyi Lv, Zhiyuan Liu, Ge Yu. Universal Multi-Modal Retrieval: Learning A Unified Representation Space for Vision Language Retrieval. The Eleventh International Conference on Learning Representations (ICLR 2023). [pdf][codes].

  • Zhenghao Liu*#, Sen Mei, Chenyan Xiong, Xiaohua Li, Shi Yu, Zhiyuan Liu, Yu Gu, Ge Yu. Text Matching Improves Sequential Recommendation by Reducing Popularity Biases. The 32nd ACM International Conference on Information and Knowledge Management (CIKM 2023). [pdf][codes].

  • Shi Yu, Zhenghao Liu, Chenyan Xiong, Zhiyuan Liu. OpenMatch-v2: An All-in-one Multi-Modality PLM-based Information Retrieval Toolkit. The 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2023). [pdf][codes].

  • Xinze Li, Zhenghao Liu#, Chenyan Xiong, Shi Yu, Yu Gu, Zhiyuan Liu, Ge Yu. Structure-Aware Language Model Pretraining Improves Dense Retrieval on Structured Data. Findings of the Association for Computational Linguistics: ACL 2023 (ACL 2023). [pdf][codes].

  • Ruining Chong, Cunliang Kong, Liu Wu, Zhenghao Liu, Ziye Jin, Liner Yang, Yange Fan, Hanghang Fan, Erhong Yang. Leveraging Prefix Transfer for Multi-Intent Text Revision. The 61st Annual Meeting of the Association for Computational Linguistics (ACL 2023). [pdf].

    2022

  • Zhenghao Liu, Han Zhang, Chenyan Xiong, Zhiyuan Liu, Yu Gu, Xiaohua Li. Dimension Reduction for Efficient Dense Retrieval via Conditional Autoencoder. The 2022 Conference on Empirical Methods in Natural Language Processing (EMNLP 2022). [pdf][codes].

  • Xiaomeng Hu, Shi Yu, Chenyan Xiong, Zhenghao Liu#, Zhiyuan Liu, Ge Yu. P3 Ranker: Mitigating the Gaps between Pre-training and Ranking Fine-tuning with Prompt-based Learning and Pre-finetuning. The 45th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2022). [pdf][codes].

    2021

  • Zhenghao Liu, Xiaoyuan Yi, Maosong Sun, Liner Yang, Tat-Seng Chua. Neural Quality Estimation with Multiple Hypotheses for Grammatical Error Correction. The 2021 Conference of the North American Chapter of the Association for Computational Linguistics - Human Language Technologies (NAACL-HLT 2021). [pdf][codes].

  • Zhenghao Liu*, Kaitao Zhang*, Chenyan Xiong, Zhiyuan Liu, Maosong Sun. OpenMatch: An Open Source Library for Neu-IR Research. The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2021). [pdf][codes].

  • Shi Yu*, Zhenghao Liu*, Chenyan Xiong, Tao Feng, Zhiyuan Liu. Few-Shot Conversational Dense Retrieval. The 44th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2021). [pdf][codes].

  • Yizhi Li*, Zhenghao Liu*, Chenyan Xiong, Zhiyuan Liu. More Robust Dense Retrieval with Contrastive Dual Learning. The 2021 ACM SIGIR International Conference on the Theory of Information Retrieval (ICTIR 2021). [pdf][codes].

  • Si Sun*, Zhenghao Liu*, Chenyan Xiong, Zhiyuan Liu, Jie Bao. Capturing Global Informativeness in Open Domain Keyphrase Extraction. The CCF Conference on Natural Language Processing and Chinese Computing (NLPCC 2021). [pdf][codes].

  • Si Sun, Yingzhuo Qian, Zhenghao Liu, Chenyan Xiong, Kaitao Zhang, Jie Bao, Zhiyuan Liu, Paul Bennett. Few-Shot Text Ranking with Meta Adapted Synthetic Weak Supervision. The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (ACL-IJCNLP 2021). [pdf][codes].

  • Huiyuan Xie, Zhenghao Liu, Chenyan Xiong, Zhiyuan Liu, Ann Copestake. TIAGE: A Benchmark for Topic-Shift Aware Dialog Modeling. Findings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021). [pdf][codes].

    2020

  • Zhenghao Liu, Chenyan Xiong, Maosong Sun, Zhiyuan Liu. Fine-grained Fact Verification with Kernel Graph Attention Network. The 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020). [pdf][codes].

  • Zhenghao Liu, Chenyan Xiong, Zhuyun Dai, Si Sun, Maosong Sun, Zhiyuan Liu. Adapting Open Domain Fact Extraction and Verification to COVID-FACT through In-Domain Language Modeling. Findings of the Association for Computational Linguistics: EMNLP 2020 (EMNLP 2020). [pdf][codes].

  • Houyu Zhang*, Zhenghao Liu*, Chenyan Xiong, Zhiyuan Liu. Grounded Conversation Generation as Guided Traverses in Commonsense Knowledge Graphs. The 58th Annual Meeting of the Association for Computational Linguistics (ACL 2020). [pdf][codes].

  • Chenyan Xiong*, Zhenghao Liu*, Si Sun*, Zhuyun Dai*, Kaitao Zhang*, Shi Yu*, Zhiyuan Liu, Hoifung Poon, Jianfeng Gao, Paul Bennett. CMT in TREC-COVID Round 2: Mitigating the Generalization Gaps from Web to Special Domain Search. [pdf][codes].

  • Xiaoyuan Yi, Zhenghao Liu, Wenhao Li, Maosong Sun. Text Style Transfer via Learning Style Instance Supported Latent Space. The 29th International Joint Conference on Artificial Intelligence (IJCAI 2020). [pdf].

  • Kaitao Zhang, Chenyan Xiong, Zhenghao Liu, Zhiyuan Liu. Selective Weak Supervision for Neural Information Retrieval. The Web Conference 2020 (WebConf 2020). [pdf][codes].

  • Deming Ye, Yankai Lin, Jiaju Du, Zhenghao Liu, Peng Li, Maosong Sun, Zhiyuan Liu. Coreferential Reasoning Learning for Language Representation. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP 2020). [pdf][codes].

    2019

  • Zhenghao Liu, Chenyan Xiong, Maosong Sun, Zhiyuan Liu. Explore Entity Embedding Effectiveness in Entity Retrieval. The 18th China National Conference on Computational Linguistics (CCL 2019). [pdf][codes].

  • Yifan Qiao, Chenyan Xiong, Zhenghao Liu, Zhiyuan Liu. Understanding the Behaviors of BERT in Ranking. arXiv preprint arXiv:1904.07531. [pdf].

  • Yuan Yao, Deming Ye, Peng Li, Xu Han, Yankai Lin, Zhenghao Liu, Zhiyuan Liu, Lixin Huang, Jie Zhou, Maosong Sun. DocRED: A Large-Scale Document-Level Relation Extraction Dataset. The 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019). [pdf][codes].

    2018

  • Zhenghao Liu, Chenyan Xiong, Maosong Sun, Zhiyuan Liu. Entity-Duet Neural Ranking: Understanding the Role of Knowledge Graph Semantics in Neural Information Retrieval. The 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018). [pdf][codes].

    2017

  • Liner Yang, Maosong Sun, Jiacheng Zhang, Zhenghao Liu, Huanbo Luan, Yang Liu. Neural Parse Combination. Journal of Computer Science and Technology, 2017. [pdf].