About Me
I am currently a researcher at Huawei Hong Kong Research Center. I earned my Ph.D. in Data Science and Analytics from the Hong Kong University of Science and Technology (HKUST) in 2025, under the supervision of Prof. Wei Wang. Before my doctoral studies, I received an M.S. in Big Data Technology from HKUST in 2021, advised by Prof. Fangzhen Lin. I completed my undergraduate studies at Shanghai University, receiving a B.S. in Mathematics and Applied Mathematics in 2020 under the guidance of Prof. Qingwen Wang.
🔥We are seeking motivated students to join us as interns! Our team offers access to GPU computing resources and LLM APIs to support hands-on research, with a focus on turning innovative ideas into real-world LLM applications.
(This webpage was last updated on Jan 27, 2026)
Research Interests
- Current Topic:
- Topics during the Ph.D.:
  - Reinforcement Learning of Large Language Models, concentrating on iteratively refining model outputs through reward-based feedback to improve alignment and reasoning (ESO, DPO-BMC).
  - Instruction Tuning of Large Language Models, especially on enhancing (Lion, LTE, WebR) and evaluating (FollowBench) the ability of language models to comprehend and execute complex instructions accurately.
  - Contrastive Learning in NLP, focusing on leveraging contrastive learning to improve embedding quality (PromCSE, AMR-DA) and to enable more nuanced, context-aware language model performance (GOLF).
- Ph.D. Thesis: Towards Efficient and Effective Alignment of Large Language Models
News
- Jan. 2026: 📃Two papers on LLM reinforcement learning have been accepted to ICLR 2026.
- Jan. 2026: 🔥We present SWE-Lego, the state-of-the-art supervised fine-tuning method for software issue resolving. All code, data, and models are now open-sourced. [Project website]
Selected Publications (*: Equal Contribution, ^: Corresponding Author)
From Verifiable Dot to Reward Chain: Harnessing Verifiable Reference-based Rewards for Reinforcement Learning of Open-ended Generation
Yuxin Jiang, Yufei Wang, Qiyuan Zhang, Xingshan Zeng, Liangyou Li, Jierun Chen, Chaofan Tao, Haoli Bai, Lifeng Shang.
ICLR-2026 [pdf] RLVRR
Memory-T1: Reinforcement Learning for Temporal Reasoning in Multi-session Agents
Yiming Du, Baojun Wang, Yifan Xiang, Zhaowei Wang, Wenyu Huang, Boyang Xue, Bin Liang, Xingshan Zeng, Fei Mi, Haoli Bai, Lifeng Shang, Jeff Z. Pan, Yuxin Jiang^, Kam-Fai Wong^.
ICLR-2026 [pdf] Memory-T1
SWE-Lego: Pushing the Limits of Supervised Fine-tuning for Software Issue Resolving
Chaofan Tao*, Jierun Chen*, Yuxin Jiang*, Kaiqi Kou*, Shaowei Wang*, Ruoyu Wang*, Xiaohui Li, Sidi Yang, Yiming Du, Jianbo Dai, Zhiming Mao, Xinyu Wang, Lifeng Shang, Haoli Bai.
Technical Report-2026 [pdf] SWE-Lego
Instruction-Tuning Data Synthesis from Scratch via Web Reconstruction
Yuxin Jiang, Yufei Wang, Chuhan Wu, Xinyi Dai, Yan Xu, Weinan Gan, Yasheng Wang, Xin Jiang, Lifeng Shang, Ruiming Tang, Wei Wang.
ACL Findings-2025 [pdf] [bibtex] WebR
Bridging and Modeling Correlations in Pairwise Data for Direct Preference Optimization
Yuxin Jiang, Bo Huang, Yufei Wang, Xingshan Zeng, Liangyou Li, Yasheng Wang, Xin Jiang, Lifeng Shang, Ruiming Tang, Wei Wang.
ICLR-2025 [pdf] [bibtex] BMC
Learning to Edit: Aligning LLMs with Knowledge Editing
Yuxin Jiang, Yufei Wang, Chuhan Wu, Wanjun Zhong, Xingshan Zeng, Jiahui Gao, Liangyou Li, Xin Jiang, Lifeng Shang, Ruiming Tang, Qun Liu, Wei Wang.
ACL-2024 [pdf] [bibtex] LTE
FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models
Yuxin Jiang, Yufei Wang, Xingshan Zeng, Wanjun Zhong, Liangyou Li, Fei Mi, Lifeng Shang, Xin Jiang, Qun Liu, Wei Wang.
ACL-2024 [pdf] [bibtex] FollowBench
Lion: Adversarial Distillation of Proprietary Large Language Models
Yuxin Jiang, Chunkit Chan*, Mingyang Chen*, Wei Wang.
EMNLP-2023 [pdf] [bibtex] Lion
Global and Local Hierarchy-aware Contrastive Framework for Implicit Discourse Relation Recognition
Yuxin Jiang, Linhan Zhang, Wei Wang.
ACL Findings-2023 [pdf] [bibtex] GOLF-for-IDRR
Improved Universal Sentence Embeddings with Prompt-based Contrastive Learning and Energy-based Learning
Yuxin Jiang, Linhan Zhang, Wei Wang.
EMNLP Findings-2022 [pdf] [bibtex] PromCSE
More publications can be found [HERE].
Education
- The Hong Kong University of Science and Technology, Ph.D. in Individualized Interdisciplinary Program (Data Science and Analytics), Sep. 2021 -- Jul. 2025
- The Hong Kong University of Science and Technology, M.S. in Big Data Technology, Sep. 2020 -- Jul. 2021
- Shanghai University, B.S. in Mathematics and Applied Mathematics, Sep. 2016 -- Jul. 2020
Awards
- [2025] Excellent Intern of Huawei Noah’s Ark Lab (Top 5%)
- [2023-2025] Research Travel Grant Award
- [2024] ACL 2024 Outstanding Paper Award (Top 1%)
- [2023] ICASSP 2023 Top 3% Paper Recognition (Top 3%)
- [2021] School of Engineering Excellent Student Scholarship at HKUST (Top 5%)
- [2020] Outstanding Graduates of Shanghai (Top 1%)
- [2016-2019] Grand Prize Scholarship, Leadership Scholarship, and Excellent Student at Shanghai University (Top 3%)
Talks
- Dec. 2023: EMNLP Conference on Large Language Models and the Future of NLP. Lion: Adversarial Distillation of Proprietary Large Language Models. [slides] [video]
Teaching Assistant
- INFH6780: Career Development for Information Hub Students. (Spring 2024)
- INFH5000: Information Science and Technology: Essentials and Trends. (Fall 2022)
Academic Service
- Conference Reviewer: EMNLP’22, 23, ACL’23, ACL Rolling Review’23, 24, 25, 26, ICLR’25, 26, AAAI’26.
- Conference External Reviewer: DASFAA’21, SIGIR’22, 23, ICDE’23, NeurIPS’23, 24.

