Jinyi Han’s Homepage

About me

I am Jinyi Han (韩槿一), formerly known as Haixia Han, a second-year Ph.D. candidate at East China Normal University, Shanghai, China. I am advised by Prof. Yanghua Xiao and Dr. Jiaqing Liang at the Knowledge Works Lab of Fudan University. My current research focuses on natural language processing, with an emphasis on the reasoning and deep-thinking capabilities of Large Language Models (LLMs), including self-correction, self-refinement, self-reflection, and self-verification.

Jinyi Han’s CV (Chinese)

News

2025-05: Our work CDS: Data Synthesis Method Guided by Cognitive Diagnosis Theory has been accepted to ACL 2025.
2025-04: We have achieved autonomous tool learning for LLMs based on Reinforcement Learning (RL). Our work has been reported in jiqizhixin. [Main Contributor]
2025-02: We have implemented the GRPO algorithm and released it on GitHub. Our work has been reported in jiqizhixin. [Main Contributor]
2025-01: Our work Think Thrice Before You Act: Progressive Thought Refinement in Large Language Models has been accepted to ICLR 2025.
2023-12: Our work Small Language Model Can Self-correct has been accepted to AAAI 2024.

Education

Ph.D. at East China Normal University, Oct 2023 – Jun 2027 (expected)
M.S. at Donghua University, Oct 2020 – Mar 2023
B.S. at Henan University of Economics and Law, Oct 2016 – Jun 2020

Publications

[*: Supervisor as first author, student as second author; ❤: corresponding author]

Small Language Model Can Self-correct. Haixia Han, Jiaqing Liang, Jie Shi, Qianyu He, Yanghua Xiao. 2024, AAAI.
Think Thrice Before You Act: Progressive Thought Refinement in Large Language Models. Chengyu Du, Jinyi Han, Yizhou Ying, Aili Chen, Qianyu He, Haokun Zhao, Haoran Guo, Sirui Xia, Jiaqing Liang, Zulong Chen, Liangyue Li, Yanghua Xiao. 2025, ICLR.
CDS: Data Synthesis Method Guided by Cognitive Diagnosis Theory. Haokun Zhao, Jinyi Han, Jiaqing Liang, Xiaojun Meng, Jiansheng Wei, Yanghua Xiao. 2025, ACL.
Dynamic clustering based contextual combinatorial multi-armed bandit for online recommendation. Cairong Yan, Haixia Han^*, Yanting Zhang, Dandan Zhu, Yongquan Wan. 2022, Knowledge-Based Systems.
CoCoB: Adaptive Collaborative Combinatorial Bandits for Online Recommendation. Cairong Yan, Jinyi Han^❤, Jin Ju, Yanting Zhang, Zijian Wang, Xuan Shao. 2025, DASFAA.
Thompson Sampling with Time-Varying Reward for Contextual Bandits. Cairong Yan, Hualu Xu, Haixia Han, Yanting Zhang, Zijian Wang. 2023, DASFAA.

Projects

Continuous Evolution and Knowledge Update for LLMs via Cognitive Diagnosis
Huawei | Sep 2024 – Feb 2025 | Contributor
Code Intelligence Technology Solution Based on Large Language Models
Huawei | Nov 2023 – Nov 2024 | Contributor
Data Science in Large Language Models
Huawei Noah’s Ark Lab | Dec 2024 – Present | Contributor
Continuous Improvement Technologies for LLMs
ECNU Academic Innovation Promotion Program for Excellent Doctoral Students | Oct 2024 – Present | Principal Investigator
Research on Key Technologies for Intelligent Textbook Question Answering Based on Large Language Models
East China Normal University | Dec 2024 – Present | Principal Investigator
Training KW-CuteGPT
Knowledge Works Lab | Apr 2023 – Aug 2023 | Contributor

Awards

Feb 2023: Outstanding Master’s Degree Thesis at Donghua University (10%)
Mar 2023: Outstanding Graduate of Shanghai (1%)
Oct 2020 – Oct 2022: Academic Scholarship First Prize (1%)
Jul 2021: Outstanding Bachelor’s Degree Thesis (Graduation Design) in Henan Province (1%)