Jinyi Han’s HomePage

About Me

I am Jinyi Han (韩槿一), formerly known as Haixia Han, a third-year Ph.D. candidate at East China Normal University, Shanghai, China. I am advised by Prof. Yanghua Xiao and Dr. Jiaqing Liang at the Knowledge Works Lab of Fudan University.

My current research focuses on natural language processing, with a particular emphasis on enhancing the deep reasoning and cognitive capabilities of Large Language Models (LLMs). This includes:

  • Self-correction, self-refinement, self-reflection, and self-verification
  • Optimizing reasoning efficiency through test-time scaling
  • Exploring Agentic LLMs with tool usage, planning, and continual self-improvement

CV (Chinese)

Last updated: March 2026

News

  • 2026-02: Three of our papers have been accepted to ICLR 2026. Looking forward to seeing everyone in Rio de Janeiro!
  • 2025-05: Our work CDS: Data Synthesis Method Guided by Cognitive Diagnosis Theory has been accepted to ACL 2025.
  • 2025-04: We achieved autonomous tool learning for LLMs via Reinforcement Learning (RL); this work was covered by jiqizhixin. (Main Contributor)
  • 2025-02: We implemented the GRPO algorithm and released it on GitHub; this work was covered by jiqizhixin. (Main Contributor)
  • 2025-01: Our work Think Thrice Before You Act: Progressive Thought Refinement in Large Language Models has been accepted to ICLR 2025.
  • 2023-12: Our work Small Language Model Can Self-correct has been accepted to AAAI 2024.
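For readers unfamiliar with GRPO (Group Relative Policy Optimization), the core idea is to sample a group of responses per prompt and replace the learned value baseline with the group's own reward statistics. The sketch below is illustrative only, not our released implementation; the function name and the reward values are hypothetical.

```python
import statistics

def grpo_advantages(rewards, eps=1e-8):
    """Group-relative advantage: normalize each sampled response's
    reward by the mean and standard deviation of its own group.

    advantage_i = (r_i - mean(r)) / (std(r) + eps)
    """
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]

# Four sampled responses to the same prompt, scored by a reward model:
advantages = grpo_advantages([1.0, 0.0, 1.0, 0.0])
```

Because the baseline comes from the group itself, no separate value network is needed, which is what makes GRPO attractive for RL fine-tuning of LLMs.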

Education

Ph.D. at East China Normal University

Advisor: Prof. Yanghua Xiao & Dr. Jiaqing Liang

Oct 2023 – Jun 2027

M.S. at Donghua University

Advisor: Prof. Cairong Yan

Oct 2020 – Mar 2023

B.S. at Henan University of Economics and Law

Computer Science

Oct 2016 – Jun 2020

Internship Experience

ByteDance | Jul 2025 – Present

China Transaction & Ads · Multimodal LLM Algorithm Engineer

Jindouyun Talent Program


Publications

[*: Supervisor as first author, student as second author; ❤: corresponding author]

  1. A Stitch in Time Saves Nine: Proactive Self-Refinement for Language Models
    Jinyi Han, Xinyi Wang, Haiquan Zhao, Tingyun Li, Zishang Jiang, Sihang Jiang, Jiaqing Liang, Xin Alex Lin, Weikang Zhou, Zeye Sun, Fei Yu, Yanghua Xiao 2026, ICLR

  2. Your Models Have Thought Enough: Training Large Reasoning Models to Stop Overthinking
    Jinyi Han, Ying Huang, Ying Liao, Haiquan Zhao, Zishang Jiang, Xinyi Wang, Xikun Lu, Guanghao Zhou, Sihang Jiang, Jiaqing Liang, Weikang Zhou, Zeye Sun, Fei Yu, Yanghua Xiao 2026, ICLR

  3. Selective Expert Guidance for Effective and Diverse Exploration in Reinforcement Learning of LLMs
    Zishang Jiang, Jinyi Han, Tingyun Li, Xinyi Wang, Sihang Jiang, Zhaoqian Dai, Ma Shuguang, Fei Yu, Jiaqing Liang, Yanghua Xiao 2026, ICLR

  4. Difficulty Is Not Enough: Curriculum Learning for LLMs Fine-tuning Must Consider Utility
    Zishang Jiang, Jinyi Han, Tingyun Li, Xinyi Wang, Sihang Jiang, Jiaqing Liang, Xiaojun Meng, Jiansheng Wei, Yanghua Xiao 2025, AAAI

  5. Structured Reasoning for Large Language Models
    Jinyi Han, Zixiang Di, Zishang Jiang, Ying Liao, Jiaqing Liang, Jie Wang, Zheming Yang, Zhi Li, Yongqi Wang, Xiaofeng Ji, Yanghua Xiao, Miao Liu 2026, ACL (Under Review, Work Done in ByteDance Internship)

  6. Knowing How Certain It Is: Confidence Estimation Throughout LLM Generation
    Jinyi Han, Tingyun Li, Shisong Chen, Xinyi Wang, Jiaqing Liang, Yanghua Xiao 2026, Artificial Intelligence (Under Review)

  7. Small Language Model Can Self-correct
    Haixia Han, Jiaqing Liang, Jie Shi, Qianyu He, Yanghua Xiao 2024, AAAI

  8. Think Thrice Before You Act: Progressive Thought Refinement in Large Language Models
    Chengyu Du, Jinyi Han, Yizhou Ying, Aili Chen, Qianyu He, Haokun Zhao, Haoran Guo, Sirui Xia, Jiaqing Liang, Zulong Chen, Liangyue Li, Yanghua Xiao 2025, ICLR

  9. CDS: Data Synthesis Method Guided by Cognitive Diagnosis Theory
    Haokun Zhao, Jinyi Han, Jiaqing Liang, Xiaojun Meng, Jiansheng Wei, Yanghua Xiao 2025, ACL

  10. Not All Negative Samples Are Equal: LLMs Learn Better from Plausible Reasoning
    Zixiang Di, Jinyi Han, Shuo Zhang, Ying Liao, Zhi Li, Miao Liu, Xiaofeng Ji, Yongqi Wang, Zheming Yang, Ming Gao, Bingdong Li, Jie Wang 2026, ICML (Under Review, Equal Contribution, Work Done in ByteDance Internship)

  11. From Outcomes to Actions: Leveraging Hindsight for Long-Horizon Language Agent Training
    Zishang Jiang, Tingyun Li, Jinyi Han, Xinyi Wang, Sihang Jiang, Yizhou Ying, Xiaojun Meng, Jiansheng Wei, Jiaqing Liang, Yanghua Xiao 2026, ICML (Under Review)

  12. CEM: A Data-Efficient Method for Large Language Models to Continue Evolving From Mistakes
    Haokun Zhao, Jinyi Han, Jie Shi, Chengyu Du, Jiaqing Liang, Yanghua Xiao 2025, CIKM

  13. Don’t Tell the Answer, Truly Guide the Reasoning During RL Rollouts
    Xinyi Wang, Jinyi Han, Zishang Jiang, Tingyun Li, Jiaqing Liang, Sihang Jiang, Zhaoqian Dai, Ma Shuguang, Fei Yu, Yanghua Xiao 2026, ACL (Under Review)

  14. ADaPT: Token-Level Decoupling for Efficient Large Reasoning Models
    Tingyun Li, Zishang Jiang, Jinyi Han, Xinyi Wang, Sihang Jiang, Han Xia, Zhaoqian Dai, Ma Shuguang, Fei Yu, Jiaqing Liang, Yanghua Xiao 2026, ACL (Under Review)

  15. SED: Self-evaluation decoding enhances large language models for better generation
    Ziqin Luo, Jinyi Han, Haokun Zhao, Guochao Jiang, Chengyu Du, Tingyun Li, Jiaqing Liang, Yanghua Xiao 2024, Preprint

  16. Dynamic Clustering Based Contextual Combinatorial Multi-armed Bandit for Online Recommendation
    Cairong Yan, Haixia Han, Yanting Zhang, Dandan Zhu, Yongquan Wan 2022, Knowledge-Based Systems

  17. CoCoB: Adaptive Collaborative Combinatorial Bandits for Online Recommendation
Cairong Yan, Jinyi Han❤, Jin Ju, Yanting Zhang, Zijian Wang, Xuan Shao 2025, DASFAA

  18. Thompson Sampling with Time-Varying Reward for Contextual Bandits
Cairong Yan, Hualu Xu, Haixia Han, Yanting Zhang, Zijian Wang 2023, DASFAA

Projects

  • Continuous Evolution and Knowledge Update for LLMs via Cognitive Diagnosis | Huawei | Sep 2024 – Feb 2025 | Contributor

  • Code Intelligence Technology Solution Based on Large Language Models | Huawei | Nov 2023 – Nov 2024 | Contributor

  • Data Science in Large Language Models | Huawei Noah’s Ark Lab | Dec 2024 – Present | Contributor

  • Continuous Improvement Technologies for LLMs | ECNU Academic Innovation Promotion Program for Excellent Doctoral Students | Oct 2024 – Present | Principal Investigator

  • Research on Key Technologies for Intelligent Textbook Question Answering Based on Large Language Models | East China Normal University | Dec 2024 – Present | Principal Investigator

  • Training KW-CuteGPT | Knowledge Works Lab | Apr 2023 – Aug 2023 | Contributor

Awards & Honors

  • Dec 2025: Huawei Spark Award (Fast-Slow Thinking Model Training)
  • Oct 2024: Huawei Spark Award (Large Language Model Continuous Evolution & Knowledge Updating via Cognitive Diagnosis)
  • Mar 2023: Outstanding Graduate of Shanghai (5%)
  • Feb 2023: Outstanding Master’s Degree Thesis at Donghua University (3%)
  • Oct 2020 – Oct 2022: First Prize Academic Scholarship (1%)
  • Jul 2021: Outstanding Bachelor’s Degree Thesis (Graduation Design) in Henan Province (3%)