Jinyi Han’s HomePage

About Me

I am Jinyi Han (韩槿一), formerly known as Haixia Han, a third-year Ph.D. candidate at East China Normal University, Shanghai, China. I am advised by Prof. Yanghua Xiao and Dr. Jiaqing Liang at the Knowledge Works Lab of Fudan University.

My current research focuses on Large Language Models (LLMs). Key research directions include:

  • Language Agents: Building agents with autonomous decision-making, continual learning, and long-horizon task execution capabilities, including self-evolution, skill distillation & reuse, agentic RL, and agent evaluation
  • Reasoning Models: Enhancing complex reasoning and tool-use capabilities of LLMs, as well as reasoning efficiency optimization
  • Post-Training: Exploring continual self-evolution, curriculum learning strategies, and reinforcement learning optimization to improve model performance

CV (Chinese)

PS: Updated as of July 2026

News

  • 2026-05: Joined Tencent TEG Hunyuan as a summer intern under the Qingyun Talent Program, working on LLM Agent evaluation.
  • 2026-05: Two papers accepted at ACL 2026 and one paper accepted at ICML 2026!
  • 2026-04: GenericAgent — a self-evolving LLM Agent system — has been open-sourced with 13K+ GitHub Stars and 4K+ community users! [Technical Report]
  • 2026-04: Our technical report on GenericAgent is now available on arXiv, detailing our token-efficient self-evolving LLM Agent framework.
  • 2026-02: Three of our papers have been accepted at ICLR 2026. Looking forward to seeing everyone in Rio de Janeiro!
  • 2025-05: Our work CDS: Data Synthesis Method Guided by Cognitive Diagnosis Theory has been accepted at ACL 2025.
  • 2025-04: We have achieved autonomous tool learning for LLMs based on Reinforcement Learning (RL). Our work has been reported in jiqizhixin. Main Contributor
  • 2025-02: We have implemented the GRPO algorithm and released it on GitHub. Our work has been reported in jiqizhixin. Main Contributor
  • 2025-01: Our work Think Thrice Before You Act: Progressive Thought Refinement in Large Language Models has been accepted at ICLR 2025.

Selected Publications

    Language Agents

    GenericAgent: A Token-Efficient Self-Evolving LLM Agent via Contextual Information Density Maximization
    Jiaqing Liang, Jinyi Han, Weijia Li, Xinyi Wang, Zhoujia Zhang, Zishang Jiang, et al.
    Preprint, 2026 · 13K+ Stars · Paper
    From Instruction Following to Skill Following: A Process-Oriented Benchmark for LLM Agents
    Jinyi Han, Yuanjian Xu, Ying Liao, Zhichao Hu, Yanghua Xiao
    Under Review, 2026
    From Outcomes to Actions: Leveraging Hindsight for Long-Horizon Language Agent Training
    Zishang Jiang, Tingyun Li, Jinyi Han, Xinyi Wang, Sihang Jiang, Yizhou Ying, Xiaojun Meng, Jiansheng Wei, Jiaqing Liang, Yanghua Xiao
    ICML 2026
    SOPFlow: Compiling Standard Operating Procedures into Repairable Executable Workflows
    Ying Liao, Ying Huang, Jinyi Han, Siyuan He, Jiaqing Liang, Yanghua Xiao
    ACL 2026

    Reasoning Models

    A Stitch in Time Saves Nine: Proactive Self-Refinement for Language Models
    Jinyi Han, Xinyi Wang, Haiquan Zhao, Tingyun Li, Zishang Jiang, Sihang Jiang, Jiaqing Liang, Xin Lin, Weikang Zhou, Zeye Sun, Fei Yu, Yanghua Xiao
    ICLR 2026
    Your Models Have Thought Enough: Training Large Reasoning Models to Stop Overthinking
    Jinyi Han, Ying Huang, Ying Liao, Haiquan Zhao, Zishang Jiang, Xinyi Wang, Xikun Lu, Guanghao Zhou, Sihang Jiang, Jiaqing Liang, Weikang Zhou, Zeye Sun, Fei Yu, Yanghua Xiao
    ICLR 2026
    Structured Reasoning for Large Language Models
    Jinyi Han, Zixiang Di, Zishang Jiang, Ying Liao, Jiaqing Liang, Jie Wang, Zheming Yang, Zhi Li, Yongqi Wang, Xiaofeng Ji, Yanghua Xiao, Miao Liu
    Under Review, 2026 · Work done at ByteDance
    Small Language Model Can Self-correct
    Haixia Han, Jiaqing Liang, Jie Shi, Qianyu He, Yanghua Xiao
    AAAI 2024
    Think Thrice Before You Act: Progressive Thought Refinement in Large Language Models
    Chengyu Du, Jinyi Han, Yizhou Ying, Aili Chen, Qianyu He, Haokun Zhao, Haoran Guo, Sirui Xia, Jiaqing Liang, Zulong Chen, Liangyue Li, Yanghua Xiao
    ICLR 2025
    ADaPT: Token-Level Decoupling for Efficient Large Reasoning Models
    Tingyun Li, Zishang Jiang, Jinyi Han, Xinyi Wang, Sihang Jiang, Han Xia, Zhaoqian Dai, Ma Shuguang, Fei Yu, Jiaqing Liang, Yanghua Xiao
    ACL 2026

    Post-Training

    Selective Expert Guidance for Effective and Diverse Exploration in Reinforcement Learning of LLMs
    Zishang Jiang, Jinyi Han, Tingyun Li, Xinyi Wang, Sihang Jiang, Zhaoqian Dai, Ma Shuguang, Fei Yu, Jiaqing Liang, Yanghua Xiao
    ICLR 2026
    CDS: Data Synthesis Method Guided by Cognitive Diagnosis Theory
    Haokun Zhao, Jinyi Han, Jiaqing Liang, Xiaojun Meng, Jiansheng Wei, Yanghua Xiao
    ACL 2025
    Difficulty Is Not Enough: Curriculum Learning for LLMs Fine-tuning Must Consider Utility
    Zishang Jiang, Jinyi Han, Tingyun Li, Xinyi Wang, Sihang Jiang, Jiaqing Liang, Xiaojun Meng, Jiansheng Wei, Yanghua Xiao
    AAAI 2025 · Oral
    Not All Negative Samples Are Equal: LLMs Learn Better from Plausible Reasoning
    Zixiang Di, Jinyi Han, Shuo Zhang, Ying Liao, Zhi Li, Miao Liu, Xiaofeng Ji, Yongqi Wang, Zheming Yang, Ming Gao, Bingdong Li, Jie Wang
    Under Review, 2026 · Equal Contribution · Work done at ByteDance
    Don't Tell the Answer, Truly Guide the Reasoning During RL Rollouts
    Xinyi Wang, Jinyi Han, Zishang Jiang, Tingyun Li, Jiaqing Liang, Sihang Jiang, Zhaoqian Dai, Ma Shuguang, Fei Yu, Yanghua Xiao
    ACL 2026
    Knowing How Certain It Is: Confidence Estimation Throughout LLM Generation
    Jinyi Han, Tingyun Li, Shisong Chen, Xinyi Wang, Jiaqing Liang, Yanghua Xiao
    Under Review, 2026