Zhengbo Jiao

🎓 Third-year B.S. Student / Researcher

School of Computer Science and Technology
Shanghai University of Finance and Economics
Shanghai, China

Email: frostlin2005 [at] gmail.com
WeChat: Yuan2you_

Biography

Hello! I am a third-year B.S. student in Computer Science and Technology at Shanghai University of Finance and Economics. I was fortunate to be mentored by Shaobo Wang (Hunyuan Scholar) and advised by Prof. Linfeng Zhang at Shanghai Jiao Tong University's School of Artificial Intelligence, where I worked as a research intern.

Currently, I am a research intern at the AIDATA Team, Alibaba Group, working closely with the Qwen Team. I also collaborate with Prof. Meng Han at ZJU and Prof. Yunpu Ma at LMU.

My research focuses on efficient methods for optimizing agentic AI across its lifecycle (training, inference, and evaluation), spanning both reasoning and action. I work on verifiable, call-free data synthesis across CPT, SFT, and RL, and on efficient synthetic environments. Together, these enable efficient self-evolution in which quality and efficiency are optimized jointly.

I am actively seeking Ph.D. positions for Fall 2027. If you have an opening or would like to discuss a potential collaboration, please feel free to contact me by email. I look forward to hearing from you!

News

[2026] One paper submitted to COLM 2026, good luck!
[2026] Socratic-Geo accepted by CVPR 2026!
[2026] Two papers submitted to ECCV 2026, good luck!
[2026] Three papers submitted to ICML 2026, good luck!
[2026] Two papers submitted to CVPR 2026, good luck!
[2026] One paper submitted to ICLR 2026, good luck!
[2025] Joined AIDATA Team, Alibaba Group as research intern!
[2025] Received National Endeavor Scholarship!
[2025] Joined School of AI, SJTU as research intern!
[2024] Received National Endeavor Scholarship!

Education

Shanghai University of Finance and Economics, Shanghai

B.S. Candidate in Computer Science and Technology

2023 - 2027 (Expected)

Industrial Experience

AIDATA Team, Alibaba Group

Research Intern

Focus: LLM/MLLM Data Construction & Evaluation; Collaborated with Qwen Team

May 2025 - Present

Research Internship

MMLab, The Chinese University of Hong Kong

Research Intern

Advisor: Prof. Xiangyu Yue; Focus: Visual Agentic AI

March 2026 - Present

School of Artificial Intelligence, Shanghai Jiao Tong University

Research Intern

Advisor: Prof. Linfeng Zhang; Focus: Efficient AI

July 2025 - Present

College of Computer Science and Technology, Zhejiang University

Research Intern

Advisors: Prof. Dezhang Kong & Prof. Meng Han; Focus: LLM Reasoning

March 2025 - July 2025

Representative Publications


Socratic-Zero: Bootstrapping Reasoning via Data-Free Agent Co-evolution
Zhengbo Jiao, Shaobo Wang, Zifan Zhang, Yilang Peng, Xu Ze, Boyu Yang, Wei Wang, Hu Wei†, Linfeng Zhang†
arXiv preprint arXiv:2509.24726. Reported by Machine Heart. [arXiv]

Agentic Proposing: Enhancing LLM Reasoning via Compositional Skill Synthesis
Zhengbo Jiao, Shaobo Wang, Zifan Zhang, Xuan Ren, Wei Wang, Bing Zhao, Hu Wei†, Linfeng Zhang†
arXiv preprint arXiv:2602.03279. [arXiv]

Socratic-Geo: Synthetic Data Generation and Geometric Reasoning via Multi-Agent Interaction
Zhengbo Jiao, Shaobo Wang, Zifan Zhang, Wei Wang, Bing Zhao, Hu Wei†, Linfeng Zhang†
Accepted by CVPR 2026. arXiv preprint arXiv:2602.03414. [arXiv]

Policy of Thoughts: Scaling LLM Reasoning via Test-time Policy Evolution
Zhengbo Jiao, Hongyu Xian, Qinglong Wang, Yunpu Ma, Zhebo Wang, Zifan Zhang, Dezhang Kong, Meng Han†
arXiv preprint arXiv:2601.20379. [arXiv]

Credit Where It's Due: Cross-Modality Connectivity Drives Precise RL for MLLM Reasoning
Zhengbo Jiao, Shaobo Wang, Zifan Zhang, Wei Wang, Bing Zhao, Hu Wei†, Linfeng Zhang†
arXiv preprint arXiv:2602.11455. [arXiv]

GPRM: Global Perspective Process Reward Model via Context-Aware Credit Assignment
Zifan Zhang, Zhengbo Jiao*, Shaobo Wang, Wei Wang, Bing Zhao, Cheng fang, Xiaoxiao Xu, Hu Wei, Linfeng Zhang
Blog. [blog]

Honors & Awards

[2025] First-Class People's Scholarship

[2024 & 2025] National Endeavor Scholarship

[2024] Second Prize, Shanghai Mathematical Modeling Competition

[2024] Second Prize, Shanghai Mathematics Competition

[2023] First Prize, Gansu Chinese Mathematical Olympiad (CMO)

Academic Service

Conference Reviewer:
International Conference on Learning Representations (ICLR) Workshop, 2026
Annual Meeting of the Association for Computational Linguistics (ACL), 2026
Annual Meeting of the Association for Computational Linguistics (ACL) Industry Track, 2026
International Conference on Machine Learning (ICML), 2026
European Conference on Computer Vision (ECCV), 2026

