My research interests lie in Natural Language Processing. In particular, I am working on two directions:
I am also fortunate to have contributed to widely used pretraining/evaluation datasets (MegaMath, OSWorld, TxT360, BigCodeBench). Before the LLM era, I worked on reasoning over structured data (tables, databases, spreadsheets) (FORTAP, HiTab).
My long-term goal is understanding AI to advance AI.
- [Sep 2025] K2-Think is out, an open model trained on fully open SFT+RL data, reaching 90.8% on AIME24.
- [June 2025] Excited to share Guru, studying RL for reasoning beyond math and code.
- [Sep 2024] Started my PhD at UCSD!
Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective
Zhoujun Cheng*, Shibo Hao*, Tianyang Liu*, Fan Zhou, Yutao Xie, Feng Yao, Yuexin Bian, Yonghao Zhuang, Nilabjo Dey, Yuheng Zha, Yi Gu, Kun Zhou, Yuqi Wang, Yuan Li, Richard Fan, Jianshu She, Chengqian Gao, Abulhair Saparov, Haonan Li, Taylor W. Killian, Mikhail Yurochkin, Zhengzhong Liu, Eric P. Xing, Zhiting Hu
arXiv preprint
pdf | website | dataset | code | model
TL;DR: Investigating how RL for LLM reasoning transfers across six domains.
MegaMath: Pushing the Limits of Open Math Corpora
Fan Zhou*, Zengzhi Wang*, Nikhil Ranjan, Zhoujun Cheng, Liping Tang, Guowei He, Zhengzhong Liu, Eric P. Xing
COLM 2025
pdf | dataset | code
TL;DR: A 200B-scale high-quality mathematical mid-training dataset.
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments
Tianbao Xie, Danyang Zhang, Jixuan Chen, Xiaochuan Li, Siheng Zhao, Ruisheng Cao, Toh Jing Hua, Zhoujun Cheng, Dongchan Shin, Fangyu Lei, Yitao Liu, Yiheng Xu, Shuyan Zhou, Silvio Savarese, Caiming Xiong, Victor Zhong, Tao Yu
NeurIPS 2024 Datasets & Benchmarks, used by OpenAI and Anthropic.
pdf | website | code | data | data viewer
TL;DR: A benchmark for computer-use agents on open-ended tasks in real computer environments.
What Are Tools Anyway? A Survey from the Language Model Perspective
Zhiruo Wang, Zhoujun Cheng, Hao Zhu, Daniel Fried, Graham Neubig
COLM 2024
pdf | collection
TL;DR: An attempt to clarify and discuss ambiguities in the literature on LM tool use.
OpenAgents: An Open Platform for Language Agents in the Wild
Tianbao Xie*, Fan Zhou*, Zhoujun Cheng*, Peng Shi*, Luoxuan Weng*, Yitao Liu*, Toh Jing Hua, Junning Zhao, Qian Liu, Che Liu, Leo Z. Liu, Yiheng Xu, Hongjin Su, Dongchan Shin, Caiming Xiong, Tao Yu
COLM 2024, 4k GitHub stars, 7k demo users
pdf | code | demo | docs
TL;DR: An open platform for using, hosting, and building language agents.
Lemur: Harmonizing Natural Language and Code for Language Agents
Yiheng Xu*, Hongjin Su*, Chen Xing*, Boyu Mi, Qian Liu, Weijia Shi, Binyuan Hui, Fan Zhou, Yitao Liu, Tianbao Xie, Zhoujun Cheng, Siheng Zhao, Lingpeng Kong, Bailin Wang, Caiming Xiong, Tao Yu
ICLR 2024 (Spotlight)
pdf | code | checkpoint
TL;DR: A 70B agent model pretrained on balanced code-text corpora.
Binding Language Models in Symbolic Languages
Zhoujun Cheng*, Tianbao Xie*, Peng Shi, Chengzu Li, Rahul Nadkarni, Yushi Hu, Caiming Xiong, Dragomir Radev, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, Tao Yu
ICLR 2023 (Spotlight)
pdf | code | demo
TL;DR: A training-free neural-symbolic framework that maps task inputs to programs interleaving LLM calls with symbolic languages (e.g., SQL, Python).
HiTab: A Hierarchical Table Dataset for Question Answering and Natural Language Generation
Zhoujun Cheng*, Haoyu Dong*, Zhiruo Wang*, Ran Jia, Jiaqi Guo, Yan Gao, Shi Han, Jian-Guang Lou, Dongmei Zhang
ACL 2022
pdf | code | dataset
TL;DR: A hierarchical table dataset for question answering and natural language generation.
FORTAP: Using Formulae for Numerical-Reasoning-Aware Table Pretraining
Zhoujun Cheng*, Haoyu Dong*, Ran Jia, Pengfei Wu, Shi Han, Fan Cheng, Dongmei Zhang
ACL 2022
pdf | code
TL;DR: Leveraging spreadsheet formulas to enhance the numerical reasoning ability of table pretraining models.
Reviewer: NeurIPS, ICLR, COLM, ARR, NAACL, ACL, EMNLP, EACL
Teaching Assistant: Introduction to Programming, Introduction to Machine Learning
National Scholarship, 2018