Yifan Zhang

PhD student at Princeton University and Princeton AI Lab Fellow, focusing on Large Language Models and Multimodal Foundation Models, especially Language Modeling and Pretraining, and LLM Reasoning and Reinforcement Learning

About Me

I am a PhD student at Princeton University and a Princeton AI Lab Fellow, working with Prof. Mengdi Wang, Prof. Andrew Yao, and Prof. Quanquan Gu. My research focuses on building scalable and capable large language models (LLMs) and multimodal foundation models. My work explores how to improve LLM reasoning; develop new attention mechanisms, position encodings, and model architectures; and align model behavior with human preferences through general preference models.

I have also been a visiting PhD student at the UCLA AGI Lab and at IIIS, Tsinghua University, and a Top Seed researcher with the Seed LLM (Foundation Model) Team, working on LLM and MLLM pretraining and scaling. Previously, I earned a Master’s degree and PhD candidacy in Computer Science from IIIS at Tsinghua University, working with Prof. Andrew Yao, and a Bachelor of Science in Mathematics and Computer Science from Yuanpei College at Peking University.

I am currently exploring opportunities at frontier AI labs and would be pleased to discuss potential collaborations.

Research Interests

Language Modeling and Pretraining
LLM Reasoning and Reinforcement Learning
Physics of Deep Learning

You can find my publications on Google Scholar.

You can find my blog posts at Yifan's Blog.

Selected Works

Deep Delta Learning

Yifan Zhang, Yifeng Liu, Mengdi Wang, Quanquan Gu

arXiv preprint

[Project Page] [Website]

[GRAPE] Group Representational Position Encoding

Yifan Zhang, Zixiang Chen, Yifeng Liu, Zhen Qin, Huizhuo Yuan, Kangping Xu, Quanquan Gu, Andrew C Yao

arXiv:2512.07805

[Project Page] [Website]

[HLA] Higher-order Linear Attention

Yifan Zhang, Zhen Qin, Quanquan Gu

arXiv:2510.27258

[Project Page] [Website]

[RPG] On the Design of KL-Regularized Policy Gradient Algorithms for LLM Reasoning

Yifan Zhang*, Yifeng Liu*, Huizhuo Yuan, Yang Yuan, Quanquan Gu, Andrew C Yao

Conference on Neural Information Processing Systems (NeurIPS 2025 MATH-AI Workshop); See also Thinking Machines Tinker and DeepSeek V3.2

[Project Page] [Website]

[TPA] Tensor Product Attention Is All You Need

Yifan Zhang*, Yifeng Liu*, Huizhuo Yuan, Zhen Qin, Yang Yuan, Quanquan Gu, Andrew C Yao

Conference on Neural Information Processing Systems (NeurIPS 2025 Spotlight)

[Project Page] [Website]

[GPM] Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment

Yifan Zhang*, Ge Zhang*, Yue Wu*, Kangping Xu, Quanquan Gu

International Conference on Machine Learning (ICML 2025)

[Project Page] [Website]

[AutoMathText] Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts

Yifan Zhang*, Yifan Luo*, Yang Yuan, Andrew C Yao

Findings of the Association for Computational Linguistics (ACL 2025 Findings)

[Project Page] [Website]

[MMIQC] Augmenting Math Word Problems via Iterative Question Composing

Haoxiong Liu*, Yifan Zhang*, Yifan Luo, Andrew C Yao

AAAI Conference on Artificial Intelligence (AAAI 2025)

[Project Page] [Website]

[Proposer-Verifier] Cumulative Reasoning with Large Language Models

Yifan Zhang*, Jingqin Yang*, Yang Yuan, Andrew C Yao

Transactions on Machine Learning Research (TMLR)

[Project Page] [Website]

(* denotes equal contribution)

Recent Publications

View all publications

Tensor Product Attention Is All You Need

Yifan Zhang*, Yifeng Liu*, Huizhuo Yuan, Zhen Qin, Yang Yuan, Quanquan Gu, Andrew C Yao

Conference on Neural Information Processing Systems (NeurIPS 2025 Spotlight)

[Project Page] [Website]

Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts

Yifan Zhang*, Yifan Luo*, Yang Yuan, Andrew C Yao

Findings of the Association for Computational Linguistics (ACL 2025 Findings)

[Project Page] [Website]

Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment

Yifan Zhang*, Ge Zhang*, Yue Wu*, Kangping Xu, Quanquan Gu

International Conference on Machine Learning (ICML 2025)

[Project Page] [Website]

Augmenting Math Word Problems via Iterative Question Composing

Haoxiong Liu*, Yifan Zhang*, Yifan Luo, Andrew C Yao

AAAI Conference on Artificial Intelligence (AAAI 2025)

[Project Page] [Website]

Beyond Squared Error: Exploring Loss Design for Enhanced Training of Generative Flow Networks

Rui Hu*, Yifan Zhang*, Zhuoran Li, Longbo Huang

International Conference on Learning Representations (ICLR 2025 Spotlight)

(* denotes equal contribution)

Recent Workshops

View all workshops

On the Design of KL-Regularized Policy Gradient Algorithms for LLM Reasoning

Yifan Zhang*, Yifeng Liu*, Huizhuo Yuan, Yang Yuan, Quanquan Gu, Andrew C Yao

Conference on Neural Information Processing Systems (NeurIPS 2025 MATH-AI Workshop); See also Thinking Machines Tinker

[Project Page] [Website]

Training and Evaluating Language Models with Template-based Data Generation

Yifan Zhang

International Conference on Learning Representations (ICLR 2025) DATA-FM Workshop

[Project Page] [Website]

Meta Prompting for AI Systems

Yifan Zhang, Yang Yuan, Andrew C Yao

International Conference on Learning Representations (ICLR 2024) BGPT Workshop

[Project Page] [Website]

Recent Preprints & Technical Reports

View all preprints

Deep Delta Learning

Yifan Zhang, Yifeng Liu, Mengdi Wang, Quanquan Gu

arXiv preprint

[Project Page] [Website]

Seed1.8 Model Card: Towards Generalized Real-World Agency

Seed Team

Technical Report, December 17, 2025

[Project Page] [Website]

Group Representational Position Encoding

Yifan Zhang, Zixiang Chen, Yifeng Liu, Zhen Qin, Huizhuo Yuan, Kangping Xu, Quanquan Gu, Andrew C Yao

arXiv:2512.07805

[Project Page] [Website]

Higher-order Linear Attention

Yifan Zhang, Zhen Qin, Quanquan Gu

arXiv:2510.27258

[Project Page] [Website]

Language Server CLI Empowers Language Agents with Process Rewards

Yifan Zhang, et al.

arXiv:2510.22907, see also Claude Code v2.0.74

[Project Page] [Website]

Blog Highlights

Revisiting Variance Reduction in Policy Gradients for LLM Reinforcement Learning

Yifan Zhang, Quanquan Gu

Yifan's Blog, December 27, 2025

How to Train Frontier Models Effectively?

Yifan Zhang

Yifan's Blog, December 21, 2025

Rethinking SWA: Why Short Sliding Window Attention Will Replace ShortConv

Yifan Zhang

Yifan's Blog, December 16, 2025

Matrix Exponential Attention

Yifan Zhang

Yifan's Blog, December 15, 2025

Professional Activities

Teaching

Teaching Assistant, Machine Learning for Yao class, IIIS, Tsinghua University

Academic Services

Conference Reviewer: NeurIPS, ICLR, ICML, COLM, AAAI, AISTATS
Journal Reviewer: ACM TKDD, Neural Computing, Neural Networks