Yifan Zhang

About Me

I am a PhD candidate at Princeton University and a Princeton AI Lab Fellow, working with Prof. Mengdi Wang, Prof. Andy Yao, and Prof. Quanquan Gu. My research focuses on building scalable and capable large language models (LLMs) and multimodal foundation models. My work explores methods to improve LLM reasoning via reinforcement learning, data curation and algorithms for foundation models, and new attention mechanisms, positional encodings, and model architectures. Previously, I was a visiting PhD student at the University of California, Los Angeles.

Research Interests

LLM Reasoning and Reinforcement Learning
Language Modeling and Pretraining

You can find my publications on Google Scholar.

You can find my blog posts at Yifan's Blog.

Selected Works

FlashSampling: Fast and Memory-Efficient Exact Sampling

Tomas Ruiz*, Zhen Qin*, Yifan Zhang†, Xuyang Shen, Yiran Zhong, Mengdi Wang†

Preprint

[Project Page] [Website]

Deep Delta Learning

Yifan Zhang, Yifeng Liu, Mengdi Wang, Quanquan Gu

arXiv:2601.00417

[Project Page] [Website]

[GRAPE] Group Representational Position Encoding

Yifan Zhang, Zixiang Chen, Yifeng Liu, Zhen Qin, Huizhuo Yuan, Kangping Xu, Yang Yuan, Quanquan Gu, Andrew C Yao

International Conference on Learning Representations (ICLR 2026)

[Project Page] [Website]

[RPG] On the Design of KL-Regularized Policy Gradient Algorithms for LLM Reasoning

Yifan Zhang*, Yifeng Liu*, Huizhuo Yuan, Yang Yuan, Quanquan Gu, Andrew C Yao

International Conference on Learning Representations (ICLR 2026); see also Thinking Machines Tinker

[Project Page] [Website]

[TPA] Tensor Product Attention Is All You Need

Yifan Zhang*, Yifeng Liu*, Huizhuo Yuan, Zhen Qin, Yang Yuan, Quanquan Gu, Andrew C Yao

Conference on Neural Information Processing Systems (NeurIPS 2025 Spotlight)

[Project Page] [Website]

[GPM] Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment

Yifan Zhang*, Ge Zhang*, Yue Wu*, Kangping Xu, Quanquan Gu

International Conference on Machine Learning (ICML 2025)

[Project Page] [Website]

[AutoMathText] Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts

Yifan Zhang*, Yifan Luo*, Yang Yuan, Andrew C Yao

Findings of the Association for Computational Linguistics (ACL 2025 Findings)

[Project Page] [Website]

[Proposer-Verifier] Cumulative Reasoning with Large Language Models

Yifan Zhang*, Jingqin Yang*, Yang Yuan, Andrew C Yao

Transactions on Machine Learning Research (TMLR)

[Project Page] [Website]

(* denotes equal contribution)

Recent Publications

Group Representational Position Encoding

Yifan Zhang, Zixiang Chen, Yifeng Liu, Zhen Qin, Huizhuo Yuan, Kangping Xu, Yang Yuan, Quanquan Gu, Andrew C Yao

International Conference on Learning Representations (ICLR 2026)

[Project Page] [Website]

On the Design of KL-Regularized Policy Gradient Algorithms for LLM Reasoning

Yifan Zhang*, Yifeng Liu*, Huizhuo Yuan, Yang Yuan, Quanquan Gu, Andrew C Yao

International Conference on Learning Representations (ICLR 2026); see also Thinking Machines Tinker and DeepSeek V3.2

[Project Page] [Website]

Tensor Product Attention Is All You Need

Yifan Zhang*, Yifeng Liu*, Huizhuo Yuan, Zhen Qin, Yang Yuan, Quanquan Gu, Andrew C Yao

Conference on Neural Information Processing Systems (NeurIPS 2025 Spotlight)

[Project Page] [Website]

Autonomous Data Selection with Zero-shot Generative Classifiers for Mathematical Texts

Yifan Zhang*, Yifan Luo*, Yang Yuan, Andrew C Yao

Findings of the Association for Computational Linguistics (ACL 2025 Findings)

[Project Page] [Website]

Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment

Yifan Zhang*, Ge Zhang*, Yue Wu*, Kangping Xu, Quanquan Gu

International Conference on Machine Learning (ICML 2025)

[Project Page] [Website]

[MMIQC] Augmenting Math Word Problems via Iterative Question Composing

Haoxiong Liu*, Yifan Zhang*, Yifan Luo, Andrew C Yao

AAAI Conference on Artificial Intelligence (AAAI 2025)

[Project Page] [Website]

Beyond Squared Error: Exploring Loss Design for Enhanced Training of Generative Flow Networks

Rui Hu*, Yifan Zhang*, Zhuoran Li, Longbo Huang

International Conference on Learning Representations (ICLR 2025 Spotlight)

(* denotes equal contribution)

Blog Highlights

ShortSWA Is the Next-Generation N-gram Embedding

Yifan Zhang

Yifan's Blog, January 12, 2026

Revisiting Variance Reduction in Policy Gradients for LLM Reinforcement Learning

Yifan Zhang, Quanquan Gu

Yifan's Blog, December 27, 2025

Rethinking SWA: Why Short Sliding Window Attention Will Replace ShortConv

Yifan Zhang

Yifan's Blog, December 16, 2025

Matrix Exponential Attention

Yifan Zhang

Yifan's Blog, December 15, 2025

Professional Activities

Teaching

Teaching Assistant, Machine Learning for the Yao Class, IIIS, Tsinghua University

Academic Services

Conference Reviewer: NeurIPS, ICLR, ICML, COLM, AAAI, AISTATS
Journal Reviewer: ACM TKDD, Neural Computing, Neural Networks