Publications

See Google Scholar for a more complete and up-to-date list.

Monographs

Towards a Theoretical Foundation of Policy Optimization for Learning Control Policies

B. Hu, K. Zhang, N. Li, M. Mesbahi, M. Fazel, T. Başar
Annual Review of Control, Robotics, and Autonomous Systems, 2023 (Invited & Refereed).

Independent Learning in Stochastic Games

A. Ozdaglar*, M. O. Sayin*, K. Zhang*
International Congress of Mathematicians (ICM), 2022 (Invited).

Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms

K. Zhang, Z. Yang, and T. Başar
Springer Studies in Systems, Decision and Control, Handbook on RL and Control, 2020 (Invited).

Journal Papers and Preprints

Partially observable multi-agent reinforcement learning with information sharing

X. Liu and K. Zhang
Under Review (Short version appeared at ICML 2023).

Offline reinforcement learning via linear programming with error-bound induced constraints

A. Ozdaglar*, S. Pattathil*, J. Zhang*, and K. Zhang*
Under Review (Short version appeared at ICML 2023).

Last-iterate convergence of payoff-based independent learning in zero-sum stochastic games

Z. Chen, K. Zhang, E. Mazumdar, A. Ozdaglar, A. Wierman
Under Review (Short version appeared at NeurIPS 2023).

Convergence and sample complexity of natural policy gradient primal-dual methods for constrained MDPs

D. Ding, K. Zhang, J. Duan, T. Başar, and M.R. Jovanović
Journal of Machine Learning Research (JMLR), 2025 (Short version appeared at NeurIPS 2020).

Model-free non-stationary RL: Near-optimal regret and applications in multi-agent RL and inventory control

W. Mao, K. Zhang, R. Zhu, D. Simchi-Levi, and T. Başar
Management Science (MS), 2023.

Policy optimization for H-2 linear control with H-infinity robustness guarantee: Implicit regularization and global convergence

K. Zhang, B. Hu, and T. Başar
SIAM Journal on Control and Optimization (SICON), 2021 (Short version appeared at L4DC 2020 (Oral)).

Model-based multi-agent RL in zero-sum Markov games with near-optimal sample complexity

K. Zhang, S.M. Kakade, T. Başar, and L.F. Yang
Journal of Machine Learning Research (JMLR), 2023 (Short version appeared at NeurIPS 2020 (Spotlight)).

Finite-sample analysis for decentralized batch multi-agent reinforcement learning with networked agents

K. Zhang, Z. Yang, H. Liu, T. Zhang, and T. Başar
IEEE Trans. on Automatic Control (TAC), 2021.

Global convergence of policy gradient methods to (almost) locally optimal policies

K. Zhang, A. Koppel, H. Zhu, and T. Başar
SIAM Journal on Control and Optimization (SICON), 2020.

Distributed learning of average belief over networks using sequential observations

K. Zhang, Y. Liu, J. Liu, M. Liu, and T. Başar
Automatica, 2020.

Conference Papers

Do LLM agents have regret? A case study in online learning and games

C. Park*, X. Liu*, A. Ozdaglar, and K. Zhang
Intl. Conf. on Learning Represent. (ICLR), 2025 (Preliminary version appeared as an Oral at the How Far Are We From AGI Workshop, ICLR 2024).

Provable partially observable reinforcement learning with privileged information

Y. Cai*, X. Liu*, A. Oikonomou*, K. Zhang*
Neural Info. Process. Systems (NeurIPS), 2024.

Two-timescale Q-learning with function approximation in zero-sum stochastic games

Z. Chen, K. Zhang, E. Mazumdar, A. Ozdaglar, A. Wierman
ACM Conf. on Econ. and Comput. (EC), 2024.

Multi-player zero-sum Markov games with networked separable interactions

C. Park*, K. Zhang*, and A. Ozdaglar
Neural Info. Process. Systems (NeurIPS), 2023.

The complexity of Markov equilibrium in stochastic games

C. Daskalakis*, N. Golowich*, and K. Zhang*
Conference on Learning Theory (COLT), 2023.

Tackling combinatorial distribution shift: A matrix completion perspective

M. Simchowitz, A. Gupta, and K. Zhang
Conference on Learning Theory (COLT), 2023.

Breaking the curse of multiagents in a large state space: RL in Markov games with independent linear function approximation

Q. Cui, K. Zhang, and S. Du
Conference on Learning Theory (COLT), 2023.

Can direct latent model learning solve linear quadratic Gaussian control?

Y. Tian, K. Zhang, R. Tedrake, and S. Sra
Learning for Dynamics & Control (L4DC) (Oral), 2023.

Independent policy gradient for large-scale Markov potential games: Sharper rates, function approximation, and game-agnostic convergence

D. Ding*, C. Wei*, K. Zhang*, and M. Jovanović
Intl. Conf. on Machine Learning (ICML) (Long Oral), 2022.

Do differentiable simulators give better policy gradients?

H. T. Suh, M. Simchowitz, K. Zhang, and R. Tedrake
Intl. Conf. on Machine Learning (ICML) (Outstanding Paper Award), 2022.

Decentralized Q-Learning in zero-sum Markov games

M. O. Sayin*, K. Zhang*, D. Leslie, T. Başar, and A. Ozdaglar
Neural Info. Process. Systems (NeurIPS), 2021.

Derivative-free policy optimization for linear risk-sensitive and robust control design: Implicit regularization and sample complexity

K. Zhang, X. Zhang, B. Hu, and T. Başar
Neural Info. Process. Systems (NeurIPS), 2021.

Model-based multi-agent RL in zero-sum Markov games with near-optimal sample complexity

K. Zhang, S.M. Kakade, T. Başar, and L.F. Yang
Neural Info. Process. Systems (NeurIPS) (Spotlight), 2020.

Policy optimization for H-2 linear control with H-infinity robustness guarantee: Implicit regularization and global convergence

K. Zhang, B. Hu, and T. Başar
Learning for Dynamics & Control (L4DC) (Oral), 2020.

Fully decentralized multi-agent reinforcement learning with networked agents

K. Zhang, Z. Yang, H. Liu, T. Zhang, and T. Başar
Intl. Conf. on Machine Learning (ICML), 2018.