Publications
Monographs
Towards a Theoretical Foundation of Policy Optimization for Learning Control Policies
B. Hu, K. Zhang, N. Li, M. Mesbahi, M. Fazel, T. Başar
Annual Review of Control, Robotics, and Autonomous Systems, 2023 (Invited & Refereed).
Independent Learning in Stochastic Games
A. Ozdaglar*, M. O. Sayin*, K. Zhang*
International Congress of Mathematicians (ICM), 2022 (Invited).
Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms
K. Zhang, Z. Yang, and T. Başar
Springer Studies in Systems, Decision and Control, Handbook on RL and Control, 2020 (Invited).
Journal Papers
Partially observable multi-agent reinforcement learning with information sharing
Offline reinforcement learning via linear programming with error-bound induced constraints
A. Ozdaglar*, S. Pattathil*, J. Zhang*, and K. Zhang*
Mathematics of Operations Research (MathOR) (under review) (Short version appeared at ICML 2023).
Last-iterate convergence of payoff-based independent learning in zero-sum stochastic games
Z. Chen, K. Zhang, E. Mazumdar, A. Ozdaglar, A. Wierman
Operations Research (OR) (under review) (Short version appeared at NeurIPS 2023).
Policy optimization for H-2 linear control with H-infinity robustness guarantee: Implicit regularization and global convergence
K. Zhang, B. Hu, and T. Başar
SIAM Journal on Control and Optimization (SICON), 2021.
Model-free non-stationary RL: Near-optimal regret and applications in multi-agent RL and inventory control
W. Mao, K. Zhang, R. Zhu, D. Simchi-Levi, and T. Başar
Management Science (MS), 2023.
Model-based multi-agent RL in zero-sum Markov games with near-optimal sample complexity
K. Zhang, S.M. Kakade, T. Başar, and L.F. Yang
Journal of Machine Learning Research (JMLR), 2023 (Short version appeared at NeurIPS 2020 (Spotlight)).
Finite-sample analysis for decentralized batch multi-agent reinforcement learning with networked agents
K. Zhang, Z. Yang, H. Liu, T. Zhang, and T. Başar
IEEE Trans. on Automatic Control (TAC), 2021.
Global convergence of policy gradient methods to (almost) locally optimal policies
K. Zhang, A. Koppel, H. Zhu, and T. Başar
SIAM Journal on Control and Optimization (SICON), 2020.
Distributed learning of average belief over networks using sequential observations
K. Zhang, Y. Liu, J. Liu, M. Liu, and T. Başar
Automatica, 2020.
Selected Conference Papers
Do LLM agents have regret? A case study in online learning and games
C. Park*, X. Liu*, A. Ozdaglar, and K. Zhang
Intl. Conf. on Learning Represent. (ICLR), 2025 (Preliminary version appeared as an Oral at the How Far Are We From AGI Workshop, ICLR 2024).
Provable partially observable reinforcement learning with privileged information
Y. Cai*, X. Liu*, A. Oikonomou*, K. Zhang*
Neural Info. Process. Systems (NeurIPS), 2024.
Multi-player zero-sum Markov games with networked separable interactions
C. Park*, K. Zhang*, and A. Ozdaglar
Neural Info. Process. Systems (NeurIPS), 2023.
The complexity of Markov equilibrium in stochastic games
C. Daskalakis*, N. Golowich*, and K. Zhang*
Conference on Learning Theory (COLT), 2023.
Tackling combinatorial distribution shift: A matrix completion perspective
M. Simchowitz, A. Gupta, and K. Zhang
Conference on Learning Theory (COLT), 2023.
Breaking the curse of multiagents in a large state space: RL in Markov games with independent linear function approximation
Q. Cui, K. Zhang, and S. Du
Conference on Learning Theory (COLT), 2023.
Can direct latent model learning solve linear quadratic Gaussian control?
Y. Tian, K. Zhang, R. Tedrake, and S. Sra
Learning for Dynamics & Control (L4DC) (Oral), 2023.
Independent policy gradient for large-scale Markov potential games: Sharper rates, function approximation, and game-agnostic convergence
D. Ding*, C. Wei*, K. Zhang*, and M. Jovanović
Intl. Conf. on Machine Learning (ICML) (Long Oral), 2022.
Do differentiable simulators give better policy gradients?
H. T. Suh, M. Simchowitz, K. Zhang, and R. Tedrake
Intl. Conf. on Machine Learning (ICML) (Outstanding Paper Award), 2022.
Decentralized Q-Learning in zero-sum Markov games
M. O. Sayin*, K. Zhang*, D. Leslie, T. Başar, and A. Ozdaglar
Neural Info. Process. Systems (NeurIPS), 2021.
Learning safe multi-agent control with decentralized neural barrier certificates
Z. Qin, K. Zhang, Y. Chen, J. Chen, and C. Fan
Intl. Conf. on Learning Represent. (ICLR), 2021.
Natural policy gradient primal-dual method for constrained Markov decision processes
D. Ding, K. Zhang, T. Başar, and M.R. Jovanović
Neural Info. Process. Systems (NeurIPS), 2020.
Policy optimization for H-2 linear control with H-infinity robustness guarantee: Implicit regularization and global convergence
K. Zhang, B. Hu, and T. Başar
Learning for Dynamics & Control (L4DC) (Oral), 2020 (Long version accepted to SICON).
Fully decentralized multi-agent reinforcement learning with networked agents
K. Zhang, Z. Yang, H. Liu, T. Zhang, and T. Başar
Intl. Conf. on Machine Learning (ICML), 2018.