Publications
Monographs
Towards a Theoretical Foundation of Policy Optimization for Learning Control Policies
B. Hu, K. Zhang, N. Li, M. Mesbahi, M. Fazel, T. Başar
Annual Review of Control, Robotics, and Autonomous Systems, 2023 (Invited & Refereed).
Independent Learning in Stochastic Games
A. Ozdaglar*, M. O. Sayin*, K. Zhang*
International Congress of Mathematicians (ICM), 2022 (Invited).
Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms
K. Zhang, Z. Yang, and T. Başar
Springer Studies in Systems, Decision and Control, Handbook on RL and Control, 2020 (Invited).
Journal Papers
Policy optimization for H2 linear control with H∞ robustness guarantee: Implicit regularization and global convergence
K. Zhang, B. Hu, and T. Başar
SIAM Journal on Control and Optimization (SICON), 2021.
Model-free non-stationary RL: Near-optimal regret and applications in multi-agent RL and inventory control
W. Mao, K. Zhang, R. Zhu, D. Simchi-Levi, and T. Başar
Management Science (MS), 2023.
Model-based multi-agent RL in zero-sum Markov games with near-optimal sample complexity
K. Zhang, S. M. Kakade, T. Başar, and L. F. Yang
Journal of Machine Learning Research (JMLR), 2023.
Finite-sample analysis for decentralized batch multi-agent reinforcement learning with networked agents
K. Zhang, Z. Yang, H. Liu, T. Zhang, and T. Başar
IEEE Trans. on Automatic Control (TAC), 2021.
Global convergence of policy gradient methods to (almost) locally optimal policies
K. Zhang, A. Koppel, H. Zhu, and T. Başar
SIAM Journal on Control and Optimization (SICON), 2020.
Distributed learning of average belief over networks using sequential observations
K. Zhang, Y. Liu, J. Liu, M. Liu, and T. Başar
Automatica, 2020.
Projected stochastic primal-dual method for constrained online learning with kernels
A. Koppel*, K. Zhang*, H. Zhu, and T. Başar
IEEE Trans. on Signal Processing (TSP), vol. 67, no. 10, pp. 2528–2542, May 2019.
Selected Conference Papers
Multi-player zero-sum Markov games with networked separable interactions
C. Park*, K. Zhang*, and A. Ozdaglar
Neural Info. Process. Systems (NeurIPS), 2023.
The complexity of Markov equilibrium in stochastic games
C. Daskalakis*, N. Golowich*, and K. Zhang*
Conference on Learning Theory (COLT), 2023.
Tackling combinatorial distribution shift: A matrix completion perspective
M. Simchowitz, A. Gupta, and K. Zhang
Conference on Learning Theory (COLT), 2023.
Breaking the curse of multiagents in a large state space: RL in Markov games with independent linear function approximation
Q. Cui, K. Zhang, and S. Du
Conference on Learning Theory (COLT), 2023.
Partially observable multi-agent RL with (quasi-)efficiency: The blessing of information sharing
Revisiting the linear-programming framework for offline RL with general function approximation
A. Ozdaglar*, S. Pattathil*, J. Zhang*, and K. Zhang*
Intl. Conf. on Machine Learning (ICML), 2023.
Can direct latent model learning solve linear quadratic Gaussian control?
Y. Tian, K. Zhang, R. Tedrake, and S. Sra
Learning for Dynamics & Control (L4DC) (Oral), 2023.
What is a good metric to study generalization of minimax learners?
A. Ozdaglar*, S. Pattathil*, J. Zhang*, and K. Zhang*
Neural Info. Process. Systems (NeurIPS), 2022. (Oral, 4 out of all submissions, at the New Frontiers in Adversarial Machine Learning Workshop, ICML 2022.)
Independent policy gradient for large-scale Markov potential games: Sharper rates, function approximation, and game-agnostic convergence
D. Ding*, C.-Y. Wei*, K. Zhang*, and M. Jovanović
Intl. Conf. on Machine Learning (ICML) (Long Oral), 2022.
Do differentiable simulators give better policy gradients?
H. T. Suh, M. Simchowitz, K. Zhang, and R. Tedrake
Intl. Conf. on Machine Learning (ICML) (Outstanding Paper Award), 2022.
Decentralized Q-learning in zero-sum Markov games
M. O. Sayin*, K. Zhang*, D. Leslie, T. Başar, and A. Ozdaglar
Neural Info. Process. Systems (NeurIPS), 2021.
Near-optimal model-free reinforcement learning in non-stationary episodic MDPs
Z. Qin, K. Zhang, Y. Chen, J. Chen, and C. Fan
Intl. Conf. on Learning Represent. (ICLR), 2021.
Model-based multi-agent RL in zero-sum Markov games with near-optimal sample complexity
K. Zhang, S. M. Kakade, T. Başar, and L. F. Yang
Neural Info. Process. Systems (NeurIPS) (Spotlight), 2020. (Long version accepted to JMLR)
Natural policy gradient primal-dual method for constrained Markov decision processes
D. Ding, K. Zhang, T. Başar, and M. R. Jovanović
Neural Info. Process. Systems (NeurIPS), 2020.
Policy optimization for H2 linear control with H∞ robustness guarantee: Implicit regularization and global convergence
K. Zhang, B. Hu, and T. Başar
Learning for Dynamics & Control (L4DC) (Oral), 2020. (Long version accepted to SICON)
Fully decentralized multi-agent reinforcement learning with networked agents
K. Zhang, Z. Yang, H. Liu, T. Zhang, and T. Başar
Intl. Conf. on Machine Learning (ICML), 2018.
