**Title:** Overcoming the Long Horizon Barrier for Sample-Efficient Reinforcement Learning with Latent Low-Rank Structure

**Speaker: **Christina Lee Yu**Date and Time: **02/24/2022 4:10PM ET**Location:** Phillips 233 and Zoom

**Abstract**: The practicality of reinforcement learning algorithms has been limited due to poor scaling with respect to the problem size, as the sample complexity of learning an epsilon-optimal policy scales with |S| times |A| over worst case instances of an MDP with state space S, action space A, and horizon H. A key question is when can we design probably efficient RL algorithms that exploit nice structure? We consider a class of MDPs that exhibit low rank structure, where the latent features are unknown. We argue that a natural combination of value iteration and low-rank matrix estimation results in an estimation error that grows doubly exponentially in the horizon length. We then provide a new algorithm along with statistical guarantees that efficiently exploits low rank structure given access to a generative model, achieving a sample complexity that scales linearly in |S|+|A| and polynomially in the horizon and the rank. In contrast to literature on linear and low-rank MDPs, we do not require a known feature mapping, our algorithm is computationally simple, and our results hold for long time horizons. Our results provide insights on the minimal low-rank structural assumptions required on the MDP with respect to the transition kernel versus the optimal action-value function.

**Bio**: Christina Lee Yu is an Assistant Professor at Cornell University in the School of Operations Research and Information Engineering. Prior to Cornell, she was a postdoc at Microsoft Research New England. She received her PhD and MS in Electrical Engineering and Computer Science from Massachusetts Institute of Technology, and she received her BS in Computer Science from California Institute of Technology. She received honorable mention for the 2018 INFORMS Dantzig Dissertation Award, and she is a recipient of the 2021 Intel Rising Stars Award and 2021 JPMorgan Faculty Research Award. Her research interests include algorithm design and analysis, high dimensional statistics, inference over networks, sequential decision making under uncertainty, online learning, and network causal inference.