CDS Special Seminar
Robustness is a fundamental concept in systems science and engineering. It is a critical consideration in all inference and decision-making problems, both single-agent and multi-agent, and it has surfaced again in recent years in the context of machine learning (ML), reinforcement learning (RL), and artificial intelligence (AI). We describe a novel and unifying theory of robustness for all these problems, emanating from the fundamental results obtained in my research group some 25 years ago on robust output feedback control for general systems (including nonlinear, hidden Markov (HMM), and set-valued systems).

In the first part of this lecture I will summarize this theory and the universal solution it provides, consisting of two coupled Hamilton-Jacobi-Bellman (HJB) equations. These results are a sweeping generalization of the transformational control-theory results on the linear quadratic Gaussian (LQG) problem obtained by Jacobson, Speyer, Doyle, Glover, Khargonekar, and Francis in the 1970s and 1980s. Our results rigorously established the equivalence of three seemingly unrelated problems: the robust output feedback control problem, a partially observed differential game, and a partially observed risk-sensitive stochastic control problem.

In the second part of this lecture I will begin with the "four block" view of this problem and show, for the first time, a similar formulation of the so-called robust (or adversarially robust) ML problem. This gives a rigorous path for analyzing robustness and attack resiliency in ML, which I will illustrate with several examples. I will also describe how the use of an exponential criterion in deep learning explains the convergence of stochastic gradient descent despite over-parametrization (Poggio 2020). I will then describe our most recent results on robust and risk-sensitive reinforcement learning, where the exponential-of-an-integral criterion emerging from our earlier theory is essential. We show how all forms of regularized RL can be derived from our theory, including KL and entropy regularization, the relation to probabilistic graphical models, and distributional robustness. The deeper reason for this unification emerges: the fundamental tradeoff, via duality, between performance optimization and risk minimization in decision making, which connects to Prospect Theory. I will close with open problems and future research.
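For orientation, a standard form of the risk-sensitive (exponential-of-an-integral) criterion referenced above can be sketched as follows; the notation (running cost L, terminal cost Phi, risk parameter theta) is introduced here for illustration and is not from the announcement:

$$ J_\theta(u) \;=\; \frac{1}{\theta}\,\log \mathbb{E}\!\left[\exp\!\Big(\theta\Big(\int_0^T L(x_t,u_t)\,dt \;+\; \Phi(x_T)\Big)\Big)\right]. $$

As theta tends to 0 this reduces to the risk-neutral expected cost, while a small-noise large-deviations limit yields the deterministic differential game underlying H-infinity-type robust control; this is one sense in which the three problems in the first part coincide.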
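The duality behind the performance-versus-risk tradeoff mentioned at the end admits a similar sketch. For theta > 0, a bounded cost C, and a nominal model P, the Donsker-Varadhan variational formula gives

$$ \frac{1}{\theta}\,\log \mathbb{E}_P\!\left[e^{\theta C}\right] \;=\; \sup_{Q \ll P}\;\Big\{\, \mathbb{E}_Q[C] \;-\; \tfrac{1}{\theta}\, D_{\mathrm{KL}}(Q \,\|\, P) \,\Big\}. $$

Read as a game, an adversary reweights the nominal model P to Q to degrade performance but pays a Kullback-Leibler penalty; KL- and entropy-regularized RL and distributionally robust formulations can be seen as instances of this tradeoff.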