
CMX Lunch Seminar

Tuesday, April 22, 2025
12:00pm to 1:00pm
Annenberg 213
The Mechanism Behind the Implicit Biases of Large Learning Rates: Edge of Stability, Balancing, and Catapult
Yuqing Wang, Postdoctoral Scholar Research Associate, AMS, Johns Hopkins University

Large learning rates, when applied to gradient descent for nonconvex optimization, yield various implicit biases, including edge of stability, balancing, and catapult. Many theoretical works analyze these phenomena individually, but a high-level picture is still missing: it is unclear when and why they occur. In this talk, I will show that these phenomena are in fact different tips of the same iceberg. They occur when the objective function of the optimization has good regularity. This regularity, combined with the tendency of a large learning rate to drive gradient descent from sharp regions toward flatter ones, controls the largest eigenvalue of the Hessian, i.e., the sharpness, along the GD trajectory, and this control manifests as the various phenomena. The result is based on a convergence analysis of gradient descent with large learning rates on a family of nonconvex functions of varying regularity, without the Lipschitz-gradient assumption that is standard in nonconvex optimization. Neural network experiments will also be presented to validate this result.
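To illustrate the kind of behavior the abstract describes, the following is a minimal numerical sketch (not the speaker's analysis or code): gradient descent with a small versus a large learning rate on the toy nonconvex objective f(u, v) = (uv - 1)^2 / 2, tracking the sharpness (largest Hessian eigenvalue) along the trajectory. The choice of this particular toy objective and all names in the script are illustrative assumptions; with a large step size, GD typically cannot remain near minima whose sharpness exceeds the stability threshold 2/lr and settles at a flatter, more balanced one.

    # Illustrative sketch only: GD on f(u, v) = (u*v - 1)^2 / 2,
    # tracking sharpness (largest Hessian eigenvalue) along the trajectory.
    import numpy as np

    def loss(u, v):
        return 0.5 * (u * v - 1.0) ** 2

    def grad(u, v):
        r = u * v - 1.0  # residual
        return np.array([r * v, r * u])

    def sharpness(u, v):
        # Hessian of f at (u, v): [[v^2, 2uv - 1], [2uv - 1, u^2]]
        H = np.array([[v * v, 2 * u * v - 1.0],
                      [2 * u * v - 1.0, u * u]])
        return np.linalg.eigvalsh(H).max()

    def run_gd(u0, v0, lr, steps=200):
        u, v = u0, v0
        for _ in range(steps):
            g = grad(u, v)
            u, v = u - lr * g[0], v - lr * g[1]
        return u, v

    if __name__ == "__main__":
        # Start from an imbalanced point; compare a small and a large learning rate.
        for lr in (0.01, 0.4):
            u, v = run_gd(u0=3.0, v0=0.1, lr=lr)
            print(f"lr={lr}: final loss={loss(u, v):.2e}, "
                  f"sharpness={sharpness(u, v):.2f}, "
                  f"imbalance |u^2 - v^2|={abs(u**2 - v**2):.2f}, "
                  f"threshold 2/lr={2/lr:.2f}")

In this sketch the small learning rate converges to the nearby sharp, imbalanced minimum, while the large learning rate ends at a flatter minimum with sharpness below 2/lr, a toy version of the sharpness control described in the abstract.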

For more information, please contact Jolene Brink by phone at (626) 395-2813, by email at jbrink@caltech.edu, or visit the CMX website.