Talk 1 Title: Sequential minimization methods for bilevel programs
Speaker 1: Prof. Jin Zhang and Postdoctoral Fellow Shangzhi Zeng, Southern University of Science and Technology
Time: 2:30–3:30 PM, March 11, 2023
Venue: Room 610, Administration Building, Longdong Campus
Abstract: In recent years, bilevel optimization has received extensive attention from the machine learning community. Gradient-based bilevel optimization methods have been widely applied to solve modern machine learning problems. However, the theoretical correctness and practical effectiveness of these existing approaches always rely on restrictive conditions that can hardly be satisfied in real-world applications. In this talk, we will discuss how the sequential minimization idea opens the way to new gradient-based bilevel optimization methods that partially address the above issues.
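For readers new to the setting, the following is a minimal sketch, in generic notation, of a bilevel program and of one common value-function-based sequential-minimization relaxation; the symbols F, f, \varphi, and \epsilon_k are illustrative, and the talk's own scheme may differ in detail.

\begin{align*}
  % Bilevel program: upper-level objective F, lower-level objective f
  &\min_{x,\,y}\ F(x,y) \quad \text{s.t.}\quad y \in \operatorname*{argmin}_{y'} f(x,y'),\\
  % Value-function reformulation with lower-level optimal value \varphi(x)
  &\varphi(x) := \min_{y'} f(x,y'), \qquad \min_{x,\,y}\ F(x,y) \quad \text{s.t.}\quad f(x,y)-\varphi(x)\le 0,\\
  % Sequential minimization: solve a sequence of relaxed single-level subproblems
  % whose tolerances \epsilon_k decrease to zero
  &(x_k,y_k)\in \operatorname*{argmin}_{x,\,y}\bigl\{\,F(x,y)\ :\ f(x,y)-\varphi(x)\le \epsilon_k\,\bigr\},\qquad \epsilon_k \downarrow 0.
\end{align*}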
Speaker Biographies:
Jin Zhang received his bachelor's and master's degrees from Dalian University of Technology in 2007 and 2010, respectively, and his Ph.D. from the University of Victoria, Canada, in 2014. He worked in the Department of Mathematics at Hong Kong Baptist University from 2015 to 2018 and joined the Department of Mathematics at Southern University of Science and Technology in early 2019. Dr. Zhang works on optimization theory and its applications and has led several national-level research grants. His representative results have appeared in influential applied mathematics and machine learning journals and conferences, including Mathematical Programming, SIAM Journal on Optimization, SIAM Journal on Numerical Analysis, and the Journal of Machine Learning Research. He received the NSFC Excellent Young Scientists Fund in 2023; the Guangdong Provincial Youth Science and Technology Innovation Award and the Guangdong Natural Science Foundation for Distinguished Young Scholars in 2022; the Shenzhen Outstanding Young Scientific and Technological Innovation Talent program in 2021; and the Youth Science and Technology Award of the Operations Research Society of China in 2020.
Shangzhi Zeng is a PIMS postdoctoral fellow at the University of Victoria, Canada. He received his Ph.D. from The University of Hong Kong in 2021. His main research interests are variational analysis and bilevel programming. His representative results have appeared in influential numerical optimization, computational mathematics, and machine learning journals and conferences, including Mathematical Programming, SIAM Journal on Numerical Analysis, the Journal of Machine Learning Research, IEEE Transactions on Pattern Analysis and Machine Intelligence, ICML, and NeurIPS.
Talk 2 Title: Averaged Method of Multipliers for Bi-Level Optimization without Lower-Level Strong Convexity
Speaker 2: Dr. Wei Yao, Southern University of Science and Technology
Time: 3:30–4:30 PM, March 11, 2023
Venue: Room 610, Administration Building, Longdong Campus
Abstract: Gradient methods have become mainstream techniques for Bi-Level Optimization (BLO) in learning fields. The validity of existing works heavily relies on either a restrictive Lower-Level Strong Convexity (LLSC) condition, or on solving a series of approximation subproblems with high accuracy, or both. In this work, by averaging the upper- and lower-level objectives, we propose a single-loop Bi-level Averaged Method of Multipliers (sl-BAMM) for BLO that is simple yet efficient for large-scale BLO and removes the restrictive LLSC condition. We further provide a non-asymptotic convergence analysis of sl-BAMM towards KKT stationary points; the comparative advantage of our analysis lies in the absence of the strong gradient boundedness assumption that is always required by others. Thus, our theory safely covers a wider range of applications in deep learning, especially those where the upper-level objective is quadratic w.r.t. the lower-level variable. Experimental results demonstrate the superiority of our method.
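As illustration only, one way to phrase the averaging and the KKT target mentioned above is sketched below; the averaged objective f_\mu with weight \mu, the multiplier z, and the lower-level stationarity reformulation are generic assumptions for exposition rather than the exact sl-BAMM construction, which is presented in the talk.

\begin{align*}
  % Lower-level stationarity reformulation of the bilevel program
  &\min_{x,\,y}\ F(x,y) \quad \text{s.t.}\quad \nabla_y f(x,y)=0,\\
  % One illustrative "averaged" objective mixing the two levels with weight \mu\in(0,1)
  &f_\mu(x,y) := \mu\,F(x,y) + (1-\mu)\,f(x,y),\\
  % KKT stationarity system for the reformulation above, with multiplier z
  &\nabla_x F(x,y) + \nabla_{xy}^2 f(x,y)\,z = 0,\qquad
   \nabla_y F(x,y) + \nabla_{yy}^2 f(x,y)\,z = 0,\qquad
   \nabla_y f(x,y) = 0.
\end{align*}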
Speaker Biography:
Wei Yao received his Ph.D. from The Chinese University of Hong Kong and is a Research Assistant Professor in the Department of Mathematics at Southern University of Science and Technology and at the National Center for Applied Mathematics (Shenzhen). His main research interests include algorithms and theoretical analysis for bilevel programming, together with applications in machine learning and mechanism design. His representative papers have appeared in well-known international journals in operations research, optimization, and partial differential equations, including the SIAM Journal on Optimization, Calculus of Variations and Partial Differential Equations, and the Journal of Differential Equations.