Model Selection Via Bilevel Programming

Published in Proc. 2006 IEEE World Congress on Computational Intelligence, 2006

Recommended citation: K. P. Bennett, X. Ji, J. Hu, G. Kunapuli and J.-S. Pang. Model Selection Via Bilevel Programming. Proceedings of the 2006 IEEE World Congress on Computational Intelligence, Vancouver, BC, Canada, July 16-21, 2006.

A key step in many statistical learning methods used in machine learning involves solving a convex optimization problem containing one or more hyper-parameters that must be selected by the user. While cross validation is a commonly employed and widely accepted method for selecting these parameters, its implementation by a grid-search procedure in the parameter space effectively limits the number of hyper-parameters a model can practically have, due to the combinatorial explosion of grid points in high dimensions. This paper proposes a novel bilevel optimization approach to cross validation that provides a systematic search of the hyper-parameters. The bilevel approach enables the use of state-of-the-art optimization methods and their well-supported software. After introducing the bilevel programming approach, we discuss computational methods for solving a bilevel cross-validation program, and present numerical results to substantiate the viability of this novel approach as a promising computational tool for model selection in machine learning.
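
The combinatorial explosion mentioned in the abstract can be made concrete with a small count. The sketch below is illustrative only and is not taken from the paper: the function names `grid_size` and `total_training_runs`, the choice of 10 grid values per hyper-parameter, and the 5 folds are all assumptions made for the example. It assumes a full Cartesian grid in which every grid point requires one model fit per cross-validation fold.

```python
from itertools import product


def grid_size(points_per_axis: int, num_hyperparameters: int) -> int:
    """Number of points in a full Cartesian grid (hypothetical helper)."""
    return points_per_axis ** num_hyperparameters


def total_training_runs(points_per_axis: int, num_hyperparameters: int, folds: int) -> int:
    """Model fits needed if each grid point is evaluated on every fold (hypothetical helper)."""
    return grid_size(points_per_axis, num_hyperparameters) * folds


if __name__ == "__main__":
    # Sanity check of the closed form against an explicit enumeration on a small grid.
    explicit = sum(1 for _ in product(range(3), repeat=4))
    assert explicit == grid_size(3, 4)

    # Assumed setting: 10 candidate values per hyper-parameter, 5-fold cross validation.
    for d in (1, 2, 5, 10):
        print(d, grid_size(10, d), total_training_runs(10, d, 5))
```

Under these assumptions the number of fits grows exponentially in the number of hyper-parameters, which is the limitation that motivates replacing the grid with a continuous search over the hyper-parameters.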