Gautam Kunapuli's Publications |
|
Journal Publications
|
J. Hu, J. E. Mitchell, J.-S. Pang, K. P. Bennett and G. Kunapuli.
On the Global Solution of Linear Programs with Linear Complementarity Constraints.
SIAM Journal on Optimization. Volume 19, Number 1 (2008), pp. 445 - 471.
[pdf, preprint]
[BibTeX]
[Abstract]
This paper presents a parameter-free integer-programming based algorithm for the global
resolution of a linear program with linear complementarity constraints (LPEC). The cornerstone
of the algorithm is a minimax integer program formulation that characterizes and provides
certificates for the three outcomes---infeasibility, unboundedness, or solvability---of an LPEC.
An extreme point/ray generation scheme in the spirit of Benders decomposition is developed, from
which valid inequalities in the form of satisfiability constraints are obtained. The feasibility
problem of these inequalities and the carefully guided linear programming relaxations of the LPEC
are the workhorse of the algorithm, which also employs a specialized procedure for the sparsification
of the satisfiability cuts. We establish the finite termination of the algorithm and report computational
results using the algorithm for solving randomly generated LPECs of reasonable sizes. The results
establish that the algorithm can handle infeasible, unbounded, and solvable LPECs effectively.
G. Kunapuli, K. P. Bennett, J. Hu and J.-S. Pang.
Classification Model Selection via Bilevel Programming.
Optimization Methods and Software. Volume 23, Issue 4 (2008), pp. 475 - 489
Special Issue on Mathematical Programming in Data Mining and Machine Learning,
Guest Editors: Katya Scheinberg and Jiming Peng
[pdf, preprint]
[BibTeX]
[Abstract]
Cross validation is a well-known and widely used method used for model
selection which involves searching a discretized grid for the combination
of model hyper-parameters that minimizes the out-of-sample validation
error — an estimate of the generalization error. This grid-search procedure
effectively limits the size of the parameter space since many convex
optimization problems must be solved at each grid point to evaluate its
effectiveness. This paper proposes a novel formulation of cross-validation
as a bilevel program which can systematically search the continuous hyper-parameter
space. Also discussed are computational methods for solving a bilevel cross-validation
program and numerical results that demonstrate the practicability of this
approach for model selection in machine learning.
|
|
Book Chapters
|
K. P. Bennett, G. Kunapuli, J. Hu and J.-S. Pang.
Bilevel Optimization and Machine Learning.
Computational Intelligence: Research Frontiers.
Lecture Notes in Computer Science.
Volume 5050 (2008), pp 25 - 47.
IEEE World Congress on Computational Intelligence, WCCI 2008, Hong Kong, China,
June 1-6, 2008, Plenary/Invited Lectures.
[pdf, preprint]
[BibTeX]
[Abstract]
We examine the interplay of optimization and machine learning.
Great progress has been made in machine learning by cleverly
reducing machine learning problems to convex optimization problems
with one or more hyper-parameters. The availability of powerful
convex programming theory and algorithms has enabled a flood of new
research in machine learning models and methods. But many of the
steps necessary for successful machine learning models fall outside
of the convex machine learning paradigm. Thus we now propose framing
machine learning problems as Stackelberg games. The resulting
bilevel optimization problem allows for efficient systematic search
of large numbers of hyper-parameters. We discuss recent progress in
solving these bilevel problems and the many interesting optimization
challenges that remain. Finally, we investigate the intriguing
possibility of novel machine learning models enabled by bilevel
programming.
G. Kunapuli, K. P. Bennett, J. Hu and J.-S. Pang.
Bilevel Model Selection for Support Vector Machines.
CRM Proceedings and Lecture Notes. Volume 45 (2008), pp 129-158.
American Mathematical Society. Pierre Hansen and Panos Pardolos, Editors.
[pdf, preprint]
[BibTeX]
[Abstract]
The successful application of Support Vector Machines (SVMs), kernel methods and other
statistical machine learning methods requires selection of model parameters based on estimates of
the generalization error. This paper presents a novel approach to systematic model selection through
bilevel optimization. We show how modelling tasks for widely used machine learning methods can
be formulated as bilevel optimization problems and describe how the approach can address a broad
range of tasks — among which are parameter, feature and kernel selection. In addition, we also discuss
the challenges in implementing these approaches and enumerate opportunities for future work in this
emerging research area.
|
|
Conference Publications
|
G. Kunapuli, K. P. Bennett, R. Maclin and J. W. Shavlik
Online Passive-Aggressive Algorithms With Advice.
Neural Information Processing Systems, 2009 (submitted).
[pdf]
[BibTeX]
[Abstract]
We propose a novel approach for incorporating prior knowledge into the SVM
classification problem in an online setting. Once the learning algorithm predicts
the label of the data point in the current round, it receives the correct label. The
goal is to update the hypothesis taking into account the label feedback and prior
knowledge, in the form of soft polyhedral advice, so as to make increasingly accurate
predictions on subsequent rounds. Advice helps speed up and bias learning so
that generalization can be obtained with less data. A passive-aggressive approach
updates the hypothesis using a hybrid loss that takes into account the margins of
both the hypothesis and the advice on the current point. Encouraging computational
results and loss bounds are provided.
S. Natarajan, P. Tadepalli, G. Kunapuli and J. W. Shavlik
Learning Parameters for Relational Probabilistic Models with Noisy-Or Combining Rule.
Uncertainty in Artificial Intelligence, 2009 (submitted).
[pdf]
[BibTeX]
[Abstract]
Languages that combine predicate logic with
probabilities are needed to succinctly represent knowledge in many real-world domains.
We consider a formalism based on universally
quantified conditional influence statements
that capture local interactions between object attributes. The effects of different
conditional influence statements can be combined
using combining rules such as Noisy-OR. To
combine multiple instantiations of the same
rule we need other combining rules at a lower
level. In this paper we derive and implement algorithms based on gradient-descent
and EM for learning the parameters of these
multi-level combining rules. We compare our
approaches to learning in Markov Logic networks and show superior performance in
multiple domains.
S. Natarajan, G. Kunapuli, C. O' Reilly, R. Maclin, T. Walker, D. Page and J. W. Shavlik
ILP for Bootstrapped Learning: A Layered Approach to Automating the ILP Setup Problem.
International Conference on Inductive Logic Programming, 2009
[pdf]
[ILP '09]
[BibTeX]
[Abstract]
This paper introduces a new type of application for ILP called Bootstrapped Learning (BL).
BL brings several challenges to ILP, including the need to automate the “ILP Set-Up”
problem; small numbers of examples, in some cases no negative examples; and the need to “
bootstrap” or to automatically base learning in part on the results of earlier learning
sessions. The paper introduces BL, discusses the general challenges it raises for ILP,
and presents initial results in ILP for BL.
K. P. Bennett, X. Ji, J. Hu, G. Kunapuli and J.-S. Pang.
Model selection via bilevel programming.
Proceedings of the 2006 IEEE World Congress on Computational Intelligence,
Vancouver, B.C. Canada, July 16 - 21, 2006.
[pdf]
[BibTeX]
[Abstract]
A key step in many statistical learning methods
used in machine learning involves solving a convex optimization
problem containing one or more hyper-parameters that must
be selected by the users. While cross validation is a commonly
employed and widely accepted method for selecting these
parameters, its implementation by a grid-search procedure in
the parameter space effectively limits the desirable number
of hyper-parameters in a model, due to the combinatorial
explosion of grid points in high dimensions. This paper proposes
a novel bilevel optimization approach to cross validation
that provides a systematic search of the hyper-parameters.
The bilevel approach enables the use of the state-of-the-art
optimization methods and their well-supported softwares. After
introducing the bilevel programming approach, we discuss
computational methods for solving a bilevel cross-validation
program, and present numerical results to substantiate the
viability of this novel approach as a promising computational
tool for model selection in machine learning.
|
|
Workshops & Symposia
|
S. Natarajan, P. Tadepalli, G. Kunapuli and J. W. Shavlik.
Knowledge Intensive Learning: Directed vs. Undirected SRL Models.
International Workshop in SRL at Leuven, Belgium, July 2009.
G. Kunapuli, K. P. Bennett, R. Maclin and J. W. Shavlik.
Online Learning With Knowledge-Based SVMs.
The Learning Workshop at Clearwater Beach, FL, April 2009.
[pdf, poster]
[Snowbird '09]
|
|
Dissertation
|
G. Kunapuli.
A Bilevel Optimization Approach to Machine Learning.
A Thesis Submitted to the Graduate Faculty of Rensselaer
Polytechnic Institute in Partial Fulfillment of the Requirements for
the Degree of Doctor of Philosophy in Mathematics. February 2008.
[pdf]
[RPI-pdf]
[BibTeX]
[Abstract]
A key step in many statistical learning methods used in
machine learning involves solving a convex optimization problem
containing one or more hyper-parameters that must be selected by the
users. While cross validation is a commonly employed and widely
accepted method for selecting these parameters, its implementation
by a grid-search procedure in the parameter space effectively limits
the desirable number of hyper-parameters in a model, due to the
combinatorial explosion of grid points in high dimensions. A novel
paradigm based on bilevel optimization approach is proposed and
gives rise to a unifying framework within which issues such as model
selection can be addressed.
The machine learning problem is formulated as a bilevel
program--a mathematical program that has constraints which are
functions of optimal solutions of another mathematical program
called the inner-level program. The bilevel program is transformed
to an equivalent mathematical program with equilibrium constraints
(MPEC). Two alternative bilevel optimization algorithms are
developed to optimize the MPEC and provide a systematic search of
the hyper-parameters.
In the first approach, the equilibrium constraints of the
MPEC are relaxed to form a nonlinear program with linear objective
and non-convex quadratic inequality constraints, which is then
solved using a general purpose nonlinear programming solver. In the
second approach, the equilibrium constraints are treated as penalty
terms in the objective, and the resulting non-convex quadratic
program with linear constraints is solved using a successive
linearization algorithm.
The flexibility of the bilevel approach to deal with
multiple hyper-parameters, makes it powerful approach to problems
such as parameter and feature selection (model selection). In this
thesis, three problems are studied: model selection for support
vector (SV) classification, model selection for SV regression and
missing value-imputation for SV regression. Extensive computational
results establish that both algorithmic approaches find solutions
that generalize as well or better than conventional approaches and
are much more computationally efficient.
|