Health Research and Policy

Abstract

DATE: October 22, 2009

TIME: 1:15 - 3:00 pm

LOCATION: Center for Clinical Sciences Research (CCSR), Rm 4205

TITLE: Fast Sparse Regression and Classification

SPEAKER: Jerome H. Friedman, Department of Statistics, Stanford

Regularized regression and classification methods fit a linear model to data by minimizing a loss criterion subject to a constraint on the coefficient values. As special cases, ridge regression, the lasso, and subset selection all use squared-error loss, each with a different constraint. For large problems, the choice of loss criterion and constraint is often limited by the computation required to obtain the corresponding solution estimates. This is especially the case when non-convex constraints are employed to induce very sparse solutions. A fast algorithm is presented that produces solutions closely approximating those for any convex loss and any constraint (convex or non-convex) that is monotone increasing in the absolute values of the coefficients. This extends the application of such loss-constraint combinations to very large problems. The benefits of this generality are illustrated by examples. Time permitting, application of this algorithm to the infinite-dimensional setting, where the predictors are all functions in a specified class, will be described. This leads to a generalized boosting algorithm.
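
As a rough illustration of how the three special cases named in the abstract differ, the sketch below writes out their constraint functions and, for the textbook case of a single coefficient with objective 0.5 * (b - a)^2 + lam * P(a), the resulting closed-form estimates. This is not the algorithm presented in the talk; the function names, the penalized (Lagrangian) form, and the one-coefficient setting are assumptions chosen only to keep the example short.

```python
import math

# Constraint (penalty) functions P(a) for a coefficient vector a.
def ridge_penalty(coefs):
    return sum(a * a for a in coefs)            # sum of squares

def lasso_penalty(coefs):
    return sum(abs(a) for a in coefs)           # sum of absolute values

def subset_penalty(coefs):
    return sum(1 for a in coefs if a != 0)      # number of nonzero coefficients

# Minimizers of 0.5 * (b - a)**2 + lam * P(a) over a single coefficient a,
# where b is the unpenalized least-squares estimate (illustrative setting).
def ridge_shrink(b, lam):
    return b / (1.0 + 2.0 * lam)                # proportional shrinkage

def lasso_shrink(b, lam):
    # soft thresholding: shrink toward zero, exactly zero when |b| <= lam
    return math.copysign(max(abs(b) - lam, 0.0), b)

def subset_shrink(b, lam):
    # hard thresholding: keep b unchanged only if it pays for its penalty
    return b if abs(b) > math.sqrt(2.0 * lam) else 0.0

if __name__ == "__main__":
    estimates = [3.0, -0.5, 1.2]                # hypothetical unpenalized estimates
    for name, rule in [("ridge", ridge_shrink),
                       ("lasso", lasso_shrink),
                       ("subset", subset_shrink)]:
        print(name, [rule(b, 1.0) for b in estimates])
```

Ridge shrinks every coefficient but sets none to zero, the lasso sets small coefficients exactly to zero while shrinking the rest, and subset selection (a non-convex constraint) zeroes small coefficients and leaves the others unshrunk. All three penalties are monotone increasing in the coefficient absolute values, the class of constraints the abstract refers to.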
