Health Research and Policy


DATE: February 6, 2014
TIME: 1:15 - 3:00 pm
LOCATION: Medical School Office Building, Rm x303
TITLE: Robust Subspace Clustering
SPEAKER: Mahdi Soltanolkotabi
Department of Electrical Engineering, Stanford

Everywhere we look, the quantity of information that science, engineering and technology are producing is soaring: genomic data of patients, ratings of videos/movies/music, text and documents on the web. A key challenge is that the majority of available data in scientific and technology domains come in a raw and unstructured form that is costly to annotate (as labeling often involves a laborious experiment or some other manual effort). Nevertheless, such data often naturally cluster into a few coherent groups, where the data in each group can be succinctly represented by a few representative variables. Discerning such hidden structure from noisy and corrupted data is fundamental to many data mining tasks.

This talk focuses on extracting subspace structures from data. Subspace clustering is the problem of finding a multi-subspace representation that best fits a collection of points taken from a high-dimensional space. We will discuss how this problem arises naturally in computer vision and related applications and review standard approaches. As with most clustering problems, popular techniques for subspace clustering are often difficult to analyze theoretically and/or terminate in local optima of nonconvex functions; these problems are only exacerbated in the presence of noise and missing data. We introduce a collection of subspace clustering algorithms that are tractable and provably robust to various forms of data imperfections. We will illustrate our methods with numerical experiments on motion capture, face clustering and motion segmentation data, and with some preliminary experiments on gene expression data and Flickr photos of animals.

This is joint work with Emmanuel Candes.

Suggested readings:
“Subspace Clustering”, R. Vidal.

“Robust Subspace Clustering”, M. Soltanolkotabi, E. Elhamifar, and E. J. Candes. To appear in Annals of Statistics.

“A geometric analysis of subspace clustering with outliers”, M. Soltanolkotabi and E. J. Candes. Annals of Statistics 40(4), 2195-2238, 2012.

“Identifying Subspace Gene Clusters from Microarray Data Using Low-Rank Representation”, Y. Cui, C. Zheng, and J. Yang.
