Genome Technology Center

A small number of genes is sufficient to resolve many different endpoints using gene expression analysis

Georges Natsoulis
Iconix Pharmaceuticals

We have assembled a very large toxicogenomic database. Rats were treated with more than 600 drugs at multiple doses and multiple time points, in biological triplicate. Gene expression profiles were collected from up to seven different tissues. More than 200 hematology, clinical chemistry, histopathology and pharmacology assays were performed in the same animals. We systematically mined the gene expression domain of this dataset using an SVM-based two-class supervised classification method. More than 300 thoroughly cross-validated linear classifiers (signatures), each composed of 45 genes on average, were identified. We verified that these signatures resolve distinct and uncorrelated endpoints. Some genes recur in a large number of signatures. The occurrence of genes across signatures follows a power-law distribution. These genes therefore form a scale-free network. We can show that the hubs of that network (as few as 400 genes in a given tissue) are sufficient to recreate all signatures with no appreciable loss in classification performance. This finding opens the possibility of creating a multi-endpoint diagnostic device.
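As an illustration of the hub idea described in the abstract, the following is a minimal Python sketch, not the Iconix implementation. It assumes each signature is available as a set of gene identifiers, counts how many signatures each gene appears in, picks the most frequently recurring genes as hubs, and computes a rough log-log slope of the occurrence histogram (a negative, roughly linear trend is consistent with a power law). The function names, the toy signatures and the gene labels are all hypothetical.

```python
from collections import Counter
import math


def gene_occurrence(signatures):
    """Count how many signatures each gene appears in."""
    counts = Counter()
    for genes in signatures.values():
        counts.update(set(genes))  # each gene counted once per signature
    return counts


def select_hubs(counts, n_hubs):
    """Return the n_hubs most frequently recurring genes (ties broken by name)."""
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gene for gene, _ in ranked[:n_hubs]]


def loglog_slope(counts):
    """Least-squares slope of log(number of genes) against log(occurrence count).

    This is only a crude check; it is not a rigorous power-law fit.
    """
    hist = Counter(counts.values())  # occurrence count -> number of genes
    xs = [math.log(k) for k in hist]
    ys = [math.log(hist[k]) for k in hist]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("need at least two distinct occurrence counts")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Toy data (hypothetical): four signatures over six genes.
signatures = {
    "sig_a": {"g1", "g2", "g3"},
    "sig_b": {"g1", "g2", "g4"},
    "sig_c": {"g1", "g5"},
    "sig_d": {"g1", "g2", "g6"},
}
counts = gene_occurrence(signatures)
hubs = select_hubs(counts, 2)
```

Testing the claim in the abstract would additionally require retraining each classifier on the hub genes alone and comparing cross-validated performance, which this sketch does not attempt.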

Biography information:
Dr. Georges Natsoulis is currently Senior Director of Advanced Technology at Iconix Pharmaceuticals, where he has developed the signature algorithms and the methodology for the systematic mining of the Iconix microarray database. He has held various positions over the last 10 years in four different biotechnology companies. Prior to that, he trained in Molecular Biology and Genetics at MIT, the Whitehead Institute (1983-1986) and Johns Hopkins Medical School (1987-1993).
