Workshop in Biostatistics

DATE: April 14, 2016
TIME: 1:30 - 3:00 pm
LOCATION: Medical School Office Building, Rm x303
TITLE: Finding hidden signals in whole-genome genetic and epigenetic data
SPEAKER: Eran Halperin
Associate Professor, Blavatnik School of Computer Science, Tel-Aviv University
Associate Professor, Department of Molecular Microbiology and Biotechnology, Tel-Aviv University
Senior Research Scientist, International Computer Science Institute, Berkeley, CA.



Whole-genome genetic and epigenetic data sets hold the promise of detecting statistical correlations between phenotypes and genetic variants or epigenetic markers via genome-wide association studies (GWAS) and epigenome-wide association studies (EWAS). These correlations are useful for generating new hypotheses regarding the mechanisms involved, and they can be used for disease prediction and prediction of treatment outcomes. GWAS and EWAS, however, are complicated by the fact that correlations between the phenotype and confounders such as age, sex, and batch effects may result in a large number of false positives. I will describe different approaches that deal with these confounders by directly predicting them from the data. Specifically, I will show how one can predict sex, cell type composition and ancestry from either genotype or methylation data, using different variations of principal components analysis (PCA). These variations exploit the specific nature of each data type, resulting in better performance than standard PCA. I will demonstrate how these approaches can be useful in specific studies of whole-genome genetic and epigenetic data.

Suggested readings:

  1. Elior Rahmani, et al. Sparse PCA Corrects for Cell-Type Heterogeneity in Epigenome-Wide Association Studies, Nature Methods, in press (should be on Nature's website on March 28).
  2. Paula Singmann, Doron Shem-Tov, Simone Wahl, et al. Characterization of whole-genome autosomal differences of DNA methylation between men and women, Epigenetics & Chromatin, 2015, 8:43 (19 October 2015).
  3. Baran Y, Halperin E. A Note on the Relations Between Spatio-Genetic Models, Journal of Computational Biology, 2015, 22(10):905-917.
  4. Baran Y, Quintela I, Carracedo A, Pasaniuc B, Halperin E. Enhanced Localization of Genetic Samples through Linkage-Disequilibrium Correction, American Journal of Human Genetics, 2013, 92(6):882-94.
  5. Yang WY, Novembre J, Eskin E, Halperin E. A model-based approach for analysis of spatial structure in genetic data, Nature Genetics, 2012, 44(6):725-31.