Workshop in Biostatistics

PLEASE NOTE CHANGE OF LOCATION FOR WINTER QUARTER:

M112 Alway Building, Medical Center
(next to the Dean's courtyard)

DATE: February 23, 2017
TIME: 1:30 - 2:50 pm
TITLE: A graph framework for multi-scale reproducibility and differential analysis of 3D chromatin contact maps
SPEAKER: Oana Ursu
Graduate Student (5th year)
Department of Genetics, Stanford

Abstract:

Oana Ursu, Nathan Boley, Maryna Taranova, Rachel Wang, Anshul Kundaje.

Deciphering the three-dimensional organization of the genome is critical to understanding the role of long-range chromatin contacts in gene regulation and disease. Chromosome conformation capture techniques such as Hi-C, Capture-C and ChIA-PET are commonly used to experimentally measure 3D chromatin contact maps. However, the hierarchical, multi-scale organization of genomes into large-scale compartments, topologically associating domains, looping interactions and dynamic contacts makes comparisons of contact maps challenging. Here, we address this challenge with a statistical framework that leverages graph diffusion to obtain 1) a novel measure of reproducibility and 2) a statistical test for differential contacts at multiple scales and with high power. We represent contact maps as graphs and use random walks on the graph to smooth the data while maintaining high resolution of contacts and boundaries of domains. This is critical because contact maps are sparse, leading to apparent changes in contacts that are the result of sampling noise and dropout, an effect that our smoothing scheme alleviates. We develop a novel multi-scale concordance measure to assess reproducibility of contacts for random walks of increasing length. We calibrate our reproducibility scores on simulated data by comparing datasets against noise-injected versions of themselves, and also benchmark our scoring scheme on a variety of Hi-C datasets, recapitulating differences between technical replicates, biological replicates, and different cell types. Our framework generalizes seamlessly to contact maps from other assays such as ChIA-PET and Capture-C.
Finally, we derive a statistical test for differential analysis of contact maps. We identify differential contacts at multiple scales simultaneously, where the scales are defined naturally as clusters of genomic regions with similar changes in contact profiles. In addition to considering multiple scales, this test aggregates information across multiple neighboring genomic regions, substantially increasing statistical power to detect differences and producing more robust results than existing methods. We quantify changes in contact maps between cell types as well as allele-specific contact differences. These differences are supported by orthogonal evidence from changes in histone marks, chromatin accessibility and regulatory factors defining boundary elements, suggesting that our method identifies statistically significant and biologically meaningful differences in 3D genome organization across different contexts.
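
Illustrative sketch (not the speakers' implementation): the abstract describes representing contact maps as graphs, smoothing them with random walks, and comparing the smoothed maps for walks of increasing length. The Python below shows one minimal way such a comparison could look on toy data. The function names (transition_matrix, multiscale_difference, concordance_score), the L1 distance, the averaging over walk lengths, and the rescaling to [0, 1] are all assumptions made for illustration; the abstract does not specify any of them.

```python
import numpy as np


def transition_matrix(contacts):
    """Row-normalize a contact matrix into random-walk transition probabilities."""
    contacts = np.asarray(contacts, dtype=float)
    row_sums = contacts.sum(axis=1, keepdims=True)
    row_sums[row_sums == 0] = 1.0  # bins with no contacts stay as all-zero rows
    return contacts / row_sums


def multiscale_difference(contacts_a, contacts_b, max_steps=3):
    """Mean per-bin L1 difference between t-step walks, for t = 1..max_steps.

    Assumption: L1 distance between rows of the t-step transition matrices
    stands in for whatever comparison the actual method uses.
    """
    p_a = transition_matrix(contacts_a)
    p_b = transition_matrix(contacts_b)
    walk_a = np.eye(p_a.shape[0])
    walk_b = np.eye(p_b.shape[0])
    differences = []
    for _ in range(max_steps):
        # One more random-walk step: longer walks smooth over larger scales.
        walk_a = walk_a @ p_a
        walk_b = walk_b @ p_b
        differences.append(float(np.abs(walk_a - walk_b).sum() / p_a.shape[0]))
    return differences


def concordance_score(contacts_a, contacts_b, max_steps=3):
    """Hypothetical score in [0, 1]; 1 means identical walks at every scale.

    Each row of a transition matrix sums to at most 1, so the per-bin L1
    difference is at most 2; dividing by 2 rescales it to [0, 1].
    """
    differences = multiscale_difference(contacts_a, contacts_b, max_steps)
    return 1.0 - float(np.mean(differences)) / 2.0


# Toy symmetric contact maps over three genomic bins (made-up counts).
replicate_a = np.array([[0, 5, 1], [5, 0, 3], [1, 3, 0]])
replicate_b = np.array([[0, 1, 5], [1, 0, 3], [5, 3, 0]])

print(concordance_score(replicate_a, replicate_a))
print(concordance_score(replicate_a, replicate_b))
```

In this sketch, increasing max_steps compares the maps at progressively coarser scales, which loosely mirrors the idea of assessing reproducibility for random walks of increasing length.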

Suggested readings:

Biology background: A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping (https://www.ncbi.nlm.nih.gov/pubmed/25497547)

Methods background: Diffusion kernels (http://people.cs.uchicago.edu/~risi/papers/KondorVert04.pdf)