Network Knowledge Base Analysis of High-Throughput Genomic Data

With the ever-increasing wealth of datasets from high-throughput analyses such as gene expression microarrays, there is an urgent need for methods that support data pruning, integration, reasoning, and the systematic generation of biological insights into disease mechanisms and treatments. This project aims to address that need.

In this project we focus on developing new statistical algorithms that integrate high-throughput experimental data with a knowledge base of previously published and verified scientific findings. These algorithms are intended to help us elucidate important disease pathways and generate novel insights into treatments.


  • We observed that high-throughput genomic data unavoidably suffer from false-negative and false-positive findings, and that experimental data alone are not sufficient for understanding the multilayered biological mechanisms of complex diseases. Genomic studies in clinical settings are also often underpowered.
  • Integrative analysis of multiple datasets in the context of genome-wide interaction networks, i.e. the observed human interactome, will help us cross-validate experimental results, elucidate important disease pathways, and generate novel insights into treatments.
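
The cross-validation idea in the points above can be illustrated with a minimal sketch. It is not the project's algorithm, which is not described here. It assumes each gene has one p-value per dataset, pools them with Fisher's method (a standard technique, swapped in purely for illustration), and then keeps genes whose interaction partners are also significant. The function names, the `alpha` threshold, the gene labels G1 to G3, and all p-values are hypothetical.

```python
import math


def fisher_combined_p(p_values):
    """Combine independent p-values (each in (0, 1]) with Fisher's method.

    The statistic -2 * sum(ln p) follows a chi-square distribution with
    2k degrees of freedom under the null, whose survival function has a
    closed form for even degrees of freedom.
    """
    k = len(p_values)
    half_stat = -sum(math.log(p) for p in p_values)  # statistic / 2
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


def network_supported_genes(p_by_gene, interactions, alpha=0.05):
    """Return significant genes mapped to their significant neighbors.

    p_by_gene: dict of gene -> list of per-dataset p-values (hypothetical).
    interactions: iterable of undirected (gene_a, gene_b) pairs.
    Genes with no significant neighbor are left out of the result.
    """
    combined = {g: fisher_combined_p(ps) for g, ps in p_by_gene.items()}
    significant = {g for g, p in combined.items() if p < alpha}

    neighbors = {}
    for a, b in interactions:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    supported = {}
    for gene in sorted(significant):
        partners = neighbors.get(gene, set()) & significant
        if partners:
            supported[gene] = sorted(partners)
    return supported


# Made-up toy input: three genes measured in two datasets.
toy_p_values = {"G1": [0.04, 0.03], "G2": [0.2, 0.01], "G3": [0.6, 0.7]}
toy_interactions = [("G1", "G2"), ("G2", "G3")]
supported = network_supported_genes(toy_p_values, toy_interactions)
```

In this toy input G1 and G2 pass the combined threshold and interact with each other, so each supports the other, while G3 is dropped. A real analysis would also need to handle dependence between datasets and multiple-testing correction, which this sketch ignores.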


  • To achieve the project's goal and address the scientific challenges stated above, we have formed a team with Professor David Donoho of the Stanford Statistics Department and Ingenuity Systems, Inc.
  • We have applied this integrative analysis to microarray data on human inflammation to identify candidate causal interactions and significant pathways in this clinical process.


Wenzhong Xiao 
Dan Richards 
David Donoho 
Ronald W. Davis

Statistics Dept, Stanford 
Ingenuity Systems, Inc.