Multi-scale data fusion

The Gevaert lab focuses on multi-scale data fusion in oncology: the development of machine learning methods for biomedical decision support using multi-scale biomedical data. Previously I pioneered data fusion work using Bayesian and kernel methods studying breast and ovarian cancer. My subsequent work concerned the development of methods for multi-omics data fusion. This resulted in the development of MethylMix, to identify differentially methylated genes, and AMARETTO, a computational method to integrate DNA methylation, copy number and gene expression data to identify cancer modules. Additionally, my lab focuses on linking molecular data with cellular and tissue-level phenotypes. This led to key contributions in the field of imaging genomics/radiogenomics involving work in lung cancer and brain tumors. Our work in imaging genomics is focused on developing a framework for non-invasive personalized medicine. In summary, my lab has an interdisciplinary focus on developing novel algorithms for multi-scale biomedical data fusion. 

Example projects

We work on many projects within the space of multi-omics data fusion. Here is a selection:

Cancer epigenomics

Aberrant DNA methylation is an important mechanism that contributes to oncogenesis. Yet few algorithms exist that exploit this vast dataset to identify hypo- and hypermethylated genes in cancer. We developed a novel computational algorithm called MethylMix to identify differentially methylated genes that are also predictive of transcription. We applied MethylMix to twelve individual cancer sites and to all cancer sites combined in a pancancer analysis. We discovered pancancer hyper- and hypomethylated genes and identified novel methylation-driven subgroups with clinical implications. MethylMix analysis on all cancer sites combined revealed ten pancancer clusters reflecting new similarities across malignantly transformed tissues.

Deep learning for brain tumor segmentation

Improved methods for characterizing tumors both radiologically and histologically are essential for identifying prognostic biomarkers to guide clinical decisions. We developed an algorithm using convolutional neural networks (CNNs) to segment tumors and classify specific regions of interest. By generalizing CNNs to true 3-D convolutions and using a unique architecture to decouple pixels and expand effective data size, our method achieves a median Dice score of over 90% in whole-tumor glioblastoma segmentation, a significant improvement over past algorithms. This result demonstrates the power of our approach in generalizing low-bias methods like CNNs to learn from medium-size medical data sets.
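For readers unfamiliar with the metric, the Dice score measures the overlap between a predicted segmentation and a reference segmentation as twice the number of shared tumor voxels divided by the total number of tumor voxels in both masks. The sketch below is only an illustration of the metric, not our evaluation code: the function name, the flattened 0/1 mask representation, the toy masks, and the convention of scoring two empty masks as 1.0 are all assumptions made for this example.

```python
from statistics import median
from typing import Sequence


def dice_score(predicted: Sequence[int], reference: Sequence[int]) -> float:
    """Dice overlap between two binary masks given as flat sequences of 0/1 labels."""
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labeled as tumor in both masks.
    overlap = sum(1 for p, r in zip(predicted, reference) if p and r)
    # Total tumor voxels across the two masks.
    total = sum(1 for p in predicted if p) + sum(1 for r in reference if r)
    # Assumed convention: two empty masks count as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * overlap / total


# Toy flattened masks (hypothetical data, one pair per case).
cases = [
    ([0, 1, 1, 1, 0, 0, 1, 0], [0, 1, 1, 0, 0, 0, 1, 1]),
    ([1, 1, 0, 0], [1, 1, 0, 0]),
    ([1, 0, 0, 0], [0, 0, 0, 1]),
]
scores = [dice_score(pred, ref) for pred, ref in cases]
print("per-case Dice:", scores)
print("median Dice:", median(scores))
```

A score of 1.0 means the two masks agree exactly and 0.0 means they share no tumor voxels. Reporting the median across cases, as above, limits the influence of a few badly segmented scans.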


  • GBM study published in Science Translational Medicine

    Our work on identifying three subgroups of glioblastoma using their imaging phenotype appears in Science Translational Medicine here.

  • CoINcIDE featured by Science Translational Medicine

    We developed CoINcIDE, a meta-analysis framework for unsupervised analysis of gene expression data cohorts for diseases.

  • Work featured in Lancet Oncology

    The Lancet Oncology featured a news article on Haruka's recent publication in Science Translational Medicine.

  • MethylMix results presented at the 4th Cancer Epigenetics Conference

    Julie and Kevin's abstract on "Methylation-Driven Subtyping of Head and Neck Squamous Cell Carcinoma" was accepted for oral presentation at the 4th Epigenetics Conference in San Francisco.

  • Work featured in Nature Reviews Clinical Oncology

    Nature Reviews Clinical Oncology featured a news article on our recent publication in Science Translational Medicine.