The Han Lab

Lab Overview

Our research program focuses on developing and applying novel statistical methods for understanding the genetic and environmental etiology of complex diseases and establishing new approaches for evaluating effective screening strategies based on etiological understanding. While several screening strategies have been proposed to identify high-risk individuals for cancer, many of these screening programs still rely on basic clinical or demographic factors (such as age and smoking history for lung cancer or age for breast cancer). However, mounting evidence suggests that personalized screening based on risk prediction models can enhance efficiency in the early detection of disease, resulting in mortality reduction by utilizing comprehensive genetic, environmental, and clinical risk factors.

As a faculty member of the Quantitative Sciences Unit (QSU) in the Biomedical Informatics Research Division of the Department of Medicine and in the Department of Neurosurgery at Stanford, Dr. Summer Han leads a research program that combines health policy modeling and statistical genetics to advance cancer screening and early detection research, supported by an NIH grant (NCI R37 MERIT Award for Early-Stage Investigator, PI: Han). In addition, Dr. Han's team has developed a collaborative research program within Neurosurgery to establish novel statistical methods and applications for the purpose of advancing neuroscience.

Our research interests include statistical genetics, molecular epidemiology, cancer screening, health policy modeling, risk prediction modeling, and machine learning approaches for time-to-event outcomes. We have developed various statistical methods to analyze high-dimensional data to identify genetic and environmental risk factors and their interactions for complex diseases. These approaches include employing a unified framework that integrates a class of disease risk models for modeling the joint effects of genes and environmental exposures and using a set of biologically plausible constraints to increase the power of tests and to reduce false positives.

Department of Neurosurgery

The Han Lab is part of the Department of Neurosurgery.

Lab News

Can Risk Model-Based Screening Reduce Racial Disparities in the U.S.?

(10/26/2023) Our latest article on lung cancer screening has been published in JAMA Oncology! In this study, we validated and recalibrated the PLCOm2012 model—a well-established risk prediction model based on a predominantly White population—across races and ethnicities in the US and evaluated racial and ethnic disparities and screening performance through risk-based screening using PLCOm2012 vs. the national lung cancer screening guidelines. The findings of this cohort study suggest that risk-based lung cancer screening can reduce racial and ethnic disparities and improve screening performance across races and ethnicities vs. the current national lung cancer screening guidelines.

Methods for Dynamic Risk Prediction Modeling Published

(9/6/2023) The paper led by Anya and Eunji was published in the International Journal of Epidemiology. In this study, we developed a general and flexible framework for dynamic risk prediction using a landmarking approach for survival data under competing risks. This is implemented in an R package, dynamicLM, which covers the entire pipeline for data preparation, model development, dynamic prediction, model evaluation, and visualization. The published software can be applied in many clinical settings, such as predicting a cancer recurrence (or a therapy response) using time-varying biomarkers (e.g., circulating tumor DNA) or predicting second malignancies using patients’ updated treatment histories.
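To illustrate the general idea behind landmarking, here is a minimal Python sketch of how a stacked landmark dataset can be built. It is not the dynamicLM implementation (dynamicLM is an R package); the function names, field names, and toy records are all hypothetical and chosen only for illustration. At each landmark time, the sketch keeps subjects who are still at risk, fixes the time-varying biomarker at its most recent value, and censors follow-up at the end of the prediction window.

```python
def value_at(measurements, s):
    """Most recent measurement taken at or before time s (None if there is none)."""
    latest = None
    for t, v in sorted(measurements):
        if t <= s:
            latest = v
    return latest


def build_landmark_dataset(subjects, landmarks, window):
    """Stack one row per (landmark, subject still at risk at that landmark).

    subjects: list of dicts with hypothetical keys
        "id", "time" (follow-up time), "event" (0 = censored, 1, 2, ... = event type),
        "biomarker" (list of (measurement_time, value) pairs).
    """
    rows = []
    for s in landmarks:
        horizon = s + window
        for subj in subjects:
            if subj["time"] <= s:
                continue  # event or censoring already happened: not at risk at s
            beyond_horizon = subj["time"] > horizon
            rows.append({
                "id": subj["id"],
                "landmark": s,
                # Administrative censoring at the end of the prediction window.
                "time": horizon if beyond_horizon else subj["time"],
                "event": 0 if beyond_horizon else subj["event"],
                # Covariate value known at the landmark time only.
                "biomarker": value_at(subj["biomarker"], s),
            })
    return rows


# Toy example (made-up data): event 1 = outcome of interest, 2 = competing event.
toy = [
    {"id": 1, "time": 5.0, "event": 1, "biomarker": [(0, 0.1), (2, 0.4)]},
    {"id": 2, "time": 1.5, "event": 2, "biomarker": [(0, 0.3)]},
    {"id": 3, "time": 10.0, "event": 0, "biomarker": [(0, 0.2), (3, 0.9)]},
]
for row in build_landmark_dataset(toy, landmarks=[0, 2], window=4):
    print(row)
```

A competing-risks model would then be fit to the stacked rows, typically with the landmark time as a covariate or stratum; that modeling step is outside the scope of this sketch.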

Dr. Han Received the Teaching Award in Biomedical Informatics

(8/16/2023) Dr. Summer Han received the 2022 Department of Medicine Teaching Award in Biomedical Informatics Research (BMIR). She received this award for her outstanding training, mentoring, and teaching of collaborative methods to post-doctoral trainees, graduate students (in epidemiology, biomedical informatics, medicine, and engineering), clinical fellows, and staff members in the Quantitative Sciences Unit. The 2022 Award Ceremony at Medical Grand Rounds took place at 8 a.m. on Wednesday, August 16, 2023, in Berg Hall (Li Ka Shing Center).

Graph approximation-based information fusion: applications in multi-omics cancer subtyping

(5/1/2023) Aparajita's prior work, "Approximate Graph Laplacians for Multimodal Data Clustering," published in IEEE Transactions on Pattern Analysis and Machine Intelligence, was selected for the Indian Science Congress Association Young Scientists Award Programme 2023 under the Section of Information and Communication Science & Technology. The work introduces a novel method of eigenvector approximation of a graph to de-noise and integrate multi-source information. This machine learning approach to information fusion has wide-ranging applications, including multi-omics tumor subtype identification, social network community detection, and image analysis.

The Han Lab Awarded New R01 funding from the National Cancer Institute

(7/1/2023) Dr. Han received new, five-year R01 funding from the National Cancer Institute (1R01CA282793, PI Han), titled "Integrating Multiple Electronic Health Records Systems to Improve Lung Cancer Outcomes". This project aims to develop a shared database for lung cancer (i.e., Oncoshare-Lung) by integrating multiple electronic health record systems from community-based and academic healthcare systems. It also aims to provide a set of clinical decision tools for efficiently managing lung cancer survivors by developing a novel statistical framework and applying a robust causal inference method to evaluate efficient screening criteria for lung cancer survivors.

The Han Lab Is Hiring a Postdoctoral Fellow for Biostatistics

(2/17/2023) Applications are invited for a postdoctoral fellow position to join Dr. Summer Han’s research group in the Stanford Center for Biomedical Informatics Research at Stanford University. This position emphasizes developing and applying novel statistical methods for analyzing large electronic health records (EHR) data for patients with cancer or neurological diseases (e.g., Alzheimer’s disease). Specific areas of interest include but are not limited to (1) machine learning methods for analyzing time-to-event outcomes under competing risks, (2) dynamic risk modeling for high-dimensional survival data using longitudinal features, (3) addressing selection and representativeness bias in EHRs, (4) causal inference methods for handling bias due to dependent censoring or competing risks, (5) missing data imputation for longitudinal EHR data, and (6) target trial emulation methods using EHRs. Please click here for more information.

Risk Model-Based Lung Cancer Screening More Cost-Effective Than USPSTF Recs

(2/7/2023) The collaborative work with the CISNET Lung Group has been published in the Annals of Internal Medicine. This work demonstrates that risk model-based lung cancer screening is more cost-effective than the national lung cancer screening guidelines recommended by the U.S. Preventive Services Task Force (USPSTF). Please see the media coverage by MedPage featuring this finding.

Predictive Model to Guide Brain Magnetic Resonance Imaging Surveillance in Patients With Metastatic Lung Cancer: Impact on Real-World Outcomes

(10/17/2022) The paper led by Julie and Vicki was published in JCO Precision Oncology! We developed a machine learning–based clinico-genomic prediction model to estimate patient-level brain metastasis risk using data on tumor characteristics, treatment history, demographics, and tumor sequencing from a broad-based next-generation sequencing panel. The web-based tool for predicting the risk of brain metastasis among metastatic lung cancer patients called Risk Assessment for Metastasis to Brain Outcome (RAMBO) is also available now.

Graph Based Approach for Cell Type Annotation from Single-Cell RNA-seq Data

(8/25/2022) Aparajita's poster titled "Graph Based Approach for Cell Type Annotation from Single-Cell RNA-seq Data" was selected as a Reviewers’ Choice recipient at ASHG 2022. She will also present this poster at the Stanford Bio-X Interdisciplinary Initiatives Seed Grants Poster Session this Friday (August 26, 2022) at Clark Center Courtyard. Please stop by if you're around!

QSU Investigator Summer Han Collaborates with Clinicians to Attack Second Primary Lung Cancer

As a principal investigator running her own lab, Dr. Han relies on the multidisciplinary strengths of biostatisticians, epidemiologists, and medical doctors on her team as she pursues studies with practical ramifications. As an example, Han had developed a mathematical model that she wanted to convert into a web-based risk assessment tool to aid clinical decision making for lung cancer patients and survivors. Results to date led to the publication of "Development and Validation of a Risk Prediction Model for Second Primary Lung Cancer" in the July 13, 2021 issue of the Journal of the National Cancer Institute.

Sophia Luo received the Best Student-Contributed Poster award at JSM 2021

(August 13, 2021) Sophia Luo (undergraduate and co-term master's student in Clinical Epidemiology) received the Best Student-Contributed Poster award from the American Statistical Association's Statistical Consulting Section at the Joint Statistical Meetings 2021. The presentation is titled "Statistical Collaboration for Competing Risk Analysis: Smoking cessation and the risk of Second Primary Lung Cancer." In this study, she showed that the Kaplan-Meier estimator overestimates the risk of second primary lung cancer (SPLC) compared to the Aalen-Johansen estimator because of the severe competing risk of death. The study also showed that smoking cessation following the initial diagnosis of lung cancer significantly reduces the risk of developing subsequent malignancies in the lungs.
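The reason the Kaplan-Meier approach overestimates risk under competing events can be seen in a small Python sketch. This is an illustration on made-up data, not the analysis from the poster; the function name and the toy follow-up times are hypothetical. The naive estimate (one minus Kaplan-Meier) treats a competing event such as death as censoring, while the Aalen-Johansen estimate weights each event of interest by the probability of still being free of any event.

```python
from collections import Counter


def cumulative_incidence(times, events, cause=1):
    """Return (naive 1 - Kaplan-Meier, Aalen-Johansen) cumulative incidence of
    `cause` at the end of follow-up.

    events: 0 = censored, any other integer = event type.
    """
    leaving = Counter()       # subjects leaving the risk set at each time
    cause_counts = Counter()  # events of the cause of interest at each time
    any_counts = Counter()    # events of any cause at each time
    for t, e in zip(times, events):
        leaving[t] += 1
        if e != 0:
            any_counts[t] += 1
        if e == cause:
            cause_counts[t] += 1

    n_at_risk = len(times)
    km_surv = 1.0       # naive KM: competing events are treated as censoring
    overall_surv = 1.0  # probability of being free of ANY event just before t
    aj_cif = 0.0
    for t in sorted(leaving):
        d_cause = cause_counts[t]
        d_any = any_counts[t]
        aj_cif += overall_surv * d_cause / n_at_risk
        km_surv *= 1 - d_cause / n_at_risk
        overall_surv *= 1 - d_any / n_at_risk
        n_at_risk -= leaving[t]
    return 1 - km_surv, aj_cif


# Toy data (made up): 1 = SPLC, 2 = death before SPLC, 0 = censored.
times = [1, 2, 3, 4, 5, 6]
events = [2, 1, 2, 1, 0, 1]
naive, aj = cumulative_incidence(times, events, cause=1)
print(f"1 - KM: {naive:.3f}, Aalen-Johansen: {aj:.3f}")  # 1 - KM: 1.000, Aalen-Johansen: 0.667
```

When there are no competing events, the two estimates coincide; the gap grows as the competing risk becomes more frequent.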

Second Primary Lung Cancer Risk Assessment Tool (SPLC-RAT) Available

(July 29, 2021) SPLC-RAT is a web-based risk prediction tool for second primary lung cancer (SPLC) among lung cancer patients. After a user enters a patient’s demographic data, tumor characteristics, and smoking history, the tool provides estimates of the 5-year and 10-year risks of developing SPLC after the initial diagnosis of lung cancer.

Risk Prediction Model for Second Primary Lung Cancer

(July 25, 2021) We developed a prediction model for second primary lung cancer (SPLC) risk among ever-smoking lung cancer patients that integrates various risk factors for SPLC, utilizing large population-based cohort data. The proposed model was built using various competing risk modeling approaches and validated using two external populations that are heterogeneous with regard to smoking history and race/ethnicity. This study provides the first effort to incorporate comprehensive risk factors, including smoking information, medical history, treatment, and tumor characteristics, in predicting SPLC risk using large population-based data. We implemented the proposed model in a user-friendly web-based tool, SPLC-RAT, that can help patients and clinicians make decisions about surveillance and screening for initial primary lung cancer (IPLC) cases. This work has been published in the Journal of the National Cancer Institute.

Postdocs/Research Assistant for Bioinformatics

July 22, 2021: Applications are invited for postdoctoral fellow positions to join Dr. Summer Han’s research group in the Stanford Center for Biomedical Informatics Research at Stanford University. These positions emphasize developing and applying statistical and machine learning methods for analyzing genomic data (RNA-seq, DNA methylation, and single-cell RNA-seq data) and for building prediction models for various cancers using time-to-event outcomes.

LDCT Screening for Lung Cancer Reduces Risk of Brain Metastases

July 6, 2021: We leveraged data from a large, population-based study – the National Lung Screening Trial (NLST) – to identify a cohort of lung cancer patients with information on metastasis after cancer diagnosis. We found that patients whose lung cancers were diagnosed by screening with low-dose CT (LDCT) imaging were at a significantly reduced risk for brain metastasis compared to those who were diagnosed by other methods (including X-ray screening or clinical symptoms). Our further analyses among those detected in early stages (Stage I-III) and those who underwent surgery for lung cancer treatment show that this reduction in risk may not be fully explained by early detection or curative surgery. To the best of our knowledge, our study is the first effort to show the subsequent impact of lung cancer screening on metastasis to other organs. This work further affirms the importance of screening for lung cancer.

The work has been featured in the press on EurekAlert! by the American Association for the Advancement of Science (AAAS), on Medical Xpress, and on Twitter by various major academic organizations, including the National Cancer Institute (NCI), the International Association for the Study of Lung Cancer (IASLC), and the Journal of Thoracic Oncology (JTO).

Gene-Environment Interaction using an Empirical Bayes-Type Estimator

May 2, 2021: We developed a novel statistical method to evaluate gene-gene or gene-environment interaction using genome-wide association study (GWAS) data. This effort provides a robust test for additive interaction under the trend effect of genotype, applying an empirical Bayes-type shrinkage estimator of the relative excess risk due to interaction. The application of the proposed method to examine SNP by APOE*4 interaction for late-onset Alzheimer’s disease demonstrated a substantial improvement in controlling false positives over the existing method when the assumption of gene-gene independence was violated. To the best of our knowledge, the proposed method is the first approach that evaluates additive gene-gene or gene-environment interaction under the trend effect of genotype by combining the traditional prospective likelihood-based and the retrospective likelihood-based estimators to relax the strict gene-environment independence assumption. This method is implemented in the CGEN Bioconductor package.
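For readers unfamiliar with the quantity being tested, the relative excess risk due to interaction (RERI) has a simple standard definition, sketched below in Python. This shows only the textbook formula, not the empirical Bayes-type shrinkage estimator or the CGEN implementation; the function and parameter names are hypothetical, and in case-control data the relative risks are usually approximated by odds ratios.

```python
def reri(rr_gene_only, rr_exposure_only, rr_both):
    """Relative excess risk due to interaction on the additive scale.

    Each argument is a relative risk compared with subjects who carry neither
    the genetic variant nor the exposure. RERI = 0 means the joint effect is
    exactly additive; RERI > 0 means it exceeds additivity.
    """
    return rr_both - rr_gene_only - rr_exposure_only + 1.0


# Made-up relative risks for illustration only.
print(reri(2.0, 3.0, 7.0))  # 3.0: joint effect exceeds the sum of separate effects
print(reri(2.0, 3.0, 4.0))  # 0.0: exactly additive, no additive interaction
```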

NCI R01 funded for the project, "Evaluation of genetic, clinical, and environmental risk factors to establish effective screening strategies for second primary lung cancer"  (PI: Han, 1R01CA226081) (2018)

Our work on prediction modeling for second primary lung cancer was published in the Journal of Clinical Oncology (2017)

We developed a novel statistical method to identify genetic associations by integrating a class of disease risk models for the joint effects of genetic and environmental risk factors; the work appeared in Biometrics (2015)