People: Principal Investigator

Associate Professor of Medicine (Biomedical Informatics), of Biomedical Data Science, of Surgery and, by courtesy, of Epidemiology and Population Health


Dr. Hernandez-Boussard is an Associate Professor at Stanford University in Medicine (Biomedical Informatics), Biomedical Data Sciences, Surgery and Epidemiology & Population Health (by courtesy). Her background and expertise are in the field of biomedical informatics, health services research, and epidemiology. In her current work, Dr. Hernandez-Boussard develops and evaluates AI technology to accurately and efficiently monitor, measure, and predict healthcare outcomes. She has developed the infrastructure to efficiently capture heterogenous data sources, transform these diverse data to knowledge, and use this knowledge to improve patient outcomes, healthcare delivery, and guide policy.


  • Learning from Past Respiratory Failure Patients to Triage COVID-19 Patient Ventilator Needs: A Multi-Institutional Study. Journal of biomedical informatics Carmichael, H., Coquet, J., Sun, R., Sang, S., Groat, D., Asch, S. M., Bledsoe, J., Peltan, I. D., Jacobs, J. R., Hernandez-Boussard, T. 2021: 103802


    BACKGROUND: Unlike well-established diseases that base clinical care on randomized trials, past experiences, and training, prognosis in COVID19 relies on a weaker foundation. Knowledge from other respiratory failure diseases may inform clinical decisions in this novel disease. The objective was to predict 48-hour invasive mechanical ventilation (IMV) within 48 hours in patients hospitalized with COVID-19 using COVID-like diseases (CLD).METHODS: This retrospective multicenter study trained machine learning (ML) models on patients hospitalized with CLD to predict IMV within 48 hours in COVID-19 patients. CLD patients were identified using diagnosis codes for bacterial pneumonia, viral pneumonia, influenza, unspecified pneumonia and acute respiratory distress syndrome (ARDS), 2008-2019. A total of 16 cohorts were constructed, including any combinations of the four diseases plus an exploratory ARDS cohort, to determine the most appropriate cohort to use. Candidate predictors included demographic and clinical parameters that were previously associated with poor COVID-19 outcomes. Model development included the implementation of logistic regression and three ensemble tree-based algorithms: decision tree, AdaBoost, and XGBoost. Models were validated in hospitalized COVID-19 patients at two healthcare systems, March 2020-July 2020. ML models were trained on CLD patients at Stanford Hospital Alliance (SHA). Models were validated on hospitalized COVID-19 patients at both SHA and Intermountain Healthcare.RESULTS: CLD training data were obtained from SHA (n=14,030), and validation data included 444 adult COVID-19 hospitalized patients from SHA (n=185) and Intermountain (n=259). XGBoost was the top-performing ML model, and among the 16 CLD training cohorts, the best model achieved an area under curve (AUC) of 0.883 in the validation set. In COVID-19 patients, the prediction models exhibited moderate discrimination performance, with the best models achieving an AUC of 0.77 at SHA and 0.65 at Intermountain. The model trained on all pneumonia and influenza cohorts had the best overall performance (SHA: positive predictive value (PPV) 0.29, negative predictive value (NPV) 0.97, positive likelihood ratio (PLR) 10.7; Intermountain: PPV, 0.23, NPV 0.97, PLR 10.3). We identified important factors associated with IMV that are not traditionally considered for respiratory diseases.CONCLUSIONS: The performance of prediction models derived from CLD for 48-hour IMV in patients hospitalized with COVID-19 demonstrate high specificity and can be used as a triage tool at point of care. Novel predictors of IMV identified in COVID-19 are often overlooked in clinical practice. Lessons learned from our approach may assist other research institutes seeking to build artificial intelligence technologies for novel or rare diseases with limited data for training and validation.

    View details for DOI 10.1016/j.jbi.2021.103802

    View details for PubMedID 33965640

  • Assessment of a Clinical Trial-Derived Survival Model in Patients With Metastatic Castration-Resistant Prostate Cancer. JAMA network open Coquet, J. n., Bievre, N. n., Billaut, V. n., Seneviratne, M. n., Magnani, C. J., Bozkurt, S. n., Brooks, J. D., Hernandez-Boussard, T. n. 2021; 4 (1): e2031730


    Randomized clinical trials (RCTs) are considered the criterion standard for clinical evidence. Despite their many benefits, RCTs have limitations, such as costliness, that may reduce the generalizability of their findings among diverse populations and routine care settings.To assess the performance of an RCT-derived prognostic model that predicts survival among patients with metastatic castration-resistant prostate cancer (CRPC) when the model is applied to real-world data from electronic health records (EHRs).The RCT-trained model and patient data from the RCTs were obtained from the Dialogue for Reverse Engineering Assessments and Methods (DREAM) challenge for prostate cancer, which occurred from March 16 to July 27, 2015. This challenge included 4 phase 3 clinical trials of patients with metastatic CRPC. Real-world data were obtained from the EHRs of a tertiary care academic medical center that includes a comprehensive cancer center. In this study, the DREAM challenge RCT-trained model was applied to real-world data from January 1, 2008, to December 31, 2019; the model was then retrained using EHR data with optimized feature selection. Patients with metastatic CRPC were divided into RCT and EHR cohorts based on data source. Data were analyzed from March 23, 2018, to October 22, 2020.Patients who received treatment for metastatic CRPC.The primary outcome was the performance of an RCT-derived prognostic model that predicts survival among patients with metastatic CRPC when the model is applied to real-world data. Model performance was compared using 10-fold cross-validation according to time-dependent integrated area under the curve (iAUC) statistics.Among 2113 participants with metastatic CRPC, 1600 participants were included in the RCT cohort, and 513 participants were included in the EHR cohort. The RCT cohort comprised a larger proportion of White participants (1390 patients [86.9%] vs 337 patients [65.7%]) and a smaller proportion of Hispanic participants (14 patients [0.9%] vs 42 patients [8.2%]), Asian participants (41 patients [2.6%] vs 88 patients [17.2%]), and participants older than 75 years (388 patients [24.3%] vs 191 patients [37.2%]) compared with the EHR cohort. Participants in the RCT cohort also had fewer comorbidities (mean [SD], 1.6 [1.8] comorbidities vs 2.5 [2.6] comorbidities, respectively) compared with those in the EHR cohort. Of the 101 variables used in the RCT-derived model, 10 were not available in the EHR data set, 3 of which were among the top 10 features in the DREAM challenge RCT model. The best-performing EHR-trained model included only 25 of the 101 variables included in the RCT-trained model. The performance of the RCT-trained and EHR-trained models was adequate in the EHR cohort (mean [SD] iAUC, 0.722 [0.118] and 0.762 [0.106], respectively); model optimization was associated with improved performance of the best-performing EHR model (mean [SD] iAUC, 0.792 [0.097]). The EHR-trained model classified 256 patients as having a high risk of mortality and 256 patients as having a low risk of mortality (hazard ratio, 2.7; 95% CI, 2.0-3.7; log-rank P < .001).In this study, although the RCT-trained models did not perform well when applied to real-world EHR data, retraining the models using real-world EHR data and optimizing variable selection was beneficial for model performance. As clinical evidence evolves to include more real-world data, both industry and academia will likely search for ways to balance model optimization with generalizability. This study provides a pragmatic approach to applying RCT-trained models to real-world data.

    View details for DOI 10.1001/jamanetworkopen.2020.31730

    View details for PubMedID 33481032

  • Bias at Warp Speed: How AI may Contribute to the Disparities Gap in the Time of COVID-19. Journal of the American Medical Informatics Association : JAMIA Roosli, E., Rice, B., Hernandez-Boussard, T. 2020


    The COVID-19 pandemic is presenting a disproportionate impact on minorities in terms of infection rate, hospitalizations and mortality. Many believe Artificial Intelligence (AI) is a solution to guide clinical decision making for this novel disease, resulting in the rapid dissemination of under-developed and potentially biased models, which may exacerbate the disparities gap. We believe there is an urgent need to enforce the systematic use of reporting standards and develop regulatory frameworks for a shared COVID-19 data source to address the challenges of bias in AI during this pandemic. There is hope that AI can help guide treatment decisions within this crisis yet given the pervasiveness of biases, a failure to proactively develop comprehensive mitigation strategies during the COVID-19 pandemic risks exacerbating existing health disparities.

    View details for DOI 10.1093/jamia/ocaa210

    View details for PubMedID 32805004

  • MINIMAR (MINimum Information for Medical AI Reporting): Developing reporting standards for artificial intelligence in health care. Journal of the American Medical Informatics Association : JAMIA Hernandez-Boussard, T., Bozkurt, S., Ioannidis, J. P., Shah, N. H. 2020


    The rise of digital data and computing power have contributed to significant advancements in artificial intelligence (AI), leading to the use of classification and prediction models in health care to enhance clinical decision-making for diagnosis, treatment and prognosis. However, such advances are limited by the lack of reporting standards for the data used to develop those models, the model architecture, and the model evaluation and validation processes. Here, we present MINIMAR (MINimum Information for Medical AI Reporting), a proposal describing the minimum information necessary to understand intended predictions, target populations, and hidden biases, and the ability to generalize these emerging technologies. We call for a standard to accurately and responsibly report on AI in health care. This will facilitate the design and implementation of these models and promote the development and use of associated clinical decision support tools, as well as manage concerns regarding accuracy and bias.

    View details for DOI 10.1093/jamia/ocaa088

    View details for PubMedID 32594179

Academic Appointments

Associate Professor, Medicine - Biomedical Informatics Research 

Associate Professor, Biomedical Data Science

Associate Professor, Surgery - General Surgery

Member, Stanford Cancer Institute

Professional Education

M.S., Stanford University, Health Services Research (2013)

Ph.D., University Claude Bernard, Lyon 1, Computational Biology (1999)

M.P.H., Yale University, Epidemiology (1993)

B.A., University California, Irvine, Psychology (1991)

B.S., University of California, Irvine, Biology (1991)