Assessment of a Clinical Trial-Derived Survival Model in Patients With Metastatic Castration-Resistant Prostate Cancer.
JAMA Network Open
2021; 4 (1): e2031730
Randomized clinical trials (RCTs) are considered the criterion standard for clinical evidence. Despite their many benefits, RCTs have limitations, such as costliness, that may reduce the generalizability of their findings among diverse populations and routine care settings.

To assess the performance of an RCT-derived prognostic model that predicts survival among patients with metastatic castration-resistant prostate cancer (CRPC) when the model is applied to real-world data from electronic health records (EHRs).

The RCT-trained model and patient data from the RCTs were obtained from the Dialogue for Reverse Engineering Assessments and Methods (DREAM) challenge for prostate cancer, which occurred from March 16 to July 27, 2015. This challenge included 4 phase 3 clinical trials of patients with metastatic CRPC. Real-world data were obtained from the EHRs of a tertiary care academic medical center that includes a comprehensive cancer center. In this study, the DREAM challenge RCT-trained model was applied to real-world data from January 1, 2008, to December 31, 2019; the model was then retrained using EHR data with optimized feature selection. Patients with metastatic CRPC were divided into RCT and EHR cohorts based on data source. Data were analyzed from March 23, 2018, to October 22, 2020.

Patients who received treatment for metastatic CRPC.

The primary outcome was the performance of an RCT-derived prognostic model that predicts survival among patients with metastatic CRPC when the model is applied to real-world data. Model performance was compared using 10-fold cross-validation according to time-dependent integrated area under the curve (iAUC) statistics.

Among 2113 participants with metastatic CRPC, 1600 participants were included in the RCT cohort, and 513 participants were included in the EHR cohort.
The RCT cohort comprised a larger proportion of White participants (1390 patients [86.9%] vs 337 patients [65.7%]) and a smaller proportion of Hispanic participants (14 patients [0.9%] vs 42 patients [8.2%]), Asian participants (41 patients [2.6%] vs 88 patients [17.2%]), and participants older than 75 years (388 patients [24.3%] vs 191 patients [37.2%]) compared with the EHR cohort. Participants in the RCT cohort also had fewer comorbidities (mean [SD], 1.6 [1.8] comorbidities vs 2.5 [2.6] comorbidities, respectively) compared with those in the EHR cohort. Of the 101 variables used in the RCT-derived model, 10 were not available in the EHR data set, 3 of which were among the top 10 features in the DREAM challenge RCT model. The best-performing EHR-trained model included only 25 of the 101 variables included in the RCT-trained model. The performance of the RCT-trained and EHR-trained models was adequate in the EHR cohort (mean [SD] iAUC, 0.722 [0.118] and 0.762 [0.106], respectively); model optimization was associated with improved performance of the best-performing EHR model (mean [SD] iAUC, 0.792 [0.097]). The EHR-trained model classified 256 patients as having a high risk of mortality and 256 patients as having a low risk of mortality (hazard ratio, 2.7; 95% CI, 2.0-3.7; log-rank P < .001).

In this study, although the RCT-trained models did not perform well when applied to real-world EHR data, retraining the models using real-world EHR data and optimizing variable selection was beneficial for model performance. As clinical evidence evolves to include more real-world data, both industry and academia will likely search for ways to balance model optimization with generalizability. This study provides a pragmatic approach to applying RCT-trained models to real-world data.
View details for DOI 10.1001/jamanetworkopen.2020.31730
View details for PubMedID 33481032
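The iAUC statistic reported above integrates the time-dependent AUC over a range of horizons. As an illustrative sketch (not the authors' code), the building block at a single horizon t can be computed in pure Python: cases are patients who died by t, controls are patients still at risk past t, and the AUC is the fraction of case-control pairs ranked correctly by the model's risk score. Patients censored before t are simply excluded here; the published iAUC handles censoring with inverse-probability weighting instead.

```python
# Illustrative sketch: cumulative/dynamic time-dependent AUC at one horizon.
# Not the study's implementation; censored-before-t patients are dropped
# for simplicity rather than reweighted.

def dynamic_auc(times, events, risk_scores, t):
    """Time-dependent AUC at horizon t.

    times:       observed follow-up times
    events:      1 if death observed, 0 if censored
    risk_scores: higher score = higher predicted mortality risk
    """
    cases = [s for tm, e, s in zip(times, events, risk_scores) if e == 1 and tm <= t]
    controls = [s for tm, e, s in zip(times, events, risk_scores) if tm > t]
    if not cases or not controls:
        raise ValueError("need at least one case and one control at horizon t")
    # Count case-control pairs where the case received the higher risk score
    # (ties count as half).
    concordant = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return concordant / (len(cases) * len(controls))

# A perfectly discriminating score yields AUC = 1.0 at any horizon:
times  = [2, 5, 8, 12, 20]
events = [1, 1, 0, 1, 0]
risks  = [0.9, 0.8, 0.3, 0.6, 0.1]
print(dynamic_auc(times, events, risks, t=10))  # → 1.0
```

Averaging this quantity over a grid of horizons, weighted appropriately, gives an iAUC of the kind the study cross-validates.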
Learning from past respiratory infections to predict COVID-19 Outcomes: A retrospective study.
Journal of Medical Internet Research
In the clinical care of well-established diseases, randomized trials, literature, and research are supplemented by clinical judgment to understand disease prognosis and inform treatment choices. In the void created by a lack of clinical experience with COVID-19, artificial intelligence (AI) may be an important tool to bolster clinical judgment and decision making. However, a lack of clinical data restricts the design and development of such AI tools, particularly in preparation for an impending crisis or pandemic.

This study aimed to develop and test the feasibility of a 'patients-like-me' framework to predict COVID-19 patient deterioration using a retrospective cohort of similar respiratory diseases.

Our framework used COVID-like cohorts to design and train AI models that were then validated on the COVID-19 population. The COVID-like cohorts included patients diagnosed with bacterial pneumonia, viral pneumonia, unspecified pneumonia, influenza, and acute respiratory distress syndrome (ARDS) from an academic medical center, 2008-2019. Fifteen training cohorts were created using different combinations of the COVID-like cohorts with the ARDS cohort for exploratory purposes. Two machine learning (ML) models were developed, one to predict invasive mechanical ventilation (IMV) within 48 hours for each hospitalized day, and one to predict all-cause mortality at the time of admission. Model performance was assessed using the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). We established model interpretability by calculating SHapley Additive exPlanations (SHAP) scores to identify important features.

Compared to the COVID-like cohorts (n=16,509), the COVID-19 hospitalized patients (n=159) were significantly younger, with a higher proportion of Hispanic ethnicity, a lower proportion of smoking history, and fewer comorbidities (P < 0.001).
COVID-19 patients had a lower IMV rate (15.1% vs 23.2%, P = 0.016) and shorter time to IMV (2.9 vs 4.1, P < 0.001) compared to the COVID-like patients. In the COVID-like training data, the top models achieved excellent performance (AUC > 0.90). Validating in the COVID-19 cohort, the best-performing model for predicting IMV was the XGBoost model (AUC: 0.826) trained on the viral pneumonia cohort. Similarly, the XGBoost model trained on all four COVID-like cohorts without ARDS achieved the best performance (AUC: 0.928) in predicting mortality. Important predictors included demographic information (age), vital signs (oxygen saturation), and laboratory values (white blood cell count, cardiac troponin, albumin, etc.). Our models suffered from class imbalance, which resulted in high negative predictive values and low positive predictive values.

We provided a feasible framework for modeling patient deterioration using existing data and AI technology to address data limitations during the onset of a novel, rapidly changing pandemic.
View details for DOI 10.2196/23026
View details for PubMedID 33534724
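The class-imbalance effect the abstract describes (high NPV, low PPV) follows directly from Bayes' rule: at a fixed sensitivity and specificity, a rare outcome inflates NPV and deflates PPV. A minimal sketch, not the study's code, with illustrative numbers chosen here:

```python
# Minimal sketch of the confusion-matrix metrics quoted above, showing
# how outcome prevalence alone moves PPV and NPV for a fixed classifier.

def ppv_npv(sensitivity, specificity, prevalence):
    """PPV and NPV from Bayes' rule given test characteristics."""
    tp = sensitivity * prevalence            # true positives (per patient)
    fp = (1 - specificity) * (1 - prevalence)  # false positives
    tn = specificity * (1 - prevalence)      # true negatives
    fn = (1 - sensitivity) * prevalence      # false negatives
    return tp / (tp + fp), tn / (tn + fn)

# The same classifier (90% sensitive, 90% specific) at two outcome rates:
common = ppv_npv(0.9, 0.9, 0.50)  # balanced classes: PPV = NPV = 0.90
rare   = ppv_npv(0.9, 0.9, 0.05)  # 5% event rate, as in imbalanced cohorts
print(common)  # → (0.9, 0.9)
print(rare)    # PPV drops to ~0.32 while NPV rises above 0.99
```

This is why the study's models report high NPV and low PPV without any change in discrimination.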
Bias at Warp Speed: How AI may Contribute to the Disparities Gap in the Time of COVID-19.
Journal of the American Medical Informatics Association: JAMIA
The COVID-19 pandemic is having a disproportionate impact on minorities in terms of infection rates, hospitalizations, and mortality. Many believe artificial intelligence (AI) is a solution to guide clinical decision making for this novel disease, resulting in the rapid dissemination of underdeveloped and potentially biased models, which may exacerbate the disparities gap. We believe there is an urgent need to enforce the systematic use of reporting standards and develop regulatory frameworks for a shared COVID-19 data source to address the challenges of bias in AI during this pandemic. There is hope that AI can help guide treatment decisions during this crisis; yet, given the pervasiveness of biases, a failure to proactively develop comprehensive mitigation strategies during the COVID-19 pandemic risks exacerbating existing health disparities.
View details for DOI 10.1093/jamia/ocaa210
View details for PubMedID 32805004
MINIMAR (MINimum Information for Medical AI Reporting): Developing reporting standards for artificial intelligence in health care.
Journal of the American Medical Informatics Association: JAMIA
The rise of digital data and computing power has contributed to significant advancements in artificial intelligence (AI), leading to the use of classification and prediction models in health care to enhance clinical decision-making for diagnosis, treatment, and prognosis. However, such advances are limited by the lack of reporting standards for the data used to develop those models, the model architecture, and the model evaluation and validation processes. Here, we present MINIMAR (MINimum Information for Medical AI Reporting), a proposal describing the minimum information necessary to understand intended predictions, target populations, hidden biases, and the generalizability of these emerging technologies. We call for a standard to accurately and responsibly report on AI in health care. This will facilitate the design and implementation of these models and promote the development and use of associated clinical decision support tools, as well as manage concerns regarding accuracy and bias.
View details for DOI 10.1093/jamia/ocaa088
View details for PubMedID 32594179
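MINIMAR organizes its reporting items into four categories: the study population and setting, patient demographics, the model architecture, and the model evaluation. As a purely hypothetical illustration (the field names below are this sketch's own, not the published standard's item list), such a standard lends itself to a machine-readable report stub that can flag incomplete reporting:

```python
# Hypothetical sketch of a machine-readable reporting stub structured
# around MINIMAR's four categories. Field names and example values are
# illustrative only, not part of the published standard.
from dataclasses import dataclass, field, asdict

@dataclass
class MinimarReport:
    population_and_setting: dict = field(default_factory=dict)
    patient_demographics: dict = field(default_factory=dict)
    model_architecture: dict = field(default_factory=dict)
    model_evaluation: dict = field(default_factory=dict)

    def missing_sections(self):
        """Return the category names left empty, flagging incomplete reporting."""
        return [name for name, value in asdict(self).items() if not value]

report = MinimarReport(
    population_and_setting={"source": "tertiary-care EHR", "years": "2008-2019"},
    model_architecture={"model": "gradient-boosted trees", "features": 25},
)
print(report.missing_sections())  # → ['patient_demographics', 'model_evaluation']
```

A stub like this makes gaps explicit at submission time rather than leaving them for readers to discover.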
M.S., Stanford University, Health Services Research (2013)
Ph.D., University Claude Bernard, Lyon 1, Computational Biology (1999)
M.P.H., Yale University, Epidemiology (1993)
B.A., University of California, Irvine, Psychology (1991)
B.S., University of California, Irvine, Biology (1991)