Peeking into a black box, the fairness and generalizability of a MIMIC-III benchmarking model
2022; 9 (1): 24
As artificial intelligence (AI) makes continuous progress to improve quality of care for some patients by leveraging ever increasing amounts of digital health data, others are left behind. Empirical evaluation studies are required to keep biased AI models from reinforcing systemic health disparities faced by minority populations through dangerous feedback loops. The aim of this study is to raise broad awareness of the pervasive challenges around bias and fairness in risk prediction models. We performed a case study on a MIMIC-trained benchmarking model using a broadly applicable fairness and generalizability assessment framework. While open-science benchmarks are crucial to overcome many study limitations today, this case study revealed a strong class imbalance problem as well as fairness concerns for Black and publicly insured ICU patients. Therefore, we advocate for the widespread use of comprehensive fairness and performance assessment frameworks to effectively monitor and validate benchmark pipelines built on open data resources.
View details for DOI 10.1038/s41597-021-01110-7
View details for Web of Science ID 000746595100001
View details for PubMedID 35075160
- Digital twins for predictive oncology will be a paradigm shift for precision cancer care. Nature medicine 2021
Assessment of a Clinical Trial-Derived Survival Model in Patients With Metastatic Castration-Resistant Prostate Cancer.
JAMA network open
2021; 4 (1): e2031730
Randomized clinical trials (RCTs) are considered the criterion standard for clinical evidence. Despite their many benefits, RCTs have limitations, such as costliness, that may reduce the generalizability of their findings among diverse populations and routine care settings. To assess the performance of an RCT-derived prognostic model that predicts survival among patients with metastatic castration-resistant prostate cancer (CRPC) when the model is applied to real-world data from electronic health records (EHRs). The RCT-trained model and patient data from the RCTs were obtained from the Dialogue for Reverse Engineering Assessments and Methods (DREAM) challenge for prostate cancer, which occurred from March 16 to July 27, 2015. This challenge included 4 phase 3 clinical trials of patients with metastatic CRPC. Real-world data were obtained from the EHRs of a tertiary care academic medical center that includes a comprehensive cancer center. In this study, the DREAM challenge RCT-trained model was applied to real-world data from January 1, 2008, to December 31, 2019; the model was then retrained using EHR data with optimized feature selection. Patients with metastatic CRPC were divided into RCT and EHR cohorts based on data source. Data were analyzed from March 23, 2018, to October 22, 2020. Patients who received treatment for metastatic CRPC. The primary outcome was the performance of an RCT-derived prognostic model that predicts survival among patients with metastatic CRPC when the model is applied to real-world data. Model performance was compared using 10-fold cross-validation according to time-dependent integrated area under the curve (iAUC) statistics. Among 2113 participants with metastatic CRPC, 1600 participants were included in the RCT cohort, and 513 participants were included in the EHR cohort.
The RCT cohort comprised a larger proportion of White participants (1390 patients [86.9%] vs 337 patients [65.7%]) and a smaller proportion of Hispanic participants (14 patients [0.9%] vs 42 patients [8.2%]), Asian participants (41 patients [2.6%] vs 88 patients [17.2%]), and participants older than 75 years (388 patients [24.3%] vs 191 patients [37.2%]) compared with the EHR cohort. Participants in the RCT cohort also had fewer comorbidities (mean [SD], 1.6 [1.8] comorbidities vs 2.5 [2.6] comorbidities, respectively) compared with those in the EHR cohort. Of the 101 variables used in the RCT-derived model, 10 were not available in the EHR data set, 3 of which were among the top 10 features in the DREAM challenge RCT model. The best-performing EHR-trained model included only 25 of the 101 variables included in the RCT-trained model. The performance of the RCT-trained and EHR-trained models was adequate in the EHR cohort (mean [SD] iAUC, 0.722 [0.118] and 0.762 [0.106], respectively); model optimization was associated with improved performance of the best-performing EHR model (mean [SD] iAUC, 0.792 [0.097]). The EHR-trained model classified 256 patients as having a high risk of mortality and 256 patients as having a low risk of mortality (hazard ratio, 2.7; 95% CI, 2.0-3.7; log-rank P < .001). In this study, although the RCT-trained models did not perform well when applied to real-world EHR data, retraining the models using real-world EHR data and optimizing variable selection was beneficial for model performance. As clinical evidence evolves to include more real-world data, both industry and academia will likely search for ways to balance model optimization with generalizability. This study provides a pragmatic approach to applying RCT-trained models to real-world data.
View details for DOI 10.1001/jamanetworkopen.2020.31730
View details for PubMedID 33481032
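The abstract above reports model performance as a mean and SD across 10-fold cross-validation. The following is a minimal sketch of that bookkeeping only, not the study's pipeline: the function names `kfold_indices` and `summarize` are hypothetical, the use of the sample standard deviation is an assumption, and the per-fold metric (iAUC in the study) is left to the caller.

```python
import random
import statistics


def kfold_indices(n, k=10, seed=0):
    """Yield (train, test) index lists for k folds of near-equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # shuffle once, reproducibly
    folds = [idx[i::k] for i in range(k)]  # fold sizes differ by at most 1
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, test


def summarize(per_fold_scores):
    """Return (mean, sample SD) of a metric computed once per fold."""
    return statistics.mean(per_fold_scores), statistics.stdev(per_fold_scores)


if __name__ == "__main__":
    # 513 is the EHR cohort size given in the abstract.
    sizes = [len(test) for _, test in kfold_indices(513, k=10)]
    print(sizes)
```

Each participant lands in exactly one test fold, so every per-fold score is computed on data the corresponding model did not see during training.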
Machine Learning Applied to Electronic Health Records: Identification of Chemotherapy Patients at High Risk for Preventable Emergency Department Visits and Hospital Admissions.
JCO clinical cancer informatics
2021; 5: 1106-1126
Acute care use (ACU) is a major driver of oncologic costs and is penalized by a Centers for Medicare & Medicaid Services quality measure, OP-35. Targeted interventions reduce preventable ACU; however, identifying which patients might benefit remains challenging. Prior predictive models have made use of a limited subset of the data in the electronic health record (EHR). We aimed to predict risk of preventable ACU after starting chemotherapy using machine learning (ML) algorithms trained on comprehensive EHR data. Chemotherapy patients treated at an academic institution and affiliated community care sites between January 2013 and July 2019 who met inclusion criteria for OP-35 were identified. Preventable ACU was defined using OP-35 criteria. Structured EHR data generated before chemotherapy treatment were obtained. ML models were trained to predict risk for ACU after starting chemotherapy using 80% of the cohort. The remaining 20% were used to test model performance by the area under the receiver operator curve. Eight thousand four hundred thirty-nine patients were included, of whom 35% had preventable ACU within 180 days of starting chemotherapy. Our primary model classified patients at risk for preventable ACU with an area under the receiver operator curve of 0.783 (95% CI, 0.761 to 0.806). Performance was better for identifying admissions than emergency department visits. Key variables included prior hospitalizations, cancer stage, race, laboratory values, and a diagnosis of depression. Analyses showed limited benefit from including patient-reported outcome data and indicated inequities in outcomes and risk modeling for Black and Medicaid patients. Dense EHR data can identify patients at risk for ACU using ML with promising accuracy. These models have potential to improve cancer care outcomes, patient experience, and costs by allowing for targeted, preventative interventions.
View details for DOI 10.1200/CCI.21.00116
View details for PubMedID 34752139
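The evaluation design described in the abstract above (train on 80% of the cohort, score the held-out 20% by area under the receiver operator curve) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: `holdout_split` and `auroc` are hypothetical helper names, the random split and seed are assumptions, and the area is computed with the standard rank-sum (Mann-Whitney) identity.

```python
import random


def holdout_split(n, test_fraction=0.2, seed=0):
    """Return (train, test) index lists from one shuffled split."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_test = int(round(n * test_fraction))
    return idx[n_test:], idx[:n_test]


def auroc(labels, scores):
    """Area under the ROC curve; labels are 0/1, higher score means higher risk."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of tied scores starting at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tied run
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg_rank
        i = j + 1
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


if __name__ == "__main__":
    train, test = holdout_split(10)
    print(len(train), len(test))  # 8 2
    print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The same `auroc` function can be applied separately to subgroups of the held-out set (for example by race or insurance type) to look for the kind of performance gaps the abstract reports.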
M.S., Stanford University, Health Services Research (2013)
Ph.D., University Claude Bernard, Lyon 1, Computational Biology (1999)
M.P.H., Yale University, Epidemiology (1993)
B.A., University of California, Irvine, Psychology (1991)
B.S., University of California, Irvine, Biology (1991)