Clinical Focus

  • Pediatric Radiology

Academic Appointments

Professional Education

  • Board Certification: Diagnostic Radiology, American Board of Radiology (2006)
  • Fellowship:Vanderbilt University- Monroe Carell Children's Hospital at Vanderbilt (2007) TNUnited States of America
  • Fellowship:LPCH (2007) CAUnited States of America
  • Residency:Medical University of South Carolina (2006) SCUnited States of America
  • Residency:Drexel/Hahnemann University (2004) PAUnited States of America
  • Internship:Frankford Hospitals (now Aria Health) (2002) PAUnited States of America
  • Medical Education:University of New England College of Osteopathic Medicine (2001) MEUnited States of America
  • Board Certification: Pediatric Radiology, American Board of Radiology (2008)


All Publications

  • Deep learning to automate Brasfield chest radiographic scoring for cystic fibrosis. Journal of cystic fibrosis : official journal of the European Cystic Fibrosis Society Zucker, E. J., Barnes, Z. A., Lungren, M. P., Shpanskaya, Y., Seekins, J. M., Halabi, S. S., Larson, D. B. 2019


    BACKGROUND: The aim of this study was to evaluate the hypothesis that a deep convolutional neural network (DCNN) model could facilitate automated Brasfield scoring of chest radiographs (CXRs) for patients with cystic fibrosis (CF), performing similarly to a pediatric radiologist.METHODS: All frontal/lateral chest radiographs (2058 exams) performed in CF patients at a single institution from January 2008-2018 were retrospectively identified, and ground-truth Brasfield scoring performed by a board-certified pediatric radiologist. 1858 exams (90.3%) were used to train and validate the DCNN model, while 200 exams (9.7%) were reserved for a test set. Five board-certified pediatric radiologists independently scored the test set according to the Brasfield method. DCNN model vs. radiologist performance was compared using Spearman correlation (rho) as well as mean difference (MD), mean absolute difference (MAD), and root mean squared error (RMSE) estimation.RESULTS: For the total Brasfield score, rho for the model-derived results computed pairwise with each radiologist's scores ranged from 0.79-0.83, compared to 0.85-0.90 for radiologist vs. radiologist scores. The MD between model estimates of the total Brasfield score and the average score of radiologists was -0.09. Based on MD, MAD, and RMSE, the model matched or exceeded radiologist performance for all subfeatures except air-trapping and large lesions.CONCLUSIONS: A DCNN model is promising for predicting CF Brasfield scores with accuracy similar to that of a pediatric radiologist.

    View details for PubMedID 31056440

  • Human-machine partnership with artificial intelligence for chest radiograph diagnosis. NPJ digital medicine Patel, B. N., Rosenberg, L., Willcox, G., Baltaxe, D., Lyons, M., Irvin, J., Rajpurkar, P., Amrhein, T., Gupta, R., Halabi, S., Langlotz, C., Lo, E., Mammarappallil, J., Mariano, A. J., Riley, G., Seekins, J., Shen, L., Zucker, E., Lungren, M. 2019; 2: 111


    Human-in-the-loop (HITL) AI may enable an ideal symbiosis of human experts and AI models, harnessing the advantages of both while at the same time overcoming their respective limitations. The purpose of this study was to investigate a novel collective intelligence technology designed to amplify the diagnostic accuracy of networked human groups by forming real-time systems modeled on biological swarms. Using small groups of radiologists, the swarm-based technology was applied to the diagnosis of pneumonia on chest radiographs and compared against human experts alone, as well as two state-of-the-art deep learning AI models. Our work demonstrates that both the swarm-based technology and deep-learning technology achieved superior diagnostic accuracy than the human experts alone. Our work further demonstrates that when used in combination, the swarm-based technology and deep-learning technology outperformed either method alone. The superior diagnostic accuracy of the combined HITL AI solution compared to radiologists and AI alone has broad implications for the surging clinical AI deployment and implementation strategies in future practice.

    View details for DOI 10.1038/s41746-019-0189-7

    View details for PubMedID 31754637

    View details for PubMedCentralID PMC6861262

  • CheXpert: A Large Chest Radiograph Dataset with Uncertainty Labels and Expert Comparison Irvin, J., Rajpurkar, P., Ko, M., Yu, Y., Ciurea-Ilcus, S., Chute, C., Marklund, H., Haghgoo, B., Ball, R., Shpanskaya, K., Seekins, J., Mong, D. A., Halabi, S. S., Sandberg, J. K., Jones, R., Larson, D. B., Langlotz, C. P., Patel, B. N., Lungren, M. P., Ng, A. Y., AAAI ASSOC ADVANCEMENT ARTIFICIAL INTELLIGENCE. 2019: 590?97
  • Deep learning for chest radiograph diagnosis: A retrospective comparison of the CheXNeXt algorithm to practicing radiologists PLOS MEDICINE Rajpurkar, P., Irvin, J., Ball, R. L., Zhu, K., Yang, B., Mehta, H., Duan, T., Ding, D., Bagul, A., Langlotz, C. P., Patel, B. N., Yeom, K. W., Shpanskaya, K., Blankenberg, F. G., Seekins, J., Amrhein, T. J., Mong, D. A., Halabi, S. S., Zucker, E. J., Ng, A. Y., Lungren, M. P. 2018; 15 (11)
  • Deep learning for chest radiograph diagnosis: A retrospective comparison of the CheXNeXt algorithm to practicing radiologists. PLoS medicine Rajpurkar, P., Irvin, J., Ball, R. L., Zhu, K., Yang, B., Mehta, H., Duan, T., Ding, D., Bagul, A., Langlotz, C. P., Patel, B. N., Yeom, K. W., Shpanskaya, K., Blankenberg, F. G., Seekins, J., Amrhein, T. J., Mong, D. A., Halabi, S. S., Zucker, E. J., Ng, A. Y., Lungren, M. P. 2018; 15 (11): e1002686


    BACKGROUND: Chest radiograph interpretation is critical for the detection of thoracic diseases, including tuberculosis and lung cancer, which affect millions of people worldwide each year. This time-consuming task typically requires expert radiologists to read the images, leading to fatigue-based diagnostic error and lack of diagnostic expertise in areas of the world where radiologists are not available. Recently, deep learning approaches have been able to achieve expert-level performance in medical image interpretation tasks, powered by large network architectures and fueled by the emergence of large labeled datasets. The purpose of this study is to investigate the performance of a deep learning algorithm on the detection of pathologies in chest radiographs compared with practicing radiologists.METHODS AND FINDINGS: We developed CheXNeXt, a convolutional neural network to concurrently detect the presence of 14 different pathologies, including pneumonia, pleural effusion, pulmonary masses, and nodules in frontal-view chest radiographs. CheXNeXt was trained and internally validated on the ChestX-ray8 dataset, with a held-out validation set consisting of 420 images, sampled to contain at least 50 cases of each of the original pathology labels. On this validation set, the majority vote of a panel of 3 board-certified cardiothoracic specialist radiologists served as reference standard. We compared CheXNeXt's discriminative performance on the validation set to the performance of 9 radiologists using the area under the receiver operating characteristic curve (AUC). The radiologists included 6 board-certified radiologists (average experience 12 years, range 4-28 years) and 3 senior radiology residents, from 3 academic institutions. We found that CheXNeXt achieved radiologist-level performance on 11 pathologies and did not achieve radiologist-level performance on 3 pathologies. The radiologists achieved statistically significantly higher AUC performance on cardiomegaly, emphysema, and hiatal hernia, with AUCs of 0.888 (95% confidence interval [CI] 0.863-0.910), 0.911 (95% CI 0.866-0.947), and 0.985 (95% CI 0.974-0.991), respectively, whereas CheXNeXt's AUCs were 0.831 (95% CI 0.790-0.870), 0.704 (95% CI 0.567-0.833), and 0.851 (95% CI 0.785-0.909), respectively. CheXNeXt performed better than radiologists in detecting atelectasis, with an AUC of 0.862 (95% CI 0.825-0.895), statistically significantly higher than radiologists' AUC of 0.808 (95% CI 0.777-0.838); there were no statistically significant differences in AUCs for the other 10 pathologies. The average time to interpret the 420 images in the validation set was substantially longer for the radiologists (240 minutes) than for CheXNeXt (1.5 minutes). The main limitations of our study are that neither CheXNeXt nor the radiologists were permitted to use patient history or review prior examinations and that evaluation was limited to a dataset from a single institution.CONCLUSIONS: In this study, we developed and validated a deep learning algorithm that classified clinically important abnormalities in chest radiographs at a performance level comparable to practicing radiologists. Once tested prospectively in clinical settings, the algorithm could have the potential to expand patient access to chest radiograph diagnostics.

    View details for PubMedID 30457988

Footer Links:

Stanford Medicine Resources: