Medical School Office Building (MSOB)
Rm x303

DATE: March 15, 2018
TIME: 1:30 - 2:50 pm
TITLE: Efficient Use of EHR for Biomedical Translational Research
Tianxi Cai
Professor of Biostatistics, Harvard


While clinical trials remain a critical source for studying disease risk, progression and treatment response, they have limitations, including the limited generalizability of study findings to the real world and the limited ability to test broader hypotheses. In recent years, due to the increasing adoption of electronic health records (EHR) and the linkage of EHR with specimen bio-repositories, large integrated EHR datasets now exist as a new source for translational research. These datasets open new opportunities for deriving real-world, data-driven prediction models of disease risk and progression, as well as for unbiased investigation of the shared genetic etiology of multiple phenotypes. Yet they also bring methodological challenges. For example, obtaining validated phenotype information, such as the presence of a disease condition or treatment response, is a major bottleneck in EHR research, as it requires laborious medical record review. Narrative free-text data is a valuable type of EHR data, but extracting accurate yet concise information from it via natural language processing is also challenging. In this talk, I’ll discuss various statistical and informatics methods that illustrate both the opportunities and the challenges. These methods will be demonstrated using EHR data from Partners HealthCare.