Amar K. Das
Publication Details
-
Unsupervised method for automatic construction of a disease dictionary from a large free text collection.
AMIA Annu Symp Proc. 2008: 820-4
Concept specific lexicons (e.g. diseases, drugs, anatomy) are a critical source of background knowledge for many medical language-processing systems. However, the rapid pace of biomedical research and the lack of constraints on usage ensure that such dictionaries are incomplete. Focusing on disease terminology, we have developed an automated, unsupervised, iterative pattern learning approach for constructing a comprehensive medical dictionary of disease terms from randomized clinical trial (RCT) abstracts, and we compared different ranking methods for automatically extracting con-textual patterns and concept terms. When used to identify disease concepts from 100 randomly chosen, manually annotated clinical abstracts, our disease dictionary shows significant performance improvement (F1 increased by 35-88%) over available, manually created disease terminologies.
