Clinical Focus

  • Quantitative Imaging
  • Diagnostic Radiology
  • Biomedical informatics
  • Imaging informatics
  • Radiology

Academic Appointments

Honors & Awards

  • caBIG Connecting Collaborators Award, National Cancer Institute (2010)
  • Certificate of Merit, Radiological Society of North America (2009)
  • Cum Laude Award, Radiological Society of North America (2008)
  • Cum Laude Award, Radiological Society of North America (2006)

Professional Education

  • Residency: Stanford University School of Medicine, CA (1990)
  • Internship: Stanford University School of Medicine, CA (1986)
  • Medical Education: Stanford University School of Medicine, CA (6/1/1985)
  • Residency: Stanford University Hospital (1991)
  • Board Certification: Diagnostic Radiology, American Board of Radiology (1990)

Research & Scholarship

Current Research and Scholarly Interests

My research interest is imaging informatics--ways computers can work with images to leverage their rich information content and to help physicians use images to guide personalized care. Just as biology has been revolutionized by online genetic data, now clinical medicine can be transformed by mining huge image repositories and electronically correlating image data with pathology and molecular data. Work in our lab thus lies at the intersection of biomedical informatics and imaging science, and we are working in several major areas. We are developing methods to extract information and meaning from images for data mining. We are also developing statistical natural language processing methods to extract and summarize information in radiology reports and published articles. We are building resources to integrate images with related clinical and molecular data to discover novel image biomarkers of disease. Finally, we are translating these methods into practice by creating decision support applications that relate radiology findings to diagnoses and that will improve diagnostic accuracy and clinical effectiveness.

Clinical Trials

  • Perfusion CT as a Predictor of Treatment Response in Patients With Rectal Cancer (Recruiting)

    Recent advances in computed tomography (CT) technology have made CT perfusion imaging feasible for the assessment of tumor perfusion in solid tumors of the abdomen. CT perfusion has shown promising results in serving as a noninvasive method of predicting response to therapy in cancer patients. CT perfusion parameters have also been found to correlate with immunohistologic markers of angiogenesis in a number of solid tumors, suggesting a possible role for CT perfusion as a noninvasive biomarker of tumor angiogenesis. The goals of the investigators' study are twofold: first, to determine the relationship between baseline CT perfusion characteristics of rectal cancers and their response to treatment, and second, to determine if perfusion CT can be used to subsequently monitor tumor response to treatment. The investigators hope to enroll patients with locally advanced rectal cancer undergoing standard CT for pre-treatment planning, integrating CT perfusion imaging into the current abdomen/pelvis imaging protocol with close clinical and radiologic follow-up after treatment to determine response to therapy and time to disease progression.



2013-14 Courses

Graduate and Fellowship Programs

  • Biomedical Informatics (PhD Program)


Journal Articles

  • Modeling Perceptual Similarity Measures in CT Images of Focal Liver Lesions JOURNAL OF DIGITAL IMAGING Faruque, J., Rubin, D. L., Beaulieu, C. F., Napel, S. 2013; 26 (4): 714-720


    Motivation: A gold standard for perceptual similarity in medical images is vital to content-based image retrieval, but inter-reader variability complicates development. Our objective was to develop a statistical model that predicts the number of readers (N) necessary to achieve acceptable levels of variability. Materials and Methods: We collected 3 radiologists' ratings of the perceptual similarity of 171 pairs of CT images of focal liver lesions rated on a 9-point scale. We modeled the readers' scores as bimodal distributions in additive Gaussian noise and estimated the distribution parameters from the scores using an expectation maximization algorithm. We (a) sampled 171 similarity scores to simulate a ground truth and (b) simulated readers by adding noise, with standard deviation σ between 0 and 5 for each reader. We computed the mean values of 2-50 readers' scores and calculated the agreement (AGT) between these means and the simulated ground truth, and the inter-reader agreement (IRA), using Cohen's Kappa metric (κ). Results: IRA for the empirical data ranged from κ = 0.41 to 0.66. For σ between 1.5 and 2.5, IRA between three simulated readers was comparable to agreement in the empirical data. For these values of σ, AGT ranged from κ = 0.81 to 0.91. As expected, AGT increased with N, ranging from κ = 0.83 to 0.92 for N = 2 to 50, respectively, with σ = 2. Conclusion: Our simulations demonstrated that for moderate to good IRA, excellent AGT could nonetheless be obtained. This model may be used to predict the required N to accurately evaluate similarity in arbitrary size datasets.

    View details for DOI 10.1007/s10278-012-9557-4

    View details for Web of Science ID 000322434700017

    View details for PubMedID 23254627
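
    The simulation described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' code: the two-component mixture used for the ground truth (means 3 and 7, unit spread), the rounding and clipping to the 9-point scale, and the use of unweighted Cohen's kappa are all choices made here for illustration, and every function name is hypothetical.

```python
import random
from collections import Counter


def cohen_kappa(a, b):
    """Unweighted Cohen's kappa between two equal-length label sequences."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    # Chance agreement from the two marginal label distributions.
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1 - expected)


def clip_round(x, lo=1, hi=9):
    """Round to the nearest integer score and clip to the 9-point scale."""
    return max(lo, min(hi, round(x)))


def simulate_truth(n_pairs, rng):
    """Sample ground-truth scores from an assumed bimodal mixture."""
    scores = []
    for _ in range(n_pairs):
        mean = 3.0 if rng.random() < 0.5 else 7.0  # assumed mode locations
        scores.append(clip_round(rng.gauss(mean, 1.0)))
    return scores


def simulate_reader(truth, sigma, rng):
    """One simulated reader: ground truth plus Gaussian noise of s.d. sigma."""
    return [clip_round(t + rng.gauss(0.0, sigma)) for t in truth]


def agreement_with_truth(truth, n_readers, sigma, rng):
    """Kappa between the mean of n_readers noisy readers and the truth (AGT)."""
    readers = [simulate_reader(truth, sigma, rng) for _ in range(n_readers)]
    means = [clip_round(sum(col) / n_readers) for col in zip(*readers)]
    return cohen_kappa(means, truth)
```

    Calling `agreement_with_truth(truth, n, 2.0, rng)` for increasing `n` gives a rough view of how averaging more readers affects agreement with the simulated ground truth; the values it produces depend on the assumed mixture and will not reproduce the published figures.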

  • Quantitative Evaluation of Drusen on Photographs OPHTHALMOLOGY Rubin, D. L., de Sisternes, L., Kutzscher, L., Chen, Q., Leng, T., Zheng, L. L. 2013; 120 (3): 644-?

    View details for Web of Science ID 000315738200034

    View details for PubMedID 23714606

  • Informatics in Radiology Improving Clinical Work Flow through an AIM Database: A Sample Web-based Lesion Tracking Application RADIOGRAPHICS Abajian, A. C., Levy, M., Rubin, D. L. 2012; 32 (5): 1543-1552


    Quantitative assessments on images are crucial to clinical decision making, especially in cancer patients, in whom measurements of lesions are tracked over time. However, the potential value of quantitative approaches to imaging is impeded by the difficulty and time-intensive nature of compiling this information from prior studies and reporting corresponding information on current studies. The authors believe that the quantitative imaging work flow can be automated by making temporal data computationally accessible. In this article, they demonstrate the utility of the Annotation and Image Markup standard in a World Wide Web-based application that was developed to automatically summarize prior and current quantitative imaging measurements. The system calculates the Response Evaluation Criteria in Solid Tumors metric, along with several alternative indicators of cancer treatment response, by using the data stored in the annotation files. The application also allows the user to overlay the recorded metrics on the original images for visual inspection. Clinical evaluation of the system demonstrates its potential utility in accelerating the standard radiology work flow and in providing a means to evaluate alternative response metrics that are difficult to compute by hand. The system, which illustrates the utility of capturing quantitative information in a standard format and linking it to the image from which it was derived, could enhance quantitative imaging in clinical practice without adversely affecting the current work flow.

    View details for DOI 10.1148/rg.325115752

    View details for Web of Science ID 000308632900027

    View details for PubMedID 22745220
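
    The response calculation mentioned in this abstract can be illustrated with a short sketch. It assumes RECIST 1.1 target-lesion thresholds (at least a 30% decrease from baseline for partial response; at least a 20% and 5 mm increase from the smallest prior sum for progression) and ignores non-target and new lesions. The record layout and function names are hypothetical and do not reflect the AIM schema or the application's actual code.

```python
def sum_of_diameters(lesions):
    """Sum the longest diameters (mm) of the target lesions at one time point.

    Each lesion is a dict with a hypothetical 'longest_diameter_mm' key.
    """
    return sum(lesion["longest_diameter_mm"] for lesion in lesions)


def recist_response(baseline_sum_mm, nadir_sum_mm, current_sum_mm):
    """Classify target-lesion response as CR, PR, SD, or PD (simplified)."""
    if current_sum_mm == 0:
        return "CR"  # complete response: all target lesions have disappeared

    increase = current_sum_mm - nadir_sum_mm
    # Assumption: if the nadir is 0 the relative increase is undefined,
    # so only the 5 mm absolute criterion is applied.
    relative_ok = nadir_sum_mm == 0 or increase / nadir_sum_mm >= 0.20
    if relative_ok and increase >= 5:
        return "PD"  # progressive disease, measured against the nadir

    if (baseline_sum_mm - current_sum_mm) / baseline_sum_mm >= 0.30:
        return "PR"  # partial response, measured against baseline

    return "SD"  # stable disease: neither PR nor PD criteria met
```

    Progression is checked before partial response because it is measured against the smallest prior sum rather than the baseline, so a lesion burden that shrank and then regrew can meet both percentage thresholds.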

  • Automatic classification of mammography reports by BI-RADS breast tissue composition class JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION Percha, B., Nassif, H., Lipson, J., Burnside, E., Rubin, D. 2012; 19 (5): 913-916


    Because breast tissue composition partially predicts breast cancer risk, classification of mammography reports by breast tissue composition is important from both a scientific and clinical perspective. A method is presented for using the unstructured text of mammography reports to classify them into BI-RADS breast tissue composition categories. An algorithm that uses regular expressions to automatically determine BI-RADS breast tissue composition classes for unstructured mammography reports was developed. The algorithm assigns each report to a single BI-RADS composition class: 'fatty', 'fibroglandular', 'heterogeneously dense', 'dense', or 'unspecified'. We evaluated its performance on mammography reports from two different institutions. The method achieves >99% classification accuracy on a test set of reports from the Marshfield Clinic (Wisconsin) and Stanford University. Since large-scale studies of breast cancer rely heavily on breast tissue composition information, this method could facilitate this research by helping mine large datasets to correlate breast composition with other covariates.

    View details for DOI 10.1136/amiajnl-2011-000607

    View details for Web of Science ID 000307934600032

    View details for PubMedID 22291166
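
    A regular-expression classifier of the kind this abstract describes might look like the sketch below. The patterns are illustrative guesses at common report phrasing, not the published expressions, and the function name is hypothetical; only the five class labels come from the abstract.

```python
import re

# Ordered so that "heterogeneously dense" is tested before plain "dense".
# All patterns are assumptions made for illustration.
COMPOSITION_PATTERNS = [
    ("heterogeneously dense", re.compile(r"heterogeneously\s+dense", re.I)),
    ("dense", re.compile(r"(extremely|very)\s+dense", re.I)),
    ("fibroglandular",
     re.compile(r"scattered\s+(areas\s+of\s+)?fibroglandular", re.I)),
    ("fatty",
     re.compile(r"(almost\s+entirely|predominantly|entirely)\s+fat(ty)?", re.I)),
]


def classify_composition(report_text):
    """Return the first matching composition class, else 'unspecified'."""
    for label, pattern in COMPOSITION_PATTERNS:
        if pattern.search(report_text):
            return label
    return "unspecified"
```

    Returning the first match keeps each report in a single class, as the abstract requires; a production system would need patterns validated against real reports from each institution.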

  • The Role of Informatics in Health Care Reform ACADEMIC RADIOLOGY Liu, Y. I., Rubin, D. L. 2012; 19 (9): 1094-1099


    Improving health care quality while simultaneously reducing cost has become a high priority of health care reform. Informatics is crucial in tackling this challenge. The American Recovery and Reinvestment Act of 2009 mandates adaptation and "meaningful use" of health information technology. In this review, we will highlight several areas in which informatics can make significant contributions, with a focus on radiology. We also discuss informatics related to the increasing imperatives of state and local regulations (such as radiation dose tracking) and quality initiatives.

    View details for DOI 10.1016/j.acra.2012.05.006

    View details for Web of Science ID 000307864300008

    View details for PubMedID 22771052

  • Quantifying the margin sharpness of lesions on radiological images for content-based image retrieval MEDICAL PHYSICS Xu, J., Nadel, S., Greenspan, H., Beaulieu, C. F., Agrawal, N., Rubin, D. 2012; 39 (9): 5405-5418


    To develop a method to quantify the margin sharpness of lesions on CT and to evaluate it in simulations and CT scans of liver and lung lesions. The authors computed two attributes of margin sharpness: the intensity difference between a lesion and its surroundings, and the sharpness of the intensity transition across the lesion boundary. These two attributes were extracted from sigmoid curves fitted along lines automatically drawn orthogonal to the lesion margin. The authors then represented the margin characteristics for each lesion by a feature vector containing histograms of these parameters. The authors created 100 simulated CT scans of lesions over a range of intensity difference and margin sharpness, and used the concordance correlation between the known parameter and the corresponding computed feature as a measure of performance. The authors also evaluated their method in 79 liver lesions (44 patients: 23 M, 21 F, mean age 61) and 58 lung nodules (57 patients: 24 M, 33 F, mean age 66). The methodology presented takes into consideration the boundary of the liver and lung during feature extraction in clinical images to ensure that the margin features are not contaminated by anatomy other than the normal organ surrounding the lesions. For evaluation in these clinical images, the authors created subjective independent reference standards for pairwise margin sharpness similarity in the liver and lung cohorts, and compared rank orderings of similarity obtained using their sharpness feature to those expected from the reference standards using mean normalized discounted cumulative gain (NDCG) over all query images. In addition, the authors compared their proposed feature with two existing techniques for lesion margin characterization using the simulated and clinical datasets. The authors also evaluated the robustness of their features against variations in delineation of the lesion margin by simulating five types of deformations of the lesion margin. Equivalence across deformations was assessed using Schuirmann's paired two one-sided tests. In simulated images, the concordance correlation between measured gradient and actual gradient was 0.994. The mean (s.d.) NDCG scores for the retrieval of K images, K = 5, 10, and 15, were 84% (8%), 85% (7%), and 85% (7%) for CT images containing liver lesions, and 82% (7%), 84% (6%), and 85% (4%) for CT images containing lung nodules, respectively. The authors' proposed method outperformed the two existing margin characterization methods in average NDCG scores over all K, by 1.5% and 3% in datasets containing liver lesions, and 4.5% and 5% in datasets containing lung nodules. Equivalence testing showed that the authors' feature is more robust across all margin deformations (p < 0.05) than the two existing methods for margin sharpness characterization in both simulated and clinical datasets. The authors have described a new image feature to quantify the margin sharpness of lesions. It has strong correlation with known margin sharpness in simulated images and in clinical CT images containing liver lesions and lung nodules. This image feature has excellent performance for retrieving images with similar margin characteristics, suggesting potential utility, in conjunction with other lesion features, for content-based image retrieval applications.

    View details for DOI 10.1118/1.4739507

    View details for Web of Science ID 000309334500012

    View details for PubMedID 22957608
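
    The NDCG measure used in this evaluation can be sketched as below. This is one common formulation (linear gain with a log2 rank discount), chosen here as an assumption; the abstract does not state which gain or discount the authors used, and the function names are hypothetical.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain: each gain is divided by log2(rank + 1)."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg_at_k(retrieved_relevances, k):
    """NDCG for the top k results.

    retrieved_relevances lists the reference-standard similarity of each
    database image to the query, in the order the system retrieved them.
    """
    top_k = retrieved_relevances[:k]
    ideal = sorted(retrieved_relevances, reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(top_k) / ideal_dcg if ideal_dcg > 0 else 0.0


def mean_ndcg(per_query_relevances, k):
    """Average NDCG at k over all query images."""
    scores = [ndcg_at_k(rels, k) for rels in per_query_relevances]
    return sum(scores) / len(scores)
```

    A score of 1 means the retrieved ordering matches the ordering implied by the reference standard for the top K images; lower scores indicate that highly similar images were ranked too low.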

  • Prognostic PET F-18-FDG Uptake Imaging Features Are Associated with Major Oncogenomic Alterations in Patients with Resected Non-Small Cell Lung Cancer CANCER RESEARCH Nair, V. S., Gevaert, O., Davidzon, G., Napel, S., Graves, E. E., Hoang, C. D., Shrager, J. B., Quon, A., Rubin, D. L., Plevritis, S. K. 2012; 72 (15): 3725-3734


    Although 2[18F]fluoro-2-deoxy-d-glucose (FDG) uptake during positron emission tomography (PET) predicts post-surgical outcome in patients with non-small cell lung cancer (NSCLC), the biologic basis for this observation is not fully understood. Here, we analyzed 25 tumors from patients with NSCLCs to identify tumor PET-FDG uptake features associated with gene expression signatures and survival. Fourteen quantitative PET imaging features describing FDG uptake were correlated with gene expression for single genes and coexpressed gene clusters (metagenes). For each FDG uptake feature, an associated metagene signature was derived, and a prognostic model was identified in an external cohort and then tested in a validation cohort of patients with NSCLC. Four of eight single genes associated with FDG uptake (LY6E, RNF149, MCM6, and FAP) were also associated with survival. The most prognostic metagene signature was associated with a multivariate FDG uptake feature [maximum standard uptake value (SUV(max)), SUV(variance), and SUV(PCA2)], each highly associated with survival in the external [HR, 5.87; confidence interval (CI), 2.49-13.8] and validation (HR, 6.12; CI, 1.08-34.8) cohorts, respectively. Cell-cycle, proliferation, death, and self-recognition pathways were altered in this radiogenomic profile. Together, our findings suggest that leveraging tumor genomics with an expanded collection of PET-FDG imaging features may enhance our understanding of FDG uptake as an imaging biomarker beyond its association with glycolysis.

    View details for DOI 10.1158/0008-5472.CAN-11-3943

    View details for Web of Science ID 000307354100004

    View details for PubMedID 22710433

  • Non-Small Cell Lung Cancer: Identifying Prognostic Imaging Biomarkers by Leveraging Public Gene Expression Microarray Data-Methods and Preliminary Results RADIOLOGY Gevaert, O., Xu, J., Hoang, C. D., Leung, A. N., Xu, Y., Quon, A., Rubin, D. L., Napel, S., Plevritis, S. K. 2012; 264 (2): 387-396


    To identify prognostic imaging biomarkers in non-small cell lung cancer (NSCLC) by means of a radiogenomics strategy that integrates gene expression and medical images in patients for whom survival outcomes are not available by leveraging survival data in public gene expression data sets. A radiogenomics strategy for associating image features with clusters of coexpressed genes (metagenes) was defined. First, a radiogenomics correlation map is created for a pairwise association between image features and metagenes. Next, predictive models of metagenes are built in terms of image features by using sparse linear regression. Similarly, predictive models of image features are built in terms of metagenes. Finally, the prognostic significance of the predicted image features is evaluated in a public gene expression data set with survival outcomes. This radiogenomics strategy was applied to a cohort of 26 patients with NSCLC for whom gene expression and 180 image features from computed tomography (CT) and positron emission tomography (PET)/CT were available. There were 243 statistically significant pairwise correlations between image features and metagenes of NSCLC. Metagenes were predicted in terms of image features with an accuracy of 59%-83%. One hundred fourteen of 180 CT image features and the PET standardized uptake value were predicted in terms of metagenes with an accuracy of 65%-86%. When the predicted image features were mapped to a public gene expression data set with survival outcomes, tumor size, edge shape, and sharpness ranked highest for prognostic significance. This radiogenomics strategy for identifying imaging biomarkers may enable a more rapid evaluation of novel imaging modalities, thereby accelerating their translation to personalized medicine.

    View details for DOI 10.1148/radiol.12111607

    View details for Web of Science ID 000306660000010

    View details for PubMedID 22723499

  • A Comprehensive Descriptor of Shape: Method and Application to Content-Based Retrieval of Similar Appearing Lesions in Medical Images JOURNAL OF DIGITAL IMAGING Xu, J., Faruque, J., Beaulieu, C. F., Rubin, D., Napel, S. 2012; 25 (1): 121-128


    We have developed a method to quantify the shape of liver lesions in CT images and to evaluate its performance for retrieval of images with similarly-shaped lesions. We employed a machine learning method to combine several shape descriptors and defined similarity measures for a pair of shapes as a weighted combination of distances calculated based on each feature. We created a dataset of 144 simulated shapes and established several reference standards for similarity and computed the optimal weights so that the retrieval result agrees best with the reference standard. Then we evaluated our method on a clinical database consisting of 79 portal-venous-phase CT liver images, where we derived a reference standard of similarity from radiologists' visual evaluation. Normalized Discounted Cumulative Gain (NDCG) was calculated to compare this ordering with the expected ordering based on the reference standard. For the simulated lesions, the mean NDCG values ranged from 91% to 100%, indicating that our methods for combining features were very accurate in representing true similarity. For the clinical images, the mean NDCG values were still around 90%, suggesting a strong correlation between the computed similarity and the independent similarity reference derived from the radiologists.

    View details for DOI 10.1007/s10278-011-9388-8

    View details for Web of Science ID 000304113400018

    View details for PubMedID 21547518

  • Automatic annotation of radiological observations in liver CT images. AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium Gimenez, F., Xu, J., Liu, Y., Liu, T., Beaulieu, C., Rubin, D., Napel, S. 2012; 2012: 257-263


    We aim to predict radiological observations using computationally-derived imaging features extracted from computed tomography (CT) images. We created a dataset of 79 CT images containing liver lesions identified and annotated by a radiologist using a controlled vocabulary of 76 semantic terms. Computationally-derived features were extracted describing intensity, texture, shape, and edge sharpness. Traditional logistic regression was compared to L(1)-regularized logistic regression (LASSO) in order to predict the radiological observations using computational features. The approach was evaluated by leave one out cross-validation. Informative radiological observations such as lesion enhancement, hypervascular attenuation, and homogeneous retention were predicted well by computational features. By exploiting relationships between computational and semantic features, this approach could lead to more accurate and efficient radiology reporting.

    View details for PubMedID 23304295

  • Automated temporal tracking and segmentation of lymphoma on serial CT examinations MEDICAL PHYSICS Xu, J., Greenspan, H., Napel, S., Rubin, D. L. 2011; 38 (11): 5879-5886


    It is challenging to reproducibly measure and compare cancer lesions on numerous follow-up studies; the process is time-consuming and error-prone. In this paper, we show a method to automatically and reproducibly identify and segment abnormal lymph nodes in serial computed tomography (CT) exams. Our method leverages initial identification of enlarged (abnormal) lymph nodes in the baseline scan. We then identify an approximate region for the node in the follow-up scans using nonrigid image registration. The baseline scan is also used to locate regions of normal, non-nodal tissue surrounding the lymph node and to map them onto the follow-up scans, in order to reduce the search space to locate the lymph node on the follow-up scans. Adaptive region-growing and clustering algorithms are then used to obtain the final contours for segmentation. We applied our method to 24 distinct enlarged lymph nodes at multiple time points from 14 patients. The scan at the earlier time point was used as the baseline scan to be used in evaluating the follow-up scan, resulting in 70 total test cases (e.g., a series of scans obtained at 4 time points results in 3 test cases). For each of the 70 cases, a "reference standard" was obtained by manual segmentation by a radiologist. We evaluated our method by comparing its assessments according to response evaluation criteria in solid tumors (RECIST) with RECIST assessments made using the reference standard segmentations, and by calculating node overlap ratio and Hausdorff distance between the computer- and radiologist-generated contours. Compared to the reference standard, our method made the correct RECIST assessment for all 70 cases. The average overlap ratio was 80.7 ± 9.7% s.d., and the average Hausdorff distance was 3.2 ± 1.8 mm s.d. The concordance correlation between automated and manual segmentations was 0.978 (95% confidence interval 0.962, 0.984). The 100% agreement in our sample between our method and the standard with regard to RECIST classification suggests that the true disagreement rate is no more than 6%. Our automated lymph node segmentation method achieves excellent overall segmentation performance and provides equivalent RECIST assessment. It potentially will be useful to streamline and improve cancer lesion measurement and tracking and to improve assessment of cancer treatment response.

    View details for DOI 10.1118/1.3643027

    View details for Web of Science ID 000296534000008

    View details for PubMedID 22047352
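
    The two contour-comparison measures named in this abstract can be illustrated as follows. The abstract does not define its overlap ratio, so intersection over union is assumed here; the Hausdorff distance is the standard symmetric form computed by brute force. Both function names are hypothetical.

```python
import math


def overlap_ratio(mask_a, mask_b):
    """Intersection over union of two segmentations (assumed definition).

    Each mask is an iterable of (row, col) pixel coordinates inside the node.
    """
    a, b = set(mask_a), set(mask_b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def hausdorff_distance(points_a, points_b):
    """Symmetric Hausdorff distance between two non-empty contour point lists.

    Brute force, O(len(a) * len(b)); adequate for small 2D contours.
    """
    def directed(src, dst):
        # Largest distance from any source point to its nearest target point.
        return max(min(math.dist(p, q) for q in dst) for p in src)

    return max(directed(points_a, points_b), directed(points_b, points_a))
```

    Coordinates are in pixel units here; to report distances in millimetres as the abstract does, the points would first be scaled by the image's pixel spacing.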

  • Informatics in Radiology Measuring and Improving Quality in Radiology: Meeting the Challenge with Informatics RADIOGRAPHICS Rubin, D. L. 2011; 31 (6): 1511-1527


    Quality is becoming a critical issue for radiology. Measuring and improving quality is essential not only to ensure optimum effectiveness of care and comply with increasing regulatory requirements, but also to combat current trends leading to commoditization of radiology services. A key challenge to implementing quality improvement programs is to develop methods to collect knowledge related to quality care and to deliver that knowledge to practitioners at the point of care. There are many dimensions to quality in radiology that need to be measured, monitored, and improved, including examination appropriateness, procedure protocol, accuracy of interpretation, communication of imaging results, and measuring and monitoring performance improvement in quality, safety, and efficiency. Informatics provides the key technologies that can enable radiologists to measure and improve quality. However, few institutions recognize the opportunities that informatics methods provide to improve safety and quality. The information technology infrastructure in most hospitals is limited, and they have suboptimal adoption of informatics techniques. Institutions can tackle the challenges of assessing and improving quality in radiology by means of informatics.

    View details for DOI 10.1148/rg.316105207

    View details for Web of Science ID 000295985200003

    View details for PubMedID 21997979

  • Managing Biomedical Image Metadata for Search and Retrieval of Similar Images JOURNAL OF DIGITAL IMAGING Korenblum, D., Rubin, D., Napel, S., Rodriguez, C., Beaulieu, C. 2011; 24 (4): 739-748


    Radiology images are generally disconnected from the metadata describing their contents, such as imaging observations ("semantic" metadata), which are usually described in text reports that are not directly linked to the images. We developed a system, the Biomedical Image Metadata Manager (BIMM) to (1) address the problem of managing biomedical image metadata and (2) facilitate the retrieval of similar images using semantic feature metadata. Our approach allows radiologists, researchers, and students to take advantage of the vast and growing repositories of medical image data by explicitly linking images to their associated metadata in a relational database that is globally accessible through a Web application. BIMM receives input in the form of standard-based metadata files using Web service and parses and stores the metadata in a relational database allowing efficient data query and maintenance capabilities. Upon querying BIMM for images, 2D regions of interest (ROIs) stored as metadata are automatically rendered onto preview images included in search results. The system's "match observations" function retrieves images with similar ROIs based on specific semantic features describing imaging observation characteristics (IOCs). We demonstrate that the system, using IOCs alone, can accurately retrieve images with diagnoses matching the query images, and we evaluate its performance on a set of annotated liver lesion images. BIMM has several potential applications, e.g., computer-aided detection and diagnosis, content-based image retrieval, automating medical analysis protocols, and gathering population statistics like disease prevalences. The system provides a framework for decision support systems, potentially improving their diagnostic accuracy and selection of appropriate therapies.

    View details for DOI 10.1007/s10278-010-9328-z

    View details for Web of Science ID 000292888700020

    View details for PubMedID 20844917

  • Current and Future Trends in Imaging Informatics for Oncology CANCER JOURNAL Levy, M. A., Rubin, D. L. 2011; 17 (4): 203-210


    Clinical imaging plays an essential role in cancer care and research for diagnosis, prognosis, and treatment response assessment. Major advances in imaging informatics to support medical imaging have been made during the last several decades. More recent informatics advances focus on the special needs of oncologic imaging, yet gaps still remain. We review the current state, limitations, and future trends in imaging informatics for oncology care including clinical and clinical research systems. We review information systems to support cancer clinical workflows including oncologist ordering of radiology studies, radiologist review and reporting of image findings, and oncologist review and integration of imaging information for clinical decision making. We discuss informatics approaches to oncologic imaging including, but not limited to, controlled terminologies, image annotation, and image-processing algorithms. With the ongoing development of novel imaging modalities and imaging biomarkers, we expect these systems will continue to evolve and mature.

    View details for DOI 10.1097/PPO.0b013e3182272f04

    View details for Web of Science ID 000293265100003

    View details for PubMedID 21799326

  • A Bayesian Network for Differentiating Benign From Malignant Thyroid Nodules Using Sonographic and Demographic Features AMERICAN JOURNAL OF ROENTGENOLOGY Liu, Y. I., Kamaya, A., Desser, T. S., Rubin, D. L. 2011; 196 (5): W598-W605


    The objective of our study was to create a Bayesian network (BN) that incorporates a multitude of imaging features and patient demographic characteristics to guide radiologists in assessing the likelihood of malignancy in suspicious-appearing thyroid nodules. We built a BN to combine multiple indicators of the malignant potential of thyroid nodules including both imaging and demographic factors. The imaging features and conditional probabilities relating those features to diagnoses were compiled from an extensive literature review. To evaluate our network, we randomly selected 54 benign and 45 malignant nodules from 93 adult patients who underwent ultrasound-guided biopsy. The final diagnosis in each case was pathologically established. We compared the performance of our network with that of two radiologists who independently evaluated each case on a 5-point scale of suspicion for malignancy. Probability estimates of malignancy from the BN and radiologists were compared using receiver operating characteristic (ROC) analysis. The network performed comparably to the two expert radiologists. Using each radiologist's assessment of the imaging features as input to the network, the differences between the area under the ROC curve (A(z)) for the BN and for the radiologists were -0.03 (BN vs radiologist 1, 0.85 vs 0.88) and -0.01 (BN vs radiologist 2, 0.76 vs 0.77). We created a BN that incorporates a range of sonographic and demographic features and provides a probability about whether a thyroid nodule is benign or malignant. The BN distinguished between benign and malignant thyroid nodules as well as the expert radiologists did.

    View details for DOI 10.2214/AJR.09.4037

    View details for Web of Science ID 000289769000015

    View details for PubMedID 21512051

  • A practical method for transforming free-text eligibility criteria into computable criteria JOURNAL OF BIOMEDICAL INFORMATICS Tu, S. W., Peleg, M., Carini, S., Bobak, M., Ross, J., Rubin, D., Sim, I. 2011; 44 (2): 239-250


    Formalizing eligibility criteria in a computer-interpretable language would facilitate eligibility determination for study subjects and the identification of studies on similar patient populations. Because such formalization is extremely labor intensive, we transform the problem from one of fully capturing the semantics of criteria directly in a formal expression language to one of annotating free-text criteria in a format called ERGO annotation. The annotation can be done manually, or it can be partially automated using natural-language processing techniques. We evaluated our approach in three ways. First, we assessed the extent to which ERGO annotations capture the semantics of 1000 eligibility criteria randomly drawn from ClinicalTrials.gov. Second, we demonstrated the practicality of the annotation process in a feasibility study. Finally, we demonstrate the computability of ERGO annotation by using it to (1) structure a library of eligibility criteria, (2) search for studies enrolling specified study populations, and (3) screen patients for potential eligibility for a study. We therefore demonstrate a new and practical method for incrementally capturing the semantics of free-text eligibility criteria into computable form.

    View details for DOI 10.1016/j.jbi.2010.09.007

    View details for Web of Science ID 000289030100006

    View details for PubMedID 20851207

  • Evaluation of Negation and Uncertainty Detection and its Impact on Precision and Recall in Search JOURNAL OF DIGITAL IMAGING Wu, A. S., Do, B. H., Kim, J., Rubin, D. L. 2011; 24 (2): 234-242


    Radiology reports contain information that can be mined using a search engine for teaching, research, and quality assurance purposes. Current search engines look for exact matches to the search term, but they do not differentiate reports in which the search term appears in a positive context (i.e., being present) from those in which the search term appears in the context of negation and uncertainty. We describe RadReportMiner, a context-aware search engine, and compare its retrieval performance with that of a generic search engine, Google Desktop. We created a corpus of 464 radiology reports, each of which described at least one of five findings (appendicitis, hydronephrosis, fracture, optic neuritis, and pneumonia). Each report was classified by a radiologist as positive (finding described to be present) or negative (finding described to be absent or uncertain). The same reports were then classified by RadReportMiner and Google Desktop. RadReportMiner achieved a higher precision (81%) than Google Desktop (27%; p < 0.0001). RadReportMiner had a lower recall (72%) than Google Desktop (87%; p = 0.006). We conclude that adding negation and uncertainty identification to a word-based radiology report search engine improves the precision of search results over a search engine that does not take this information into account. Our approach may be useful to adopt into current report retrieval systems to help radiologists search for radiology reports more accurately.

    View details for DOI 10.1007/s10278-009-9250-4

    View details for Web of Science ID 000288394700009

    View details for PubMedID 19902298
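
    As an illustration of the idea in the abstract above, the following is a minimal sketch of cue-based negation and uncertainty detection, followed by a precision and recall calculation. It is not RadReportMiner's actual algorithm, which the abstract does not describe; the cue lists, the five-word window, and the function names `classify_mention` and `precision_recall` are all assumptions made for this example.

```python
import re

# Hypothetical cue phrases; a real system would use a much larger, curated list.
NEGATION_CUES = ("no", "without", "negative for", "no evidence of")
UNCERTAINTY_CUES = ("possible", "cannot exclude", "may represent", "suspicious for")


def classify_mention(sentence, term, window=5):
    """Return 'absent', 'negated', 'uncertain', or 'positive' for term in sentence."""
    text = sentence.lower()
    idx = text.find(term.lower())
    if idx < 0:
        return "absent"
    # Inspect only the few words immediately before the first mention.
    preceding = " ".join(text[:idx].split()[-window:])

    def has_cue(cues):
        return any(re.search(r"\b" + re.escape(c) + r"\b", preceding) for c in cues)

    if has_cue(NEGATION_CUES):
        return "negated"
    if has_cue(UNCERTAINTY_CUES):
        return "uncertain"
    return "positive"


def precision_recall(predicted, actual):
    """Compute precision and recall from parallel lists of booleans."""
    tp = sum(1 for p, a in zip(predicted, actual) if p and a)
    fp = sum(1 for p, a in zip(predicted, actual) if p and not a)
    fn = sum(1 for p, a in zip(predicted, actual) if a and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall
```

    A plain keyword search would return every report for which `classify_mention` yields anything other than 'absent'; a context-aware search of this kind would keep only the 'positive' ones, which is one plausible reason precision can rise while recall falls.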

  • Ontology-Assisted Analysis of Web Queries to Determine the Knowledge Radiologists Seek JOURNAL OF DIGITAL IMAGING Rubin, D. L., Flanders, A., Kim, W., Siddiqui, K. M., Kahn, C. E. 2011; 24 (1): 160-164


    Radiologists frequently search the Web to find information they need to improve their practice, and knowing the types of information they seek could be useful for evaluating Web resources. Our goal was to develop an automated method to categorize unstructured user queries using a controlled terminology and to infer the type of information users seek. We obtained the query logs from two commonly used Web resources for radiology. We created a computer algorithm to associate RadLex-controlled vocabulary terms with the user queries. Using the RadLex hierarchy, we determined the high-level category associated with each RadLex term to infer the type of information users were seeking. To test the hypothesis that the term category assignments to user queries are non-random, we compared the distributions of the term categories in RadLex with those in user queries using the chi square test. Of the 29,669 unique search terms found in user queries, 15,445 (52%) could be mapped to one or more RadLex terms by our algorithm. Each query contained an average of one to two RadLex terms, and the dominant categories of RadLex terms in user queries were diseases and anatomy. While the same types of RadLex terms were predominant in both RadLex itself and user queries, the distribution of types of terms in user queries and RadLex were significantly different (p?

    View details for DOI 10.1007/s10278-010-9289-2

    View details for Web of Science ID 000286469600018

    View details for PubMedID 20354755

  • Informatics in Radiology RADTF: A Semantic Search-enabled, Natural Language Processor-generated Radiology Teaching File RADIOGRAPHICS Do, B. H., Wu, A., Biswal, S., Kamaya, A., Rubin, D. L. 2010; 30 (7): 2039-2048


    Storing and retrieving radiology cases is an important activity for education and clinical research, but this process can be time-consuming. In the process of structuring reports and images into organized teaching files, incidental pathologic conditions not pertinent to the primary teaching point can be omitted, as when a user saves images of an aortic dissection case but disregards the incidental osteoid osteoma. An alternate strategy for identifying teaching cases is text search of reports in radiology information systems (RIS), but retrieved reports are unstructured, teaching-related content is not highlighted, and patient identifying information is not removed. Furthermore, searching unstructured reports requires sophisticated retrieval methods to achieve useful results. An open-source, RadLex®-compatible teaching file solution called RADTF, which uses natural language processing (NLP) methods to process radiology reports, was developed to create a searchable teaching resource from the RIS and the picture archiving and communication system (PACS). The NLP system extracts and de-identifies teaching-relevant statements from full reports to generate a stand-alone database, thus converting existing RIS archives into an on-demand source of teaching material. Using RADTF, the authors generated a semantic search-enabled, Web-based radiology archive containing over 700,000 cases with millions of images. RADTF combines a compact representation of the teaching-relevant content in radiology reports and a versatile search engine with the scale of the entire RIS-PACS collection of case material.

    View details for DOI 10.1148/rg.307105083

    View details for Web of Science ID 000284094200021

    View details for PubMedID 20801868

  • Automated Retrieval of CT Images of Liver Lesions on the Basis of Image Similarity: Method and Preliminary Results RADIOLOGY Napel, S. A., Beaulieu, C. F., Rodriguez, C., Cui, J., Xu, J., Gupta, A., Korenblum, D., Greenspan, H., Ma, Y., Rubin, D. L. 2010; 256 (1): 243-252


    To develop a system to facilitate the retrieval of radiologic images that contain similar-appearing lesions and to perform a preliminary evaluation of this system with a database of computed tomographic (CT) images of the liver and an external standard of image similarity. Institutional review board approval was obtained for retrospective analysis of deidentified patient images. Thereafter, 30 portal venous phase CT images of the liver exhibiting one of three types of liver lesions (13 cysts, seven hemangiomas, 10 metastases) were selected. A radiologist used a controlled lexicon and a tool developed for complete and standardized description of lesions to identify and annotate each lesion with semantic features. In addition, this software automatically computed image features on the basis of image texture and boundary sharpness. Semantic and computer-generated features were weighted and combined into a feature vector representing each image. An independent reference standard was created for pairwise image similarity. This was used in a leave-one-out cross-validation to train weights that optimized the rankings of images in the database in terms of similarity to query images. Performance was evaluated by using precision-recall curves and normalized discounted cumulative gain (NDCG), a common measure for the usefulness of information retrieval. When used individually, groups of semantic, texture, and boundary features resulted in various levels of performance in retrieving relevant lesions. However, combining all features produced the best overall results. Mean precision was greater than 90% at all values of recall, and mean, best, and worst case retrieval accuracy was greater than 95%, 100%, and greater than 78%, respectively, with NDCG. Preliminary assessment of this approach shows excellent retrieval results for three types of liver lesions visible on portal venous CT images, warranting continued development and validation in a larger and more comprehensive database.

    View details for DOI 10.1148/radiol.10091694

    View details for Web of Science ID 000279106900029

    View details for PubMedID 20505065
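
    For readers unfamiliar with NDCG, the measure named in the abstract above, here is a minimal sketch of one common formulation (graded relevance divided by the base-2 logarithm of rank plus one, normalized by the ideal ordering). The abstract does not say which NDCG variant the authors used, so this formulation and the function names `dcg` and `ndcg` are assumptions for illustration.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain for relevance scores listed in ranked order."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances):
    """DCG divided by the DCG of the ideal (descending) ordering; 0.0 if all scores are zero."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```

    A ranking that returns the most similar images first scores 1.0; placing less similar images ahead of more similar ones lowers the score.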

  • The caBIG (TM) Annotation and Image Markup Project JOURNAL OF DIGITAL IMAGING Channin, D. S., Mongkolwat, P., Kleper, V., Sepukar, K., Rubin, D. L. 2010; 23 (2): 217-225


    Image annotation and markup are at the core of medical interpretation in both the clinical and the research setting. Digital medical images are managed with the DICOM standard format. While DICOM contains a large amount of meta-data about whom, where, and how the image was acquired, DICOM says little about the content or meaning of the pixel data. An image annotation is the explanatory or descriptive information about the pixel data of an image that is generated by a human or machine observer. An image markup is the graphical symbols placed over the image to depict an annotation. While DICOM is the standard for medical image acquisition, manipulation, transmission, storage, and display, there are no standards for image annotation and markup. Many systems expect annotation to be reported verbally, while markups are stored in graphical overlays or proprietary formats. This makes it difficult to extract and compute with both of them. The goal of the Annotation and Image Markup (AIM) project is to develop a mechanism for modeling, capturing, and serializing image annotation and markup data that can be adopted as a standard by the medical imaging community. The AIM project produces both human- and machine-readable artifacts. This paper describes the AIM information model, schemas, software libraries, and tools so as to prepare researchers and developers for their use of AIM.

    View details for DOI 10.1007/s10278-009-9193-9

    View details for Web of Science ID 000275551400014

    View details for PubMedID 19294468

  • Imaging informatics: toward capturing and processing semantic information in radiology images. Yearbook of medical informatics Rubin, D. L., Napel, S. 2010: 34-42


    To identify challenges and opportunities in imaging informatics that can lead to the use of images for discovery, and that can potentially improve the diagnostic accuracy of imaging professionals. Recent articles on imaging informatics and related articles from PubMed were reviewed and analyzed. Some new developments and challenges that recent research in imaging informatics will meet are identified and discussed. While much literature continues to be devoted to traditional imaging informatics topics of image processing, visualization, and computerized detection, three new trends are emerging: (1) development of ontologies to describe radiology reports and images, (2) structured reporting and image annotation methods to make image semantics explicit and machine-accessible, and (3) applications that use semantic image information for decision support to improve radiologist interpretation performance. The informatics methods being developed have similarities and synergies with recent work in the biomedical informatics community that leverages large high-throughput data sets, and future research in imaging informatics will build on these advances to enable discovery by mining large image databases. Imaging informatics is beginning to develop and apply knowledge representation and analysis methods to image datasets. This type of work, already commonplace in biomedical research with large scale molecular and clinical datasets, will lead to new ways for computers to work with image data. The new advances hold promise for integrating imaging with the rest of the patient record as well as molecular data, for new data-driven discoveries in imaging analogous to those in bioinformatics, and for improved quality of radiology practice.

    View details for PubMedID 20938568

  • The Annotation and Image Mark-up Project RADIOLOGY Channin, D. S., Mongkolwat, P., Kleper, V., Rubin, D. L. 2009; 253 (3): 590-592

    View details for DOI 10.1148/radiol.2533090135

    View details for Web of Science ID 000272247300003

    View details for PubMedID 19952021

  • BioPortal: ontologies and integrated data resources at the click of a mouse NUCLEIC ACIDS RESEARCH Noy, N. F., Shah, N. H., Whetzel, P. L., Dai, B., Dorf, M., Griffith, N., Jonquet, C., Rubin, D. L., Storey, M., Chute, C. G., Musen, M. A. 2009; 37: W170-W173


    Biomedical ontologies provide essential domain knowledge to drive data integration, information retrieval, data annotation, natural-language processing and decision support. BioPortal ( is an open repository of biomedical ontologies that provides access via Web services and Web browsers to ontologies developed in OWL, RDF, OBO format and Protégé frames. BioPortal functionality includes the ability to browse, search and visualize ontologies. The Web interface also facilitates community-based participation in the evaluation and evolution of ontology content by providing features to add notes to ontology terms, mappings between terms and ontology reviews based on criteria such as usability, domain coverage, quality of content, and documentation and support. BioPortal also enables integrated search of biomedical data resources such as the Gene Expression Omnibus (GEO),, and ArrayExpress, through the annotation and indexing of these resources with ontologies in BioPortal. Thus, BioPortal not only provides investigators, clinicians, and developers 'one-stop shopping' to programmatically access biomedical ontologies, but also provides support to integrate data from a variety of biomedical resources.

    View details for DOI 10.1093/nar/gkp440

    View details for Web of Science ID 000267889100031

    View details for PubMedID 19483092

  • Informatics Methods to Enable Patient-centered Radiology ACADEMIC RADIOLOGY Rubin, D. L. 2009; 16 (5): 524-534


    Informatics methods and systems in support of clinical care are well established in the health care enterprise. The new paradigm of patient-centered radiology creates new requirements and challenges that can be addressed by informatics. In particular, computer support can help referring physicians tailor their imaging requests to those procedures that would be most helpful for their patients' clinical context. Informatics methods can assist radiologists in recognizing important findings in images as well as helping them decide the best course of action for patients given the radiologic imaging results and other clinical data. Finally, informatics methods can help engage patients in their care by providing information about their imaging procedures and results. All of these informatics technologies share the ability to bring together critical knowledge filtered according to the specific requirements of patients undergoing radiologic imaging, a key component of patient-centered radiology. The goals of this article are to review the opportunities for informatics in supporting patient-centered radiology, to demonstrate the potential utility of these methods, and to point radiologists to the ways that informatics will help them provide care that is tailored to each patient.

    View details for DOI 10.1016/j.acra.2009.01.009

    View details for Web of Science ID 000265229500004

    View details for PubMedID 19345892

  • Computational neuroanatomy: ontology-based representation of neural components and connectivity BMC BIOINFORMATICS Rubin, D. L., Talos, I., Halle, M., Musen, M. A., Kikinis, R. 2009; 10


    A critical challenge in neuroscience is organizing, managing, and accessing the explosion in neuroscientific knowledge, particularly anatomic knowledge. We believe that explicit knowledge-based approaches to make neuroscientific knowledge computationally accessible will be helpful in tackling this challenge and will enable a variety of applications exploiting this knowledge, such as surgical planning. We developed ontology-based models of neuroanatomy to enable symbolic lookup, logical inference and mathematical modeling of neural systems. We built a prototype model of the motor system that integrates descriptive anatomic and qualitative functional neuroanatomical knowledge. In addition to modeling normal neuroanatomy, our approach provides an explicit representation of abnormal neural connectivity in disease states, such as common movement disorders. The ontology-based representation encodes both structural and functional aspects of neuroanatomy. The ontology-based models can be evaluated computationally, enabling development of automated computer reasoning applications. Neuroanatomical knowledge can be represented in machine-accessible format using ontologies. Computational neuroanatomical approaches such as described in this work could become a key tool in translational informatics, leading to decision support applications that inform and guide surgical planning and personalized care for neurological disease in the future.

    View details for DOI 10.1186/1471-2105-10-S2-S3

    View details for Web of Science ID 000265602500004

    View details for PubMedID 19208191

  • A Controlled Vocabulary to Represent Sonographic Features of the Thyroid and its application in a Bayesian Network to Predict Thyroid Nodule Malignancy. Summit on translational bioinformatics Liu, Y. I., Kamaya, A., Desser, T. S., Rubin, D. L. 2009; 2009: 68-72


    It is challenging to distinguish benign from malignant thyroid nodules on high resolution ultrasound. Many ultrasound features have been studied individually as predictors for thyroid malignancy, none with a high degree of accuracy, and there is no consistent vocabulary used to describe the features. Our hypothesis is that a standard vocabulary will advance accuracy. We performed a systematic literature review and identified all the sonographic features that have been well studied in thyroid cancers. We built a controlled vocabulary to describe sonographic features and to enable us to unify data in the literature on the predictive power of each feature. We used this terminology to build a Bayesian network to predict thyroid malignancy. Our Bayesian network performed similarly to or slightly better than experienced radiologists. Controlled terminology for describing thyroid radiology findings could be useful to characterize thyroid nodules and could enable decision support applications.

    View details for PubMedID 21347173

  • Semantic reasoning with image annotations for tumor assessment. AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium Levy, M. A., O'Connor, M. J., Rubin, D. L. 2009; 2009: 359-363


    Identifying, tracking and reasoning about tumor lesions is a central task in cancer research and clinical practice that could potentially be automated. However, information about tumor lesions in imaging studies is not easily accessed by machines for automated reasoning. The Annotation and Image Markup (AIM) information model recently developed for the cancer Biomedical Informatics Grid provides a method for encoding the semantic information related to imaging findings, enabling their storage and transfer. However, it is currently not possible to apply automated reasoning methods to image information encoded in AIM. We have developed a methodology and a suite of tools for transforming AIM image annotations into OWL, and an ontology for reasoning with the resulting image annotations for tumor lesion assessment. Our methods enable automated inference of semantic information about cancer lesions in images.

    View details for PubMedID 20351880

  • Comparison of concept recognizers for building the Open Biomedical Annotator BMC BIOINFORMATICS Shah, N. H., Bhatia, N., Jonquet, C., Rubin, D., Chiang, A. P., Musen, M. A. 2009; 10


    The National Center for Biomedical Ontology (NCBO) is developing a system for automated, ontology-based access to online biomedical resources (Shah NH, et al.: Ontology-driven indexing of public datasets for translational bioinformatics. BMC Bioinformatics 2009, 10(Suppl 2):S1). The system's indexing workflow processes the text metadata of diverse resources such as datasets from GEO and ArrayExpress to annotate and index them with concepts from appropriate ontologies. This indexing requires the use of a concept-recognition tool to identify ontology concepts in the resource's textual metadata. In this paper, we present a comparison of two concept recognizers - NLM's MetaMap and the University of Michigan's Mgrep. We utilize a number of data sources and dictionaries to evaluate the concept recognizers in terms of precision, recall, speed of execution, scalability and customizability. Our evaluations demonstrate that Mgrep has a clear edge over MetaMap for large-scale service-oriented applications. Based on our analysis we also suggest areas of potential improvements for Mgrep. We have subsequently used Mgrep to build the Open Biomedical Annotator service. The Annotator service has access to a large dictionary of biomedical terms derived from the Unified Medical Language System (UMLS) and NCBO ontologies. The Annotator also leverages the hierarchical structure of the ontologies and their mappings to expand annotations. The Annotator service is available to the community as a REST Web service for creating ontology-based annotations of their data.

    View details for DOI 10.1186/1471-2105-10-S9-S14

    View details for Web of Science ID 000270371700015

    View details for PubMedID 19761568

  • Creating and Curating a Terminology for Radiology: Ontology Modeling and Analysis JOURNAL OF DIGITAL IMAGING Rubin, D. L. 2008; 21 (4): 355-362


    The radiology community has recognized the need to create a standard terminology to improve the clarity of reports, to reduce radiologist variation, to enable access to imaging information, and to improve the quality of practice. This need has recently led to the development of RadLex, a controlled terminology for radiology. The creation of RadLex has proved challenging in several respects: It has been difficult for users to peruse the large RadLex taxonomies and for curators to navigate the complex terminology structure to check it for errors and omissions. In this work, we demonstrate that the RadLex terminology can be translated into an ontology, a representation of terminologies that is both human-browsable and machine-processable. We also show that creating this ontology permits computational analysis of RadLex and enables its use in a variety of computer applications. We believe that adopting an ontology representation of RadLex will permit more widespread use of the terminology and make it easier to collect feedback from the community that will ultimately lead to improving RadLex.

    View details for DOI 10.1007/s10278-007-9073-0

    View details for Web of Science ID 000260689900001

    View details for PubMedID 17874267

  • Network analysis of intrinsic functional brain connectivity in Alzheimer's disease PLOS COMPUTATIONAL BIOLOGY Supekar, K., Menon, V., Rubin, D., Musen, M., Greicius, M. D. 2008; 4 (6)


    Functional brain networks detected in task-free ("resting-state") functional magnetic resonance imaging (fMRI) have a small-world architecture that reflects a robust functional organization of the brain. Here, we examined whether this functional organization is disrupted in Alzheimer's disease (AD). Task-free fMRI data from 21 AD subjects and 18 age-matched controls were obtained. Wavelet analysis was applied to the fMRI data to compute frequency-dependent correlation matrices. Correlation matrices were thresholded to create 90-node undirected-graphs of functional brain networks. Small-world metrics (characteristic path length and clustering coefficient) were computed using graph analytical methods. In the low frequency interval 0.01 to 0.05 Hz, functional brain networks in controls showed small-world organization of brain activity, characterized by a high clustering coefficient and a low characteristic path length. In contrast, functional brain networks in AD showed loss of small-world properties, characterized by a significantly lower clustering coefficient (p<0.01), indicative of disrupted local connectivity. Clustering coefficients for the left and right hippocampus were significantly lower (p<0.01) in the AD group compared to the control group. Furthermore, the clustering coefficient distinguished AD participants from the controls with a sensitivity of 72% and specificity of 78%. Our study provides new evidence that there is disrupted organization of functional brain networks in AD. Small-world metrics can characterize the functional organization of the brain in AD, and our findings further suggest that these network measures may be useful as an imaging-based biomarker to distinguish AD from healthy aging.

    View details for DOI 10.1371/journal.pcbi.1000100

    View details for Web of Science ID 000259786700013

    View details for PubMedID 18584043
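
    The two small-world metrics named in the abstract above can be sketched as follows. This is a minimal standard-library illustration, not the authors' pipeline: it skips the wavelet step, takes an already computed correlation matrix as input, and averages path length over reachable node pairs only. The function names and the choice of threshold are assumptions for this example.

```python
from collections import deque
from itertools import combinations


def threshold_graph(corr, threshold):
    """Build an undirected graph: connect i and j when corr[i][j] exceeds threshold."""
    n = len(corr)
    adj = {i: set() for i in range(n)}
    for i, j in combinations(range(n), 2):
        if corr[i][j] > threshold:
            adj[i].add(j)
            adj[j].add(i)
    return adj


def clustering_coefficient(adj):
    """Mean local clustering coefficient; nodes with fewer than two neighbors count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        # Count edges that exist among this node's neighbors.
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2 * links / (k * (k - 1))
    return total / len(adj)


def characteristic_path_length(adj):
    """Mean shortest-path length over all reachable ordered node pairs (BFS per node)."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else float("inf")
```

    A network with small-world organization combines a high clustering coefficient with a short characteristic path length; the abstract reports that the clustering coefficient was the measure that differed significantly in the AD group.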

  • A data warehouse for integrating radiologic and pathologic data. Journal of the American College of Radiology Rubin, D. L., Desser, T. S. 2008; 5 (3): 210-217


    Much of the information needed for radiology teaching and research is not in the picture archiving and communication system but distributed in hospital information systems throughout the medical enterprise. Our objective is to describe the design, methodology, and implementation of a data warehouse to integrate and make accessible the types of medical data pertinent to radiology research and teaching, and to encourage implementation of similar approaches throughout the radiologic community. We identified desiderata of radiology data warehouses and designed and implemented a prototype system (RadBank) to meet these needs. RadBank was built with open-source software tools on a Linux platform with a relational database. We created a text report parsing module that recognizes the structure of radiology reports and makes individual sections available for indexing and search. A database schema was designed to link radiology and pathology reports and to enable users to retrieve cases using flexible queries. Our system contains more than 2 million radiology and pathology reports, and allows full text search by patient history, findings, and diagnosis by radiology and pathology. RadBank has helped radiologists at our institution find teaching cases and identify research cohorts. Data warehouses can provide radiologists access to important clinical information contained in radiology and pathology reports, and supplement the image information in picture archiving and communication system workstations. We believe that data warehouses similar to our system can be implemented in other radiology departments within a reasonable budget to make their vast radiologic-pathologic case material accessible for education and research.

    View details for DOI 10.1016/j.jacr.2007.09.004

    View details for PubMedID 18312970

  • Tool support to enable evaluation of the clinical response to treatment. AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium Levy, M. A., Rubin, D. L. 2008: 399-403


    Objective criteria for measuring response to cancer treatment are critical to clinical research and practice. The National Cancer Institute has developed the Response Evaluation Criteria in Solid Tumors (RECIST) method to quantify treatment response. RECIST evaluates response by assessing a set of measurable target lesions in baseline and follow-up radiographic studies. However, applying RECIST consistently is challenging due to inter-observer variability among oncologists and radiologists in choice and measurement of target lesions. We analyzed the radiologist-oncologist workflow to determine whether the information collected is sufficient for reliably applying RECIST. We evaluated radiology reports and image markup (radiologists), and clinical flow sheets (oncologists). We found current reporting of radiology results insufficient for consistent application of RECIST, compared with flow sheets. We identified use cases and functional requirements for an informatics tool that could improve consistency and accuracy in applying methods such as RECIST.

    View details for PubMedID 18998923

  • iPad: Semantic annotation and markup of radiological images. AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium Rubin, D. L., Rodriguez, C., Shah, P., Beaulieu, C. 2008: 626-630


    Radiological images contain a wealth of information, such as anatomy and pathology, which is often not explicit and computationally accessible. Information schemes are being developed to describe the semantic content of images, but such schemes can be unwieldy to operationalize because there are few tools to enable users to capture structured information easily as part of the routine research workflow. We have created iPad, an open source tool enabling researchers and clinicians to create semantic annotations on radiological images. iPad hides the complexity of the underlying image annotation information model from users, permitting them to describe images and image regions using a graphical interface that maps their descriptions to structured ontologies semi-automatically. Image annotations are saved in a variety of formats, enabling interoperability among medical records systems, image archives in hospitals, and the Semantic Web. Tools such as iPad can help reduce the burden of collecting structured information from images, and they could ultimately enable researchers and physicians to exploit images on a very large scale and glean the biological and physiological significance of image content.

    View details for PubMedID 18999144

  • A Bayesian classifier for differentiating benign versus malignant thyroid nodules using sonographic features. AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium Liu, Y. I., Kamaya, A., Desser, T. S., Rubin, D. L. 2008: 419-423


    Thyroid nodules are a common yet challenging clinical problem. The vast majority of these nodules are benign; however, deciding which nodule should undergo biopsy is difficult because the imaging appearances of benign and malignant thyroid nodules overlap. High resolution ultrasound is the primary imaging modality for evaluating thyroid nodules. Many sonographic features have been studied individually as predictors for thyroid malignancy. There has been little work to create predictive models that combine multiple predictors, both imaging features and demographic factors. We have created a Bayesian classifier to predict whether a thyroid nodule is benign or malignant using sonographic and demographic findings. Our classifier performed similarly to or slightly better than experienced radiologists when evaluated using 41 thyroid nodules with known pathologic diagnosis. This classifier could be helpful in providing practitioners an objective basis for deciding whether to biopsy suspicious thyroid nodules.

    View details for PubMedID 18999209
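
    To show how a Bayesian classifier of the kind described above can combine several findings into one probability, here is a minimal naive Bayes sketch. It assumes findings are conditionally independent given the diagnosis, which may differ from the authors' model, and the function name, feature name, and all probabilities in the usage note are hypothetical placeholders, not values from the paper or the literature.

```python
def posterior_malignant(prior, likelihoods, findings):
    """Posterior probability of malignancy under a naive Bayes assumption.

    prior: P(malignant) before looking at any finding.
    likelihoods: feature -> (P(present | malignant), P(present | benign)).
    findings: feature -> True if the finding is present, False if absent.
    """
    p_mal, p_ben = prior, 1.0 - prior
    for feature, present in findings.items():
        l_mal, l_ben = likelihoods[feature]
        # Multiply in the probability of the observed state under each diagnosis.
        p_mal *= l_mal if present else 1.0 - l_mal
        p_ben *= l_ben if present else 1.0 - l_ben
    return p_mal / (p_mal + p_ben)
```

    With a hypothetical prior of 0.1 and a hypothetical feature that appears in 60% of malignant and 10% of benign nodules, observing that feature raises the posterior to 0.4: `posterior_malignant(0.1, {"microcalcifications": (0.6, 0.1)}, {"microcalcifications": True})`.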

  • Biomedical ontologies: a functional perspective BRIEFINGS IN BIOINFORMATICS Rubin, D. L., Shah, N. H., Noy, N. F. 2008; 9 (1): 75-90


    The information explosion in biology makes it difficult for researchers to stay abreast of current biomedical knowledge and to make sense of the massive amounts of online information. Ontologies--specifications of the entities, their attributes and relationships among the entities in a domain of discourse--are increasingly enabling biomedical researchers to accomplish these tasks. In fact, bio-ontologies are beginning to proliferate in step with accruing biological data. The myriad of ontologies being created not only enables researchers to solve some of the problems in handling the data explosion but also introduces new challenges. One of the key difficulties in realizing the full potential of ontologies in biomedical research is the isolation of the various communities involved: some workers spend their career developing ontologies and ontology-related tools, while few researchers (biologists and physicians) know how ontologies can accelerate their research. The objective of this review is to give an overview of biomedical ontology in practical terms by providing a functional perspective--describing how bio-ontologies can be and are being used. As biomedical scientists begin to recognize the many different ways ontologies enable biomedical research, they will drive the emergence of new computer applications that will help them exploit the wealth of research data now at their fingertips.

    View details for DOI 10.1093/bib/bbm059

    View details for Web of Science ID 000251864600008

    View details for PubMedID 18077472

  • BioPortal: ontologies and data resources with the click of a mouse. AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium Musen, M. A., Shah, N. H., Noy, N. F., Dai, B. Y., Dorf, M., Griffith, N., Buntrok, J., Jonquet, C., Montegut, M. J., Rubin, D. L. 2008: 1223-1224

    View details for PubMedID 18999306

  • Protege: A tool for managing and using terminology in radiology applications JOURNAL OF DIGITAL IMAGING Rubin, D. L., Noy, N. F., Musen, M. A. 2007; 20: 34-46


    The development of standard terminologies such as RadLex is becoming important in radiology applications, such as structured reporting, teaching file authoring, report indexing, and text mining. The development and maintenance of these terminologies are challenging, however, because there are few specialized tools to help developers browse, visualize, and edit large taxonomies. Protégé is an open-source tool that allows developers to create and to manage terminologies and ontologies. It is more than a terminology-editing tool, as it also provides a platform for developers to use the terminologies in end-user applications. There are more than 70,000 registered users of Protégé who are using the system to manage terminologies and ontologies in many different domains. The RadLex project has recently adopted Protégé for managing its radiology terminology. Protégé provides several features particularly useful for managing radiology terminologies: an intuitive graphical user interface for navigating large taxonomies, visualization components for viewing complex term relationships, and a programming interface so developers can create terminology-driven radiology applications. In addition, Protégé has an extensible plug-in architecture, and its large user community has contributed a rich library of components and extensions that provide much additional useful functionality. In this report, we describe Protégé's features and its particular advantages in the radiology domain in the creation, maintenance, and use of radiology terminology.

    View details for DOI 10.1007/s10278-007-9065-0

    View details for Web of Science ID 000250825300004

    View details for PubMedID 17687607

  • Annotation and query of tissue microarray data using the NCI Thesaurus BMC BIOINFORMATICS Shah, N. H., Rubin, D. L., Espinosa, I., Montgomery, K., Musen, M. A. 2007; 8


    The Stanford Tissue Microarray Database (TMAD) is a repository of data serving a consortium of pathologists and biomedical researchers. The tissue samples in TMAD are annotated with multiple free-text fields, specifying the pathological diagnoses for each sample. These text annotations are not structured according to any ontology, making future integration of this resource with other biological and clinical data difficult. We developed methods to map these annotations to the NCI thesaurus. Using the NCI-T we can effectively represent annotations for about 86% of the samples. We demonstrate how this mapping enables ontology-driven integration and querying of tissue microarray data. We have deployed the mapping and ontology-driven querying tools at the TMAD site for general use. We have demonstrated that we can effectively map the diagnosis-related terms describing a sample in TMAD to the NCI-T. The NCI thesaurus terms have a wide coverage and provide terms for about 86% of the samples. In our opinion the NCI thesaurus can facilitate integration of this resource with other biological data.

    View details for DOI 10.1186/1471-2105-8-296

    View details for Web of Science ID 000249734300001

    View details for PubMedID 17686183

  • Knowledge Zone: A Public Repository of Peer-Reviewed Biomedical Ontologies MEDINFO 2007: PROCEEDINGS OF THE 12TH WORLD CONGRESS ON HEALTH (MEDICAL) INFORMATICS, PTS 1 AND 2 Supekar, K., Rubin, D., Noy, N., Musen, M. 2007; 129: 812-816


    Reuse of ontologies is important for achieving better interoperability among health systems and relieving knowledge engineers from the burden of developing ontologies from scratch. Most of the work that aims to facilitate ontology reuse has focused on building ontology libraries that are simple repositories of ontologies or has led to keyword-based search tools that search among ontologies. To our knowledge, there are no operational methodologies that allow users to evaluate ontologies and to compare them in order to choose the most appropriate ontology for their task. In this paper, we present Knowledge Zone, a Web-based portal that allows users to submit their ontologies, to associate metadata with their ontologies, to search for existing ontologies, to find ontology rankings based on user reviews, to post their own reviews, and to rate reviews.

    View details for Web of Science ID 000272064000163

    View details for PubMedID 17911829

  • LesionViewer: a tool for tracking cancer lesions over time. AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium Levy, M. A., Garg, A., Tam, A., Garten, Y., Rubin, D. L. 2007: 443-447


    Oncologists managing cancer patients use radiology imaging studies to evaluate changes in measurable cancer lesions. Currently, the textual radiology report summarizes the findings, but is disconnected from the primary image data. This makes it difficult for the physician to obtain a visual overview of the location and behavior of the disease. LesionViewer is a prototype software system designed to assist clinicians in comprehending and reviewing radiology imaging studies. The interface provides an Anatomical Summary View of the location of lesions identified in a series of studies, and direct navigation to the relevant primary image data. LesionViewer's Disease Summary View provides a temporal abstraction of the disease behavior between studies utilizing methods of the RECIST guideline. In a usability study, nine physicians used the system to accurately perform clinical tasks appropriate to the analysis of radiology reports and image data. All users reported they would use the system if available.

    View details for PubMedID 18693875

  • Using ontologies linked with geometric models to reason about penetrating injuries ARTIFICIAL INTELLIGENCE IN MEDICINE Rubin, D. L., Dameron, O., Bashir, Y., Grossman, D., Dev, P., Musen, M. A. 2006; 37 (3): 167-176


    Medical assessment of penetrating injuries is a difficult and knowledge-intensive task, and rapid determination of the extent of internal injuries is vital for triage and for determining the appropriate treatment. Physical examination and computed tomographic (CT) imaging data must be combined with detailed anatomic, physiologic, and biomechanical knowledge to assess the injured subject. We are developing a methodology to automate reasoning about penetrating injuries using canonical knowledge combined with specific subject image data. In our approach, we build a three-dimensional geometric model of a subject from segmented images. We link regions in this model to entities in two knowledge sources: (1) a comprehensive ontology of anatomy containing organ identities, adjacencies, and other information useful for anatomic reasoning and (2) an ontology of regional perfusion containing formal definitions of arterial anatomy and corresponding regions of perfusion. We created computer reasoning services ("problem solvers") that use the ontologies to evaluate the geometric model of the subject and deduce the consequences of penetrating injuries. We developed and tested our methods using data from the Visible Human. Our problem solvers can determine the organs that are injured given particular trajectories of projectiles, whether vital structures--such as a coronary artery--are injured, and they can predict the propagation of injury ensuing after vital structures are injured. We have demonstrated the capability of using ontologies with medical images to support computer reasoning about injury based on those images. Our methodology demonstrates an approach to creating intelligent computer applications that reason with image data, and it may have value in helping practitioners in the assessment of penetrating injury.

    View details for DOI 10.1016/j.artmed.2006.03.006

    View details for Web of Science ID 000238992500002

    View details for PubMedID 16730959

  • National Center for Biomedical Ontology: Advancing biomedicine through structured organization of scientific knowledge OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY Rubin, D. L., Lewis, S. E., Mungall, C. J., Misra, S., Westerfield, M., Ashburner, M., Sim, I., Chute, C. G., Solbrig, H., Storey, M., Smith, B., Day-Richter, J., Noy, N. F., Musen, M. A. 2006; 10 (2): 185-198


    The National Center for Biomedical Ontology is a consortium that comprises leading informaticians, biologists, clinicians, and ontologists, funded by the National Institutes of Health (NIH) Roadmap, to develop innovative technology and methods that allow scientists to record, manage, and disseminate biomedical information and knowledge in machine-processable form. The goals of the Center are (1) to help unify the divergent and isolated efforts in ontology development by promoting high quality open-source, standards-based tools to create, manage, and use ontologies, (2) to create new software tools so that scientists can use ontologies to annotate and analyze biomedical data, (3) to provide a national resource for the ongoing evaluation, integration, and evolution of biomedical ontologies and associated tools and theories in the context of driving biomedical projects (DBPs), and (4) to disseminate the tools and resources of the Center and to identify, evaluate, and communicate best practices of ontology development to the biomedical community. Through the research activities within the Center, collaborations with the DBPs, and interactions with the biomedical community, our goal is to help scientists to work more effectively in the e-science paradigm, enhancing experiment design, experiment execution, data analysis, information synthesis, hypothesis generation and testing, and understanding of human disease.

    View details for Web of Science ID 000240210900015

    View details for PubMedID 16901225

  • Coverage of emergency after-hours ultrasound cases: Survey of practices at US teaching hospitals ACADEMIC RADIOLOGY Desser, T. S., Rubin, D. L., Schraedley-Desmond, P. 2006; 13 (2): 249-253


    Diagnostic ultrasound examinations may be performed after-hours by physicians if technologists are not available or cases are complex. Our experience suggested there is wide variability in how ultrasound coverage is provided after-hours, which motivated us to conduct a formal survey of teaching programs around the country. Four hundred five members of the Association of Program Directors in Radiology were contacted by e-mail and sent a link to a five-part questionnaire posted on the Web. Respondents were asked whether ultrasound cases after-hours are performed in their institutions by radiology residents, technologists on the premises after-hours, technologists on-call, or some combination. Data on the type of program, number of beds in the primary hospital, number of residents in the program, and geographic location of the program were recorded. Responses were automatically written to a data file stored on a Web server and then imported into an Excel spreadsheet for data analysis. A chi-square analysis was performed to assess associations among the variables and statistical significance. A total of 79 programs responded to the survey. Of those, 32% provided coverage with ultrasound technologists on call, 24% by ultrasound technologists on the premises, 13% provided combination coverage, and 10% provided coverage solely with residents on call. There was no association among number of residents in the program, location of the program, or type of program (university, community, or affiliated) and type of coverage provided. There is wide variability in methods for providing coverage of after-hours ultrasound cases. However, on-site or on-call coverage of emergency cases by technologists did not appear to depend significantly on program location, program type, or program size.

    View details for DOI 10.1016/j.acra.2005.09.091

    View details for Web of Science ID 000235107800016

    View details for PubMedID 16428062

  • Ontology-based representation of simulation models of physiology. AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium Rubin, D. L., Grossman, D., Neal, M., Cook, D. L., Bassingthwaighte, J. B., Musen, M. A. 2006: 664-668


    Dynamic simulation models of physiology are often represented as a set of mathematical equations. Such models are very useful for studying and understanding the dynamic behavior of physiological variables. However, the sheer number of equations and variables can make these models unwieldy, difficult to understand, and challenging to maintain. We describe a symbolic, ontologically-guided methodology for representing a physiological model of the circulation. We created an ontology describing the types of equations in the model as well as the anatomic components and how they are connected to form a circulatory loop. The ontology provided an explicit representation of the model, both its mathematical and anatomic content, abstracting and hiding much of the mathematical complexity. The ontology also provided a framework to construct a graphical representation of the model, providing a simpler visualization than the large set of mathematical equations. Our approach may help model builders to maintain, debug, and extend simulation models.

    View details for PubMedID 17238424

  • Ontology-based annotation and query of tissue microarray data. AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium Shah, N. H., Rubin, D. L., Supekar, K. S., Musen, M. A. 2006: 709-713


    The Stanford Tissue Microarray Database (TMAD) is a repository of data amassed by a consortium of pathologists and biomedical researchers. The TMAD data are annotated with multiple free-text fields, specifying the pathological diagnoses for each tissue sample. These annotations are spread out over multiple text fields and are not structured according to any ontology, making it difficult to integrate this resource with other biological and clinical data. We developed methods to map these annotations to the NCI thesaurus and the SNOMED-CT ontologies. Using these two ontologies we can effectively represent about 80% of the annotations in a structured manner. This mapping offers the ability to perform ontology driven querying of the TMAD data. We also found that 40% of annotations can be mapped to terms from both ontologies, providing the potential to align the two ontologies based on experimental data. Our approach provides the basis for a data-driven ontology alignment by mapping annotations of experimental data.

    View details for PubMedID 17238433

  • A statistical approach to scanning the biomedical literature for pharmacogenetics knowledge JOURNAL OF THE AMERICAN MEDICAL INFORMATICS ASSOCIATION Rubin, D. L., Thorn, C. F., Klein, T. E., Altman, R. B. 2005; 12 (2): 121-129


    Biomedical databases summarize current scientific knowledge, but they generally require years of laborious curation effort to build, focusing on identifying pertinent literature and data in the voluminous biomedical literature. It is difficult to manually extract useful information embedded in the large volumes of literature, and automated intelligent text analysis tools are becoming increasingly essential to assist in these curation activities. The goal of the authors was to develop an automated method to identify articles in Medline citations that contain pharmacogenetics data pertaining to gene-drug relationships. The authors built and evaluated several candidate statistical models that characterize pharmacogenetics articles in terms of word usage and the profile of Medical Subject Headings (MeSH) used in those articles. The best-performing model was used to scan the entire Medline article database (11 million articles) to identify candidate pharmacogenetics articles. A sampling of the articles identified from scanning Medline was reviewed by a pharmacologist to assess the precision of the method. The authors' approach identified 4,892 pharmacogenetics articles in the literature with 92% precision. Their automated method took a fraction of the time to acquire these articles compared with the time expected to be taken to accumulate them manually. The authors have built a Web resource to provide access to their results. A statistical classification approach can screen the primary literature for pharmacogenetics articles with high precision. Such methods may assist curators in acquiring pertinent literature in building biomedical databases.

    View details for DOI 10.1197/jamia.M1640

    View details for Web of Science ID 000227842000003

    View details for PubMedID 15561790

  • Challenges in converting frame-based ontology into OWL: the Foundational Model of Anatomy case-study. AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium Dameron, O., Rubin, D. L., Musen, M. A. 2005: 181-185


    A description logics representation of the Foundational Model of Anatomy (FMA) in the Web Ontology Language (OWL-DL) would allow developers to combine it with other OWL ontologies, and would provide the benefit of being able to access generic reasoning tools. However, the FMA is currently represented in a frame language. The differences between description logics and frames are not only syntactic, but also semantic. We analyze some theoretical and computational limitations of converting the FMA into OWL-DL. Namely, some of the constructs used in the FMA do not have a direct equivalent in description logics, and a complete conversion of the FMA in description logics is too large to support reasoning. Therefore, an OWL-DL representation of the FMA would have to be optimized for each application. We propose a solution based on OWL-Full, a superlanguage of OWL-DL, that meets the expressiveness requirements and remains application-independent. Specific simplified OWL-DL representations can then be generated from the OWL-Full model by applications. We argue that this solution is easier to implement and closer to the application needs than an integral translation, and that the latter approach would only make the FMA maintenance more difficult.

    View details for PubMedID 16779026

  • Use of description logic classification to reason about consequences of penetrating injuries. AMIA ... Annual Symposium proceedings / AMIA Symposium. AMIA Symposium Rubin, D. L., Dameron, O., Musen, M. A. 2005: 649-653


    The consequences of penetrating injuries can be complex, including abnormal blood flow through the injury channel and functional impairment of organs if arteries supplying them have been severed. Determining the consequences of such injuries can be posed as a classification problem, requiring a priori symbolic knowledge of anatomy. We hypothesize that such symbolic knowledge can be modeled using ontologies, and that the reasoning task can be accomplished using knowledge representation in description logics (DL) and automatic classification. We demonstrate the capabilities of automated classification using the Web Ontology Language (OWL) to reason about the consequences of penetrating injuries. We created in OWL a knowledge model of chest and heart anatomy describing the heart structure and the surrounding anatomic compartments, as well as the perfusion of regions of the heart by branches of the coronary arteries. We then used a domain-independent classifier to infer ischemic regions of the heart as well as anatomic spaces containing ectopic blood secondary to the injuries. Our results highlight the advantages of posing reasoning problems as a classification task, and leveraging the automatic classification capabilities of DL to create intelligent applications.

    View details for PubMedID 16779120

  • Using an Ontology of Human Anatomy to Inform Reasoning with Geometric Models MEDICINE MEETS VIRTUAL REALITY 13: THE MAGICAL NEXT BECOMES THE MEDICAL NOW Rubin, D. L., Bashir, Y., Grossman, D., Dev, P., Musen, M. A. 2005; 111: 429-435


    The Virtual Soldier project is a large effort on the part of the U.S. Defense Advanced Research Projects Agency to explore using both general anatomical knowledge and specific computed tomographic (CT) images of individual soldiers to aid the rapid diagnosis and treatment of penetrating injuries. Our goal is to develop intelligent computer applications that use this knowledge to reason about the anatomic structures that are directly injured and to predict propagation of injuries secondary to primary organ damage. To accomplish this, we needed to develop an architecture to combine geometric data with anatomic knowledge and reasoning services that use this information to predict the consequences of injuries.

    View details for Web of Science ID 000273828700086

    View details for PubMedID 15718773

  • A resource to acquire and summarize pharmacogenetics knowledge in the literature MEDINFO 2004: PROCEEDINGS OF THE 11TH WORLD CONGRESS ON MEDICAL INFORMATICS, PT 1 AND 2 Rubin, D. L., Carrillo, M., Woon, M., Conroy, J., Klein, T. E., Altman, R. B. 2004; 107: 793-797


    To determine how genetic variations contribute to the variations in drug response, we need to know the genes that are related to drugs of interest. But there are no publicly available databases of known gene-drug relationships, and it is time-consuming to search the literature for this information. We have developed a resource to support the storage, summarization, and dissemination of key gene-drug interactions of relevance to pharmacogenetics. Extracting all gene-drug relationships from the literature is a daunting task, so we distributed a tool to acquire this knowledge from the scientific community. We also developed a categorization scheme to classify gene-drug relationships according to the type of pharmacogenetic evidence that supports them. Our resource can be queried by gene or drug, and it summarizes gene-drug relationships, categories of evidence, and supporting literature. This resource is growing, containing entries for 138 genes and 215 drugs of pharmacogenetics significance, and is a core component of PharmGKB, a pharmacogenetics knowledge base.

    View details for Web of Science ID 000226723300159

    View details for PubMedID 15360921

  • Linking ontologies with three-dimensional models of anatomy to predict the effects of penetrating injuries. Conference proceedings : ... Annual International Conference of the IEEE Engineering in Medicine and Biology Society. IEEE Engineering in Medicine and Biology Society. Conference Rubin, D. L., Bashir, Y., Grossman, D., Dev, P., Musen, M. A. 2004; 5: 3128-3131


    Rapid diagnosis of penetrating injuries is essential to increasing the chance of survival. Geometric models representing anatomic structures could be useful, but such models generally contain only information about the relationships of points in space as well as display properties. We describe an approach to predicting the anatomic consequences of penetrating injury by creating a geometric model of anatomy that integrates biomechanical and anatomic knowledge. We created a geometric model of the heart from the Visible Human image data set. We linked this geometric model of anatomy with an ontology of descriptive anatomic knowledge. A hierarchy of abstract geometric objects was created that represents organs and organ parts. These geometric objects contain information about organ identity, composition, adjacency, and tissue biomechanical properties. This integrated model can support anatomic reasoning. Given a bullet trajectory and a parametric representation of a cone of tissue damage, we can use our model to predict the organs and organ parts that are injured. Our model is extensible, being able to incorporate future information, such as physiological implications of organ injuries.

    View details for PubMedID 17270942

  • Indexing pharmacogenetic knowledge on the World Wide Web PHARMACOGENETICS Altman, R. B., Flockhart, D. A., Sherry, S. T., Oliver, D. E., Rubin, D. L., Klein, T. E. 2003; 13 (1): 3-5

    View details for Web of Science ID 000180584000002

    View details for PubMedID 12544507

  • PharmGKB: The Pharmacogenetics Knowledge Base NUCLEIC ACIDS RESEARCH Hewett, M., Oliver, D. E., Rubin, D. L., Easton, K. L., Stuart, J. M., Altman, R. B., Klein, T. E. 2002; 30 (1): 163-165


    The Pharmacogenetics Knowledge Base (PharmGKB) contains genomic, phenotype and clinical information collected from ongoing pharmacogenetic studies. Tools to browse, query, download, submit, edit and process the information are available to registered research network members. A subset of the tools is publicly available. PharmGKB currently contains over 150 genes under study, 14 Coriell populations and a large ontology of pharmacogenetics concepts. The pharmacogenetic concepts and the experimental data are interconnected by a set of relations to form a knowledge base of information for pharmacogenetic researchers. The information in PharmGKB, and its associated tools for processing that information, are tailored for leading-edge pharmacogenetics research. The PharmGKB project was initiated in April 2000 and the first version of the knowledge base went online in February 2001.

    View details for Web of Science ID 000173077100041

    View details for PubMedID 11752281

  • Automating data acquisition into ontologies from pharmacogenetics relational data sources using declarative object definitions and XML. Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing Rubin, D. L., Hewett, M., Oliver, D. E., Klein, T. E., Altman, R. B. 2002: 88-99


    Ontologies are useful for organizing large numbers of concepts having complex relationships, such as the breadth of genetic and clinical knowledge in pharmacogenomics. But because ontologies change and knowledge evolves, it is time consuming to maintain stable mappings to external data sources that are in relational format. We propose a method for interfacing ontology models with data acquisition from external relational data sources. This method uses a declarative interface between the ontology and the data source, and this interface is modeled in the ontology and implemented using XML schema. Data is imported from the relational source into the ontology using XML, and data integrity is checked by validating the XML submission with an XML schema. We have implemented this approach in PharmGKB, a pharmacogenetics knowledge base. Our goals were to (1) import genetic sequence data, collected in relational format, into the pharmacogenetics ontology, and (2) automate the process of updating the links between the ontology and data acquisition when the ontology changes. We tested our approach by linking PharmGKB with data acquisition from a relational model of genetic sequence information. The ontology subsequently evolved, and we were able to rapidly update our interface with the external data and continue acquiring the data. Similar approaches may be helpful for integrating other heterogeneous information sources in order to make the diversity of pharmacogenetics data amenable to computational analysis.

    View details for PubMedID 11928521

  • Representing genetic sequence data for pharmacogenomics: an evolutionary approach using ontological and relational models. Bioinformatics Rubin, D. L., Shafa, F., Oliver, D. E., Hewett, M., Altman, R. B. 2002; 18: S207-15


    The information model chosen to store biological data affects the types of queries possible, database performance, and difficulty in updating that information model. Genetic sequence data for pharmacogenetics studies can be complex, and the best information model to use may change over time. As experimental and analytical methods change, and as biological knowledge advances, the data storage requirements and types of queries needed may also change. We developed a model for genetic sequence and polymorphism data, and used XML Schema to specify the elements and attributes required for this model. We implemented this model as an ontology in a frame-based representation and as a relational model in a database system. We collected genetic data from two pharmacogenetics resequencing studies, and formulated queries useful for analysing these data. We compared the ontology and relational models in terms of query complexity, performance, and difficulty in changing the information model. Our results demonstrate benefits of evolving the schema for storing pharmacogenetics data: ontologies perform well in early design stages as the information model changes rapidly and simplify query formulation, while relational models offer improved query speed once the information model and types of queries needed stabilize.

    View details for PubMedID 12169549

  • Ontology development for a pharmacogenetics knowledge base. Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing Oliver, D. E., Rubin, D. L., Stuart, J. M., Hewett, M., Klein, T. E., Altman, R. B. 2002: 65-76


    Research directed toward discovering how genetic factors influence a patient's response to drugs requires coordination of data produced from laboratory experiments, computational methods, and clinical studies. A public repository of pharmacogenetic data should accelerate progress in the field of pharmacogenetics by organizing and disseminating public datasets. We are developing a pharmacogenetics knowledge base (PharmGKB) to support the storage and retrieval of both experimental data and conceptual knowledge. PharmGKB is an Internet-based resource that integrates complex biological, pharmacological, and clinical data in such a way that researchers can submit their data and users can retrieve information to investigate genotype-phenotype correlations. Successful management of the names, meaning, and organization of concepts used within the system is crucial. We have selected a frame-based knowledge-representation system for development of an ontology of concepts and relationships that represent the domain and that permit storage of experimental data. Preliminary experience shows that the ontology we have developed for gene-sequence data allows us to accept, store, and query data submissions.

    View details for PubMedID 11928517

  • Integrating genotype and phenotype information: an overview of the PharmGKB project. Pharmacogenetics Research Network and Knowledge Base. Pharmacogenomics Journal Klein, T. E., Chang, J. T., Cho, M. K., Easton, K. L., Fergerson, R., Hewett, M., Lin, Z., Liu, Y., Liu, S., Oliver, D. E., Rubin, D. L., Shafa, F., Stuart, J. M., Altman, R. B. 2001; 1 (3): 167-170

    View details for PubMedID 11908751



    The authors discuss the influence of viscosity on the imaging properties of WIN 39996 suspension. WIN 39996 suspension is a magnetically susceptible iron ferrite that provides negative (darkening) contrast enhancement in magnetic resonance imaging of the gastrointestinal tract. The viscosity of WIN 39996 suspension was altered by various stress conditions (1 week to 4.5 months storage at temperatures of 5 °C to 70 °C) or by various amounts of xanthan gum. Magnetic resonance imaging was performed in vitro on phantoms and in vivo on the gastrointestinal tract of anesthetized dogs. The results indicated that in vitro and in vivo imaging efficacies of WIN 39996 suspension depended on the viscosity, irrespective of the means by which the viscosity was altered. Specifically, the imaging quality was suitable at viscosities ≥ 36.6 cP for in vitro imaging, and > 25 cP for in vivo imaging. The lower in vivo viscosity limit for magnetic resonance imaging compared with the in vitro limit may be due to gastrointestinal peristaltic activities continuously mixing the WIN 39996 suspension to prevent gravitational settling, and the enhancement of signal blackening by intraluminal WIN 39996 that was above and below the plane of image. It is speculated that the imaging quality of WIN 39996 suspension depends on the degree of dispersion of the magnetically susceptible iron ferrite in the WIN 39996 suspension, and that a minimum viscosity is needed to ensure such dispersion.

    View details for Web of Science ID A1995RA15600005

    View details for PubMedID 7635672

  • DYNAMICS OF TUMOR IMAGING WITH GD-DTPA POLYETHYLENE-GLYCOL POLYMERS - DEPENDENCE ON MOLECULAR-WEIGHT JOURNAL OF MAGNETIC RESONANCE IMAGING Desser, T. S., Rubin, D. L., Muller, H. H., Qing, F., KHODOR, S., Zanazzi, G., Young, S. W., Ladd, D. L., WELLONS, J. A., Kellar, K. E., Toner, J. L., Snow, R. A. 1994; 4 (3): 467-472


    Macromolecular contrast media offer potential advantages over freely diffusible agents in magnetic resonance (MR) imaging outside the central nervous system. To identify an optimum molecular weight for macromolecular contrast media, the authors studied a novel macromolecular contrast agent, gadolinium diethylenetriaminepentaacetic acid polyethylene glycol (DTPA-PEG), synthesized in seven polymer (average) molecular weights ranging from 10 to 83 kd. Twenty-eight rabbits bearing V2 carcinoma in thighs underwent T1-weighted spin-echo imaging before injection and 5-60 minutes and 24 hours after injection of the Gd-DTPA-PEG polymers or Gd-DTPA at a gadolinium dose of 0.1 mmol/kg. Tumor region-of-interest measurements were obtained at each time point to determine contrast enhancement dynamics. Blood-pool enhancement dynamics were observed for the Gd-DTPA-PEG polymers larger than 20 kd. Polymers smaller than 20 kd displayed dynamics similar to those of the freely diffusible agent Gd-DTPA. Above the 20 kd threshold, tumor enhancement was more rapid for smaller polymers. The authors conclude that the 21.9-kd Gd-DTPA-PEG polymer is best suited for clinical MR imaging.

    View details for Web of Science ID A1994NP29200033

    View details for PubMedID 8061449



    Magnetically susceptible iron oxide (MSIO) contrast agents for magnetic resonance imaging (MRI) of the gastrointestinal (GI) tract are limited because they produce magnetic susceptibility artifacts. To determine whether oral magnetic particles (WIN 39996) can be an effective MRI contrast agent without producing induced image artifacts, we optimized a liquid formulation of WIN 39996. A range of concentrations (25-250 micrograms iron/mL) and viscosities (1-600 cP) was imaged in a phantom at 1.5 T using conventional spin-echo and gradient-recalled echo pulse sequences. Some formulations also contained titanium. All concentrations of WIN 39996 at 1 cP produced susceptibility artifacts. For formulations in the 150 to 600 cP range, the 125 to 150 micrograms/mL concentrations produced signal blackening and magnetic susceptibility image distortion comparable to an air control. Concentrations greater than 150 micrograms/mL were unacceptable because they produced significant susceptibility artifacts, while concentrations less than 125 micrograms/mL were undesirable because they produced insufficient signal blackening. These preliminary in-vitro studies suggest that an optimized liquid formulation of WIN 39996 can be produced that yields excellent negative contrast without producing image artifacts.

    View details for Web of Science ID A1994NA65700013

    View details for PubMedID 8144343



    Recent in vitro studies suggested there is an optimal range of concentration and viscosity for a liquid formulation of oral magnetic particles (WIN 39996) for magnetic resonance (MR) imaging of the gastrointestinal (GI) tract. To determine whether this formulation is also effective in vivo and whether differing viscosity and administration regimen affect GI distribution of the contrast agent, a range of concentrations of iron (75, 150, and 200 micrograms/mL) and viscosities (1, 150, and 600 cp) were imaged in dogs at 1.5 T with conventional spin-echo and fat-saturation pulse sequences. The effects of dose regimen (single vs divided dose) and subject position (supine vs right lateral decubitus) were also studied. The 75 and 200 micrograms/mL concentrations were unacceptable for MR imaging, while 150 micrograms/mL was effective. The GI distribution of the contrast agent was affected jointly by viscosity, subject position, and dose regimen. The 150 micrograms/mL formulation produced excellent GI contrast enhancement in vivo for both 150- and 600-cp viscosities. The choice of optimal viscosity may depend on the preferred administration regimen.

    View details for Web of Science ID A1993KJ72500016

    View details for PubMedID 8428076



    Complete and homogeneous distribution of gastrointestinal (GI) contrast media are important factors for their effective use in computed tomography as well as in magnetic resonance (MR) imaging. A radiographic method (using fluoroscopy or spot films) could be effective for monitoring intestinal filling with GI contrast agents for MR imaging (GICMR), but it would require the addition of a radiopaque agent to most GICMR. This study was conducted to determine the minimum amount of barium additive necessary to be radiographically visible and to evaluate whether this additive influences the signal characteristics of the GICMR. A variety of barium sulfate preparations (3-12% wt/vol) were tested in dogs to determine the minimum quantity needed to make the administered agent visible during fluoroscopy and on abdominal radiographs. Solutions of 10 different potential GI contrast agents (Gd-DTPA, ferric ammonium citrate, Mn-DPDP, chromium-EDTA, gadolinium-oxalate, ferrite particles, water, mineral oil, lipid emulsion, and methylcellulose) were prepared without ("nondoped") and with ("doped") the barium sulfate additive. MR images of the solutions in tubes were obtained at 0.38 T using 10 different spin-echo pulse sequences. Region of interest (ROI) measurements of contrast agent signal intensity (SI) were made. In addition, for the paramagnetic contrast media, the longitudinal and transverse relaxivity (R1 and R2) were measured. A 6% wt/vol suspension of barium was the smallest concentration yielding adequate radiopacity in the GI tract. Except for gadolinium-oxalate, there was no statistically significant difference in SI for doped and nondoped solutions with most pulse sequences used. In addition, the doped and nondoped solutions yielded R1 and R2 values which were comparable. We conclude that barium sulfate 6% wt/vol added to MR contrast agents produces a suspension with sufficient radiodensity to be viewed radiographically, and it does not cause significant alteration in the MR signal appearance of most GICMR. These formulations can be useful for achieving optimal filling of the gastrointestinal tract prior to MRI.

    View details for Web of Science ID A1992HA59900015

    View details for PubMedID 1734177



    Efforts to develop satisfactory intraluminal gastrointestinal contrast agents for magnetic resonance (MR) imaging have focused on depicting only the bowel lumen to exclude possible involvement by a pathologic process. To determine whether the bowel wall can be adequately imaged with use of the contrast agent and whether bowel wall visualization is a better index of the utility of the contrast agent for MR imaging, perfluorooctylbromide (PFOB) was studied in human subjects. Twenty consecutive patients referred for abdominal or pelvic MR imaging were selected. All patients were given 400-1,000 mL of PFOB orally. MR imaging was performed at 0.38 and 1.5 T with T1- and T2-weighted spin-echo pulse sequences before and after administration of PFOB. The images were graded independently by three blinded readers. All readers reported significantly superior conspicuity of the bowel lumen and wall after PFOB than before PFOB administration (P < .002). Among the post-PFOB studies, those with superior bowel wall visualization demonstrated superior overall image quality. In three patients, lesions were optimally demonstrated because the relationship of the process to the bowel wall, rather than just to the lumen, was identified. In two patients, masses arising within the bowel wall could be identified prospectively only when the bowel wall was adequately imaged. The authors conclude that while lumen identification is improved with PFOB, its greatest clinical utility may be in facilitating intestinal wall visualization.

    View details for Web of Science ID A1991HA76500013

    View details for PubMedID 1802151



    Comparison of the effectiveness of various gastrointestinal (GI) contrast agents for magnetic resonance (MR) imaging is often complicated by varying amounts of intraluminal filling with the orally administered agents. To achieve more uniform and reproducible imaging results with GI contrast agents for MR imaging (GICMR), we evaluated a radiographic method for monitoring intraluminal filling and distribution. Solutions of Mn-DPDP (2 mM), to which a small amount of barium sulfate (6 wt/vol%) was added, were administered orally to dogs. Gastric emptying and small bowel transit were monitored fluoroscopically. MR imaging was performed either 1) at a fixed time after administration of the contrast agent or 2) at a variable interval when the contrast agent was observed fluoroscopically to be in the terminal ileum. When initiation of MR imaging was guided by fluoroscopic monitoring of intestinal contrast distribution, uniform and reproducible intestinal contrast enhancement by GICMR was achieved. However, when MR imaging was performed at a fixed time interval after oral administration, non-uniform and variable GI visualization was obtained, and this corresponded to the variable intestinal distribution observed fluoroscopically. We conclude that reproducible intestinal filling with orally administered contrast agents can be accomplished with a radiographic monitoring technique, and this promotes more consistent GI visualization on MR images. Such standardized and reproducible methods are necessary for studies in which the effectiveness of GI contrast media for MR imaging is evaluated and compared.

    View details for Web of Science ID A1991FW09600006

    View details for PubMedID 1908931



    Newly developed ferromagnetic catheters (Fe-Caths) are more conspicuous than conventional radiographic catheters (Rad-Caths) on magnetic resonance (MR) images because they produce recognizable ferromagnetic signal patterns (FSPs). To determine how MRI parameters influence these patterns, the imaging characteristics of nine Fe-Caths (ferromagnetic concentration 0.01 to 1.0 weight/weight %) were studied systematically and compared with three Rad-Caths. All catheters were studied in stationary and moving phantoms at mid-field (0.38 T) and high-field (1.5 T) strength using spin-echo and gradient-echo pulse sequences. Rad-Caths always produced a signal void. Fe-Caths produced FSPs, the size of which depended on the orientation of the catheter with respect to the main magnetic field, the concentration of ferromagnetic agent in the catheter, and the direction and strength of the frequency encoding gradient. When Fe-Caths were positioned perpendicular to the main magnetic field, they produced FSPs; however, when they were parallel to the main magnetic field, Fe-Caths produced no FSP, thus having a similar appearance to the Rad-Caths. Ferromagnetic catheters produce conspicuous patterns on MR images that depend on catheter orientation in the main magnetic field and vary predictably with the MRI parameters.

    View details for Web of Science ID A1990EM20500007

    View details for PubMedID 2279913



    A new hepatobiliary contrast agent (Mn-DPDP) was used in the detection of liver metastases in six rabbits with seven hepatic V2 carcinomas. This contrast agent is derived from pyridoxyl-5-phosphate which is biomimetically designed to be secreted by the hepatocyte. After Mn-DPDP administration, a 105% increase in liver signal to noise was obtained using a 200/20 (TR/TE) pulsing sequence, and a 62% decrease in intensity was observed using a 1200/60 pulsing sequence. Liver V2 carcinoma contrast enhancement increased 427% using the 200/20 pulsing sequence and 176% using the 1200/60 pulsing sequence. Four of seven V2 carcinomas were not detectable prior to the administration of Mn-DPDP (50 µmol/kg). Two neoplasms were only detectable in retrospect (after Mn-DPDP) on the 1200/60 sequence. The smallest neoplasms detected in this study were 1-4 mm. Mn-DPDP appears to be a promising MRI contrast agent.

    View details for Web of Science ID A1990DL77400011

    View details for PubMedID 2114511



    Rotaviruses are icosahedral viruses with a segmented, double-stranded RNA genome. They are the major cause of severe infantile infectious diarrhea. Rotavirus growth in tissue culture is markedly enhanced by pretreatment of virus with trypsin. Trypsin activation is associated with cleavage of the viral hemagglutinin (viral protein 3 [VP3]; 88 kilodaltons) into two fragments (60 and 28 kilodaltons). The mechanism by which proteolytic cleavage leads to enhanced growth is unknown. Cleavage of VP3 does not alter viral binding to cell monolayers. In previous electron microscopic studies of infected cell cultures, it has been demonstrated that rotavirus particles enter cells by both endocytosis and direct cell membrane penetration. To determine whether trypsin treatment affected rotavirus internalization, we studied the kinetics of entry of infectious rhesus rotavirus (RRV) into MA104 cells. Trypsin-activated RRV was internalized with a half-time of 3 to 5 min, while nonactivated virus disappeared from the cell surface with a half-time of 30 to 50 min. In contrast to trypsin-activated RRV, loss of nonactivated RRV from the cell surface did not result in the appearance of infection, as measured by plaque formation. Endocytosis inhibitors (sodium azide, dinitrophenol) and lysosomotropic agents (ammonium chloride, chloroquine) had a limited effect on the entry of infectious virus into cells. Purified trypsin-activated RRV added to cell monolayers at pH 7.4 mediated 51Cr, [14C]choline, and [3H]inositol release from prelabeled MA104 cells. This release could be specifically blocked by neutralizing antibodies to VP3. These results suggest that MA104 cell infection follows the rapid entry of trypsin-activated RRV by direct cell membrane penetration. Cell membrane penetration of infectious RRV is initiated by trypsin cleavage of VP3. Neutralizing antibodies can inhibit this direct membrane penetration.

    View details for Web of Science ID A1988M444000007

    View details for PubMedID 2831376

  • PULMONARY-FUNCTION IN ADVANCED PULMONARY-HYPERTENSION THORAX Burke, C. M., Glanville, A. R., MORRIS, A. J., Rubin, D., Harvey, J. A., Theodore, J., Robin, E. D. 1987; 42 (2): 131-135


    Pulmonary mechanical function and gas exchange were studied in 33 patients with advanced pulmonary vascular disease, resulting from primary pulmonary hypertension in 18 cases and from Eisenmenger physiology in 15 cases. Evidence of airway obstruction was found in most patients. In addition, mean total lung capacity (TLC) was only 81.5% of predicted and 27% of our subjects had values of TLC less than one standard deviation below the mean predicted value. The mean value for transfer factor (TLCO) was 71.8% of predicted and appreciable arterial hypoxaemia was present, which was disproportionate to the mild derangements in pulmonary mechanics. Patients with Eisenmenger physiology had significantly lower values of arterial oxygen tension (PaO2) (p < 0.05) and of maximum mid expiratory flow (p < 0.05) and significantly higher pulmonary arterial pressure (p < 0.05) than those with primary pulmonary hypertension, but no other variables were significantly different between the two subpopulations. It is concluded that advanced pulmonary vascular disease in patients with primary pulmonary hypertension and Eisenmenger physiology is associated not only with severe hypoxaemia but also with altered pulmonary mechanical function.

    View details for Web of Science ID A1987F940800010

    View details for PubMedID 3433237

Conference Proceedings

  • Computing Human Image Annotation Channin, D. S., Mongkolwat, P., Kleper, V., Rubin, D. L. IEEE. 2009: 7065-7068


    An image annotation is the explanatory or descriptive information about the pixel data of an image that is generated by a human (or machine) observer. An image markup is the graphical symbols placed over the image to depict an annotation. In the majority of current clinical and research imaging practice, markup is captured in proprietary formats and annotations are referenced only in free text radiology reports. This makes these annotations difficult to query, retrieve and compute upon, hampering their integration into other data mining and analysis efforts. This paper describes the National Cancer Institute's Cancer Biomedical Informatics Grid's (caBIG) Annotation and Image Markup (AIM) project, focusing on how to use AIM to query for annotations. The AIM project delivers an information model for image annotation and markup. The model uses controlled terminologies for important concepts. All of the classes and attributes of the model have been harmonized with the other models and common data elements in use at the National Cancer Institute. The project also delivers XML schemata necessary to instantiate AIMs in XML as well as a software application for translating AIM XML into DICOM S/R and HL7 CDA. Large collections of AIM annotations can be built and then queried as Grid or Web services. Using the tools of the AIM project, image annotations and their markup can be captured and stored in human and machine readable formats. This enables the inclusion of human image observation and inference as part of larger data mining and analysis activities.

    View details for Web of Science ID 000280543605223

    View details for PubMedID 19964202

  • A Bayesian network for mammography Burnside, E., Rubin, D., Shachter, R. HANLEY & BELFUS INC. 2000: 106-110


    The interpretation of a mammogram and decisions based on it involve reasoning and management of uncertainty. The wide variation of training and practice among radiologists results in significant variability in screening performance with attendant cost and efficacy consequences. We have created a Bayesian belief network to integrate the findings on a mammogram, based on the standardized lexicon developed for mammography, the Breast Imaging Reporting And Data System (BI-RADS). Our goal in creating this network is to explore the probabilistic underpinnings of this lexicon as well as standardize mammographic decision-making to the level of expert knowledge.

    View details for Web of Science ID 000170207500023

    View details for PubMedID 11079854

  • NANOPARTICULATE CONTRAST-MEDIA - BLOOD-POOL AND LIVER-SPLEEN IMAGING Rubin, D. L., Desser, T. S., Qing, F., Muller, H. H., Young, S. W., McIntire, G. L., Bacon, E., Cooper, E., Toner, J. LIPPINCOTT WILLIAMS & WILKINS. 1994: S280-S283

    View details for Web of Science ID A1994NX79500096

    View details for PubMedID 7928256


    View details for Web of Science ID A1994NX79500022

    View details for PubMedID 7928274

Stanford Medicine Resources: