Comparison of Orthogonal NLP Methods for Clinical Phenotyping and Assessment of Bone Scan Utilization among Prostate Cancer Patients.
Journal of biomedical informatics
Clinical and Metabolic Correlates of Calcium Oxalate Stone Subtypes: Implications for Etiology and Management.
Journal of endourology
Clinical care guidelines recommend that newly diagnosed prostate cancer patients at high risk for metastatic spread receive a bone scan prior to treatment and that low-risk patients not receive one. The objective was to develop an automated pipeline to interrogate heterogeneous data and evaluate the use of bone scans using two different Natural Language Processing (NLP) approaches. Our cohort was divided into risk groups based on Electronic Health Records (EHR). Information on bone scan utilization was identified in both structured data and free text from clinical notes. Our pipeline annotated sentences with a combination of a rule-based method using the ConText algorithm (a generalization of NegEx) and a Convolutional Neural Network (CNN) method using word2vec to produce word embeddings. A total of 5,500 patients and 369,764 notes were included in the study. A total of 39% of patients were high-risk, and 73% of these received a bone scan; of the 18% of patients who were low-risk, 10% received one. The CNN model outperformed the rule-based model (F-measure = 0.918 and 0.897, respectively). We demonstrate that a combination of both models could maximize precision or recall, depending on the study question. Using structured data, we accurately classified patients' cancer risk group, identified bone scan documentation with two NLP methods, and evaluated guideline adherence. Our pipeline can be used to provide concrete feedback to clinicians and guide treatment decisions.
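The abstract above states that combining the two models can favor either precision or recall. The sketch below shows one common way to do this, under stated assumptions: sentence-level binary labels, an AND rule (both models must agree) to favor precision, and an OR rule (either model suffices) to favor recall. The abstract does not specify how the authors combined their models, so the combination rules, function names, and toy labels are all hypothetical illustrations.

```python
from typing import List, Sequence, Tuple


def precision_recall_f1(gold: Sequence[int], pred: Sequence[int]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) for binary labels (1 = positive)."""
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def combine_and(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Positive only when both models agree; tends to raise precision."""
    return [int(bool(x) and bool(y)) for x, y in zip(a, b)]


def combine_or(a: Sequence[int], b: Sequence[int]) -> List[int]:
    """Positive when either model fires; tends to raise recall."""
    return [int(bool(x) or bool(y)) for x, y in zip(a, b)]


# Toy, made-up labels for eight sentences (not data from the study).
gold = [1, 1, 1, 1, 0, 0, 0, 0]
rule_pred = [1, 1, 0, 0, 1, 0, 0, 0]  # hypothetical rule-based output
cnn_pred = [1, 0, 1, 0, 0, 1, 0, 0]   # hypothetical CNN output

and_scores = precision_recall_f1(gold, combine_and(rule_pred, cnn_pred))
or_scores = precision_recall_f1(gold, combine_or(rule_pred, cnn_pred))
```

On this toy data the AND combination never produces a false positive, while the OR combination recovers more of the true positives than either model alone; which trade-off is preferable depends on the study question.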
View details for PubMedID 31014980
Is it possible to automatically assess pretreatment digital rectal examination documentation using natural language processing? A single-centre retrospective study.
2019; 9 (7): e027182
Calcium oxalate (CaOx) is the predominant component of renal calculi and can be divided into two subtypes: CaOx-monohydrate (COM) and CaOx-dihydrate (COD). COM and COD form in differing urinary environments, which suggests that different underlying metabolic abnormalities are associated with each subtype. We compared clinical and metabolic findings in CaOx stone-formers to delineate factors differentiating COD and COM stone-formers and the implications for etiology and treatment. We identified CaOx stone-formers who had passed their stones or had undergone endoscopic extraction between October 2014 and December 2018. Only patients who had a predominant subtype (≥ 80% COM or COD) and who had a 24-hour urine evaluation prior to medical management were included. Clinical and metabolic factors were compared between the two subgroups. Out of 157 stone-formers, 121 were COM and 36 were COD. COD formers were younger than COM formers, with a mean age of 53±16 vs 59±15, respectively (p=0.038). There were no observable differences in gender, BMI, HTN, DM, or HLD. COM formers exhibited higher rates of hypocitraturia and hyperoxaluria (p=0.022 and p=0.018, respectively). Conversely, COD formers had significantly higher rates of hypercalciuria (47% vs. 28%, p=0.012). Multivariate analysis found hypercalciuria to independently predict COD stones (p=0.043) and hyperoxaluria to predict COM stones (p=0.016). COM formers are more likely to have hyperoxaluria, hypocitraturia, and elevated urinary oxalate levels compared to COD formers. COD formers exhibited a higher incidence of hypercalciuria. These data suggest that not all CaOx stones are alike, and that distinct metabolic and clinical etiological differences exist that may guide future management and prevention.
View details for DOI 10.1089/end.2019.0245
View details for PubMedID 31154910
An Automated Feature Engineering for Digital Rectal Examination Documentation using Natural Language Processing.
AMIA ... Annual Symposium proceedings. AMIA Symposium
2018; 2018: 288–94
To develop and test a method for automatic assessment of a quality metric, provider-documented pretreatment digital rectal examination (DRE), using the outputs of a natural language processing (NLP) framework. An electronic health records (EHR)-based prostate cancer data warehouse was used to identify patients and associated clinical notes from 1 January 2005 to 31 December 2017. Using a previously developed natural language processing pipeline, we classified DRE assessment as documented (currently or historically performed), deferred (or suggested as a future examination) and refused. We investigated performance on the quality metric (documentation within 6 months before treatment) and identified patient and clinical factors associated with metric performance. The cohort included 7215 patients with prostate cancer and 426 227 unique clinical notes associated with pretreatment encounters. DREs of 5958 (82.6%) patients were documented, and 1257 (17.4%) patients did not have a DRE documented in the EHR. A total of 3742 (51.9%) patient DREs were documented within 6 months prior to treatment, meeting the quality metric. Patients with private insurance had a higher rate of DRE 6 months prior to starting treatment as compared with Medicaid-based or Medicare-based payors (77.3% vs 69.5%, p=0.001). Patients undergoing chemotherapy, radiation therapy or surgery as the first line of treatment were more likely to have a documented DRE 6 months prior to treatment. EHRs contain valuable unstructured information, and with NLP it is feasible to accurately and efficiently identify quality metrics within the current clinician documentation workflow.
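The quality metric described above (a DRE documented within 6 months before treatment) can be sketched as a simple date-window check. This is a minimal illustration under assumptions: the abstract does not say how the 6-month window was computed, so the 183-day window, the function names, and the example dates are hypothetical. The `pct` helper only re-derives the percentages reported in the abstract from its stated counts.

```python
from datetime import date, timedelta
from typing import Iterable

# Assumption: "6 months" is approximated as 183 days; the source does not define it.
SIX_MONTHS = timedelta(days=183)


def meets_dre_metric(treatment_start: date,
                     dre_dates: Iterable[date],
                     window: timedelta = SIX_MONTHS) -> bool:
    """True if any documented DRE falls on or before the treatment start
    date and no earlier than `window` before it."""
    return any(
        timedelta(0) <= (treatment_start - d) <= window
        for d in dre_dates
    )


def pct(count: int, total: int) -> float:
    """Percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


# Counts reported in the abstract: 7215 patients in the cohort.
documented = pct(5958, 7215)      # patients with any documented DRE
not_documented = pct(1257, 7215)  # patients without a documented DRE
met_metric = pct(3742, 7215)      # DRE documented within 6 months of treatment
```

For example, `meets_dre_metric(date(2017, 6, 1), [date(2017, 3, 1)])` is true under this window, whereas a DRE documented only in January 2016 would not count.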
View details for DOI 10.1136/bmjopen-2018-027182
View details for PubMedID 31324681
Digital rectal examination (DRE) is considered a quality metric for prostate cancer care. However, much of the rich DRE-related information is documented as free text in clinical narratives. Therefore, we aimed to develop a natural language processing (NLP) pipeline for automatic documentation of DRE in clinical notes using a domain-specific dictionary created by clinical experts and an extended version of the same dictionary learned from clinical notes using distributional semantics algorithms. The proposed pipeline was compared to a baseline NLP algorithm, and its results were found to be superior in terms of precision (0.95) and recall (0.90) for documentation of DRE. We believe the rule-based NLP pipeline enriched with terms learned from the whole corpus can provide accurate and efficient identification of this quality metric.
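The dictionary-based idea in the abstract above can be illustrated with a small term matcher. This is only a sketch under assumptions, not the authors' pipeline: the term lists are hypothetical placeholders (the expert dictionary and the terms learned by distributional semantics are not given in the abstract), and plain regular-expression matching stands in for whatever rule engine was actually used.

```python
import re
from typing import Iterable, List, Pattern

# Hypothetical expert-curated terms (placeholders, not from the paper).
BASE_TERMS = ["digital rectal examination", "digital rectal exam", "DRE"]
# Hypothetical terms that an embedding-based expansion might add.
EXPANDED_TERMS = ["rectal exam", "prostate exam"]


def build_pattern(terms: Iterable[str]) -> Pattern[str]:
    """Compile a case-insensitive, whole-word pattern over the given terms.
    Longer terms are listed first so they win over their shorter prefixes."""
    ordered = sorted(set(terms), key=len, reverse=True)
    alternation = "|".join(re.escape(t) for t in ordered)
    return re.compile(r"\b(?:" + alternation + r")\b", re.IGNORECASE)


def find_mentions(text: str, pattern: Pattern[str]) -> List[str]:
    """Return every dictionary term matched in the text, in order."""
    return [m.group(0) for m in pattern.finditer(text)]


base_pattern = build_pattern(BASE_TERMS)
expanded_pattern = build_pattern(BASE_TERMS + EXPANDED_TERMS)

note = "Rectal exam deferred today."  # made-up example sentence
base_hits = find_mentions(note, base_pattern)
expanded_hits = find_mentions(note, expanded_pattern)
```

In this made-up sentence the base dictionary finds nothing, while the expanded dictionary picks up the mention, which is the kind of recall gain a learned dictionary extension aims for. Classifying the mention as documented, deferred or refused would need additional context rules not shown here.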
View details for PubMedID 30815067