Jose Posada holds a Ph.D. and a master’s degree in Biomedical Informatics from the University of Pittsburgh. He is highly skilled in the use of EHR data to answer clinical and healthcare operational questions. He co-leads the development and support of the Stanford Clinical Data Warehouse STARR-OMOP, where he is responsible for increasing and ensuring the data quality of the database and for leading Stanford’s participation in multicenter, multinational clinical studies. His current research and development efforts focus mostly on clinical text. He led the scientific design of two clinical text pipelines, one to de-identify millions of clinical notes (TiDE) and the other to extract biomedical ontology concepts from those notes. He is uniquely qualified to develop and implement state-of-the-art clinical natural language processing algorithms to answer clinical questions.

Current Role at Stanford

Sr. Clinical Data Scientist

Honors & Awards

  • Fulbright Fellowship for PhD Studies, Fulbright (2014)

Education & Certifications

  • PhD, University of Pittsburgh, Biomedical Informatics (2018)
  • MSc, University of Pittsburgh, Biomedical Informatics (2016)
  • MSc, Universidad del Norte, Mechanical Engineering (2009)
  • Engineer, Universidad del Norte, Electronics Engineering (2007)


Patents

  • M. E. Sanjuan, J. R. Garcia, J. D. Posada, P. J. Villalba. "United States Patent 8,390,446 Method and apparatus for on-line estimation and forecasting of species concentration during a reaction by measuring electrical conductivity", Universidad del Norte, Jul 1, 2013
  • M. E. Sanjuan, J. R. Garcia, J. D. Posada, P. J. Villalba. "Germany Patent EP 2350628 B1 20130717 Method and apparatus for on-line estimation and forecasting of species concentration during a reaction", Universidad del Norte, Jul 1, 2013


Professional Interests

Clinical Natural Language Processing
Artificial Intelligence in Medicine
Electronic Health Records Data Quality

Work Experience

  • Sr. Clinical Data Scientist, Stanford University, Redwood City (September 15, 2018 - Present)
  • Graduate Student Researcher, University of Pittsburgh (August 1, 2016 - August 31, 2018)
  • Full-time Professor, Universidad Autonoma del Caribe (January 1, 2011 - August 31, 2018)
  • Research and Teaching Assistant, Universidad del Norte (January 1, 2007 - December 31, 2010)

All Publications

  • Ontology-driven weak supervision for clinical entity classification in electronic health records. Nature communications Fries, J. A., Steinberg, E., Khattar, S., Fleming, S. L., Posada, J., Callahan, A., Shah, N. H. 2021; 12 (1): 2017


    In the electronic health record, using clinical notes to identify entities such as disorders and their temporality (e.g. the order of an event relative to a time index) can inform many important analyses. However, creating training data for clinical entity tasks is time consuming and sharing labeled data is challenging due to privacy concerns. The information needs of the COVID-19 pandemic highlight the need for agile methods of training machine learning models for clinical notes. We present Trove, a framework for weakly supervised entity classification using medical ontologies and expert-generated rules. Our approach, unlike hand-labeled notes, is easy to share and modify, while offering performance comparable to learning from manually labeled training data. In this work, we validate our framework on six benchmark tasks and demonstrate Trove's ability to analyze the records of patients visiting the emergency department at Stanford Health Care for COVID-19 presenting symptoms and risk factors.

    View details for DOI 10.1038/s41467-021-22328-4

    View details for PubMedID 33795682

  • ACE: the Advanced Cohort Engine for searching longitudinal patient records. Journal of the American Medical Informatics Association : JAMIA Callahan, A., Polony, V., Posada, J. D., Banda, J. M., Gombar, S., Shah, N. H. 2021


    OBJECTIVE: To propose a paradigm for a scalable time-aware clinical data search, and to describe the design, implementation and use of a search engine realizing this paradigm. MATERIALS AND METHODS: The Advanced Cohort Engine (ACE) uses a temporal query language and in-memory datastore of patient objects to provide a fast, scalable, and expressive time-aware search. ACE accepts data in the Observational Medical Outcomes Partnership Common Data Model, and is configurable to balance performance with compute cost. ACE's temporal query language supports automatic query expansion using clinical knowledge graphs. The ACE API can be used with R, Python, Java, HTTP, and a Web UI. RESULTS: ACE offers an expressive query language for complex temporal search across many clinical data types with multiple output options. ACE enables electronic phenotyping and cohort-building with subsecond response times in searching the data of millions of patients for a variety of use cases. DISCUSSION: ACE enables fast, time-aware search using a patient object-centric datastore, thereby overcoming many technical and design shortcomings of relational algebra-based querying. Integrating electronic phenotype development with cohort-building enables a variety of high-value uses for a learning health system. Tradeoffs include the need to learn a new query language and the technical setup burden. CONCLUSION: ACE is a tool that combines a unique query language for time-aware search of longitudinal patient records with a patient object datastore for rapid electronic phenotyping, cohort extraction, and exploratory data analyses.

    View details for DOI 10.1093/jamia/ocab027

    View details for PubMedID 33712854

  • Electrochemical Immunosensor for the Quantification of S100B at Clinically Relevant Levels Using a Cysteamine Modified Surface. Sensors (Basel, Switzerland) Rodriguez, A., Burgos-Florez, F., Posada, J. D., Cervera, E., Zucolotto, V., Sanjuan, H., Sanjuan, M., Villalba, P. J. 2021; 21 (6)


    Neuronal damage secondary to traumatic brain injury (TBI) is a rapidly evolving condition, which requires therapeutic decisions based on the timely identification of clinical deterioration. Changes in S100B biomarker levels are associated with TBI severity and patient outcome. The S100B quantification is often difficult since standard immunoassays are time-consuming, costly, and require extensive expertise. A zero-length cross-linking approach on a cysteamine self-assembled monolayer (SAM) was performed to immobilize anti-S100B monoclonal antibodies onto both planar (AuEs) and interdigitated (AuIDEs) gold electrodes via carbonyl-bond. Surface characterization was performed by atomic force microscopy (AFM) and specular-reflectance FTIR for each functionalization step. Biosensor response was studied using the change in charge-transfer resistance (Rct) from electrochemical impedance spectroscopy (EIS) in potassium ferrocyanide, with [S100B] ranging 10-1000 pg/mL. A single-frequency analysis for capacitances was also performed in AuIDEs. Full factorial designs were applied to assess biosensor sensitivity, specificity, and limit-of-detection (LOD). Higher Rct values were found with increased S100B concentration in both platforms. LODs were 18 pg/mL (AuEs) and 6 pg/mL (AuIDEs). AuIDEs provide a simpler manufacturing protocol, with reduced fabrication time and possibly costs, simpler electrochemical response analysis, and could be used for single-frequency analysis for monitoring capacitance changes related to S100B levels.

    View details for DOI 10.3390/s21061929

    View details for PubMedID 33801798

  • COVID-19 in patients with autoimmune diseases: characteristics and outcomes in a multinational network of cohorts across three countries. Rheumatology (Oxford, England) Tan, E. H., Sena, A. G., Prats-Uribe, A., You, S. C., Ahmed, W. U., Kostka, K., Reich, C., Duvall, S. L., Lynch, K. E., Matheny, M. E., Duarte-Salles, T., Bertolin, S. F., Hripcsak, G., Natarajan, K., Falconer, T., Spotnitz, M., Ostropolets, A., Blacketer, C., Alshammari, T. M., Alghoul, H., Alser, O., Lane, J. C., Dawoud, D. M., Shah, K., Yang, Y., Zhang, L., Areia, C., Golozar, A., Recalde, M., Casajust, P., Jonnagaddala, J., Subbian, V., Vizcaya, D., Lai, L. Y., Nyberg, F., Morales, D. R., Posada, J. D., Shah, N. H., Gong, M., Vivekanantham, A., Abend, A., Minty, E. P., Suchard, M., Rijnbeek, P., Ryan, P. B., Prieto-Alhambra, D. 2021


    Patients with autoimmune diseases were advised to shield to avoid COVID-19, but information on their prognosis is lacking. We characterised 30-day outcomes and mortality after hospitalisation with COVID-19 among patients with prevalent autoimmune diseases, and compared outcomes after hospital admissions among similar patients with seasonal influenza. A multinational network cohort study was conducted using electronic health records data from Columbia University Irving Medical Center (CUIMC) (United States [US]), Optum [US], Department of Veterans Affairs (VA) (US), Information System for Research in Primary Care-Hospitalisation Linked Data (SIDIAP-H) (Spain), and claims data from IQVIA Open Claims (US) and Health Insurance and Review Assessment (HIRA) (South Korea). All patients with prevalent autoimmune diseases, diagnosed and/or hospitalised between January and June 2020 with COVID-19, and similar patients hospitalised with influenza in 2017-2018 were included. Outcomes were death and complications within 30 days of hospitalisation. We studied 133 589 patients diagnosed and 48 418 hospitalised with COVID-19 with prevalent autoimmune diseases. Most patients were female, aged ≥50 years with previous comorbidities. The prevalence of hypertension (45.5-93.2%), chronic kidney disease (14.0-52.7%) and heart disease (29.0-83.8%) was higher in hospitalised vs diagnosed patients with COVID-19. Compared with 70 660 hospitalised with influenza, those admitted with COVID-19 had more respiratory complications including pneumonia and acute respiratory distress syndrome, and higher 30-day mortality (2.2% to 4.3% vs 6.3% to 24.6%). Compared with influenza, COVID-19 is a more severe disease, leading to more complications and higher mortality.

    View details for DOI 10.1093/rheumatology/keab250

    View details for PubMedID 33725121

  • Prediction of Major Depressive Disorder Following Beta-Blocker Therapy in Patients with Cardiovascular Diseases. Journal of personalized medicine Jin, S., Kostka, K., Posada, J. D., Kim, Y., Seo, S. I., Lee, D. Y., Shah, N. H., Roh, S., Lim, Y., Chae, S. G., Jin, U., Son, S. J., Reich, C., Rijnbeek, P. R., Park, R. W., You, S. C. 2020; 10 (4)


    Incident depression has been reported to be associated with poor prognosis in patients with cardiovascular disease (CVD), which might be associated with beta-blocker therapy. Because early detection and intervention can alleviate the severity of depression, we aimed to develop a machine learning (ML) model predicting the onset of major depressive disorder (MDD). A model based on L1 regularized logistic regression was trained against the South Korean nationwide administrative claims database to identify risk factors for incident MDD after beta-blocker therapy in patients with CVD. We identified 50,397 patients initiating beta-blockers for CVD, with 774 patients developing MDD within 365 days after initiating beta-blocker therapy. An area under the receiver operating characteristic curve (AUC) of 0.74 was achieved. A history of non-selective beta-blockers and factors related to anxiety disorder, sleeping problems, and other chronic diseases were the strongest predictors. AUCs of 0.62-0.71 were achieved in the external validation conducted on six independent electronic health records and claims databases in the USA and South Korea. In conclusion, an ML model that identifies patients at high risk for incident MDD was developed. Application of ML to identify patients susceptible to adverse events of treatment may serve as an important approach for personalized medicine.

    View details for DOI 10.3390/jpm10040288

    View details for PubMedID 33352870

  • Deep phenotyping of 34,128 adult patients hospitalised with COVID-19 in an international network study. Nature communications Burn, E., You, S. C., Sena, A. G., Kostka, K., Abedtash, H., Abrahao, M. T., Alberga, A., Alghoul, H., Alser, O., Alshammari, T. M., Aragon, M., Areia, C., Banda, J. M., Cho, J., Culhane, A. C., Davydov, A., DeFalco, F. J., Duarte-Salles, T., DuVall, S., Falconer, T., Fernandez-Bertolin, S., Gao, W., Golozar, A., Hardin, J., Hripcsak, G., Huser, V., Jeon, H., Jing, Y., Jung, C. Y., Kaas-Hansen, B. S., Kaduk, D., Kent, S., Kim, Y., Kolovos, S., Lane, J. C., Lee, H., Lynch, K. E., Makadia, R., Matheny, M. E., Mehta, P. P., Morales, D. R., Natarajan, K., Nyberg, F., Ostropolets, A., Park, R. W., Park, J., Posada, J. D., Prats-Uribe, A., Rao, G., Reich, C., Rho, Y., Rijnbeek, P., Schilling, L. M., Schuemie, M., Shah, N. H., Shoaibi, A., Song, S., Spotnitz, M., Suchard, M. A., Swerdel, J. N., Vizcaya, D., Volpe, S., Wen, H., Williams, A. E., Yimer, B. B., Zhang, L., Zhuk, O., Prieto-Alhambra, D., Ryan, P. 2020; 11 (1): 5009


    Comorbid conditions appear to be common among individuals hospitalised with coronavirus disease 2019 (COVID-19) but estimates of prevalence vary and little is known about the prior medication use of patients. Here, we describe the characteristics of adults hospitalised with COVID-19 and compare them with influenza patients. We include 34,128 (US: 8362, South Korea: 7341, Spain: 18,425) COVID-19 patients, summarising between 4811 and 11,643 unique aggregate characteristics. COVID-19 patients have been majority male in the US and Spain, but predominantly female in South Korea. Age profiles vary across data sources. Compared to 84,585 individuals hospitalised with influenza in 2014-19, COVID-19 patients have more typically been male, younger, and with fewer comorbidities and lower medication use. While protecting groups vulnerable to influenza is likely a useful starting point in the response to COVID-19, strategies will likely need to be broadened to reflect the particular characteristics of individuals being hospitalised with COVID-19.

    View details for DOI 10.1038/s41467-020-18849-z

    View details for PubMedID 33024121

  • An international characterisation of patients hospitalised with COVID-19 and a comparison with those previously hospitalised with influenza. medRxiv : the preprint server for health sciences Burn, E., You, S. C., Sena, A. G., Kostka, K., Abedtash, H., Abrahão, M. T., Alberga, A., Alghoul, H., Alser, O., Alshammari, T. M., Areia, C., Banda, J. M., Cho, J., Culhane, A. C., Davydov, A., DeFalco, F. J., Duarte-Salles, T., DuVall, S., Falconer, T., Gao, W., Golozar, A., Hardin, J., Hripcsak, G., Huser, V., Jeon, H., Jing, Y., Jung, C. Y., Kaas-Hansen, B. S., Kaduk, D., Kent, S., Kim, Y., Kolovos, S., Lane, J. C., Lee, H., Lynch, K. E., Makadia, R., Matheny, M. E., Mehta, P., Morales, D. R., Natarajan, K., Nyberg, F., Ostropolets, A., Park, R. W., Park, J., Posada, J. D., Prats-Uribe, A., Rao, G., Reich, C., Rho, Y., Rijnbeek, P., Sathappan, S. M., Schilling, L. M., Schuemie, M., Shah, N. H., Shoaibi, A., Song, S., Spotnitz, M., Suchard, M. A., Swerdel, J. N., Vizcaya, D., Volpe, S., Wen, H., Williams, A. E., Yimer, B. B., Zhang, L., Zhuk, O., Prieto-Alhambra, D., Ryan, P. 2020


    To better understand the profile of individuals with severe coronavirus disease 2019 (COVID-19), we characterised individuals hospitalised with COVID-19 and compared them to individuals previously hospitalised with influenza. We report the characteristics (demographics, prior conditions and medication use) of patients hospitalised with COVID-19 between December 2019 and April 2020 in the US (Columbia University Irving Medical Center [CUIMC], STAnford Medicine Research data Repository [STARR-OMOP], and the Department of Veterans Affairs [VA OMOP]) and Health Insurance Review & Assessment [HIRA] of South Korea. Patients hospitalised with COVID-19 were compared with patients previously hospitalised with influenza in 2014-19. 6,806 (US: 1,634, South Korea: 5,172) individuals hospitalised with COVID-19 were included. Patients in the US were majority male (VA OMOP: 94%, STARR-OMOP: 57%, CUIMC: 52%), but were majority female in HIRA (56%). Age profiles varied across data sources. Prevalence of asthma ranged from 7% to 14%, diabetes from 18% to 43%, and hypertensive disorder from 22% to 70% across data sources, while between 9% and 39% were taking drugs acting on the renin-angiotensin system in the 30 days prior to their hospitalisation. Compared to 52,422 individuals hospitalised with influenza, patients admitted with COVID-19 were more likely male, younger, and, in the US, had fewer comorbidities and lower medication use. Rates of comorbidities and medication use are high among individuals hospitalised with COVID-19. However, COVID-19 patients are more likely to be male and appear to be younger and, in the US, generally healthier than those typically admitted with influenza.

    View details for DOI 10.1101/2020.04.22.20074336

    View details for PubMedID 32511443

    View details for PubMedCentralID PMC7239064

  • Characterizing database granularity using SNOMED-CT hierarchy. AMIA ... Annual Symposium proceedings. AMIA Symposium Ostropolets, A., Reich, C., Ryan, P., Weng, C., Molinaro, A., DeFalco, F., Jonnagaddala, J., Liaw, S., Jeon, H., Park, R. W., Spotnitz, M. E., Natarajan, K., Argyriou, G., Kostka, K., Miller, R., Williams, A., Minty, E., Posada, J., Hripcsak, G. 2020; 2020: 983–92


    Multi-center observational studies require recognition and reconciliation of differences in patient representations arising from underlying populations, disparate coding practices and specifics of data capture. This leads to different granularity or detail of concepts representing the clinical facts. For researchers studying certain populations of interest, it is important to ensure that concepts at the right level are used for the definition of these populations. We studied the granularity of concepts within 22 data sources in the OHDSI network and calculated a composite granularity score for each dataset. Three alternative SNOMED-based approaches for such score showed consistency in classifying data sources into three levels of granularity (low, moderate and high), which correlated with the provenance of data and country of origin. However, they performed unsatisfactorily in ordering data sources within these groups and showed inconsistency for small data sources. Further studies on examining approaches to data source granularity are needed.

    View details for PubMedID 33936474
