Research IT 2020 Year in Review

Celebrating our team's achievements

A remarkable year in more ways than one

Dec 15, 2020: Shelter-in-place has not slowed research. If anything, COVID-19 may have accelerated the pace of research and increased the sense of urgency. With the end of the year approaching, we take a moment to celebrate some of the remarkable projects our team got knee-deep in.



Real time COVID-19 data in STARR Tools (fka STRIDE)

Multiple teams at TDS came together to add a near real time data stream (via Reporting Workbench) from Epic to Research IT’s STRIDE database, our first generation research clinical data warehouse (CDW), to allow for improved COVID-19 research. The data stream contains live data for patients who undergo a COVID-19 test.

This feature is now deprecated.

Real time patient reporting for sleep clinic

The Alliance Sleep Questionnaire (ASQ) is a comprehensive, innovative, web-based questionnaire created through a collaborative effort between Stanford University, Harvard University, University of Pennsylvania, University of Wisconsin-Madison, and St. Luke’s Hospital, and implemented in our sleep clinic in 2011. ASQ now uses the STARR-HL7 feed instead of the STARR-Clarity feed, so patient assessments can be delivered to sleep medicine clinicians, enabling them to review patient reports in real time (20% of patients fill out the assessment at their appointment).

REDCap Training

Following up on the success of our REDCap basic training, we started a REDCap intermediate training this year. Both the basic and intermediate curricula are now available via the Stanford Training and Registration System (STARS). Over 350 users have attended the REDCap training sessions.

STARR Data Science Training, a Learning Channel on YouTube

We piloted and launched a data science training curriculum around OMOP and Nero. Over 60 students have completed the curriculum. A YouTube channel was launched recently to scale up our capabilities and to make the training more accessible during shelter-in-place.

Manuscript published on our new paradigm for accelerating clinical data science 

Research IT is at the forefront of cloud use for Clinical Data Warehouse. We published the vision and approach behind our new platform, which includes advances such as cloud adoption, OMOP Common Data Model transformation, advanced approaches to text mining, and privacy-preserving approaches to clinical text de-identification. The manuscript is available on the pre-print server so the broader community can benefit from our experience.

Manuscripts published on our image de-identification

Research IT is at the forefront of cloud use for Clinical Data Warehouse. We published our image data de-identification strategies on the pre-print server so the broader community can benefit from our experience.

mHealth platform integrated with Firebase and CardinalKit

The next generation of the mHealth Platform adds support for Google's Firestore database via the Firebase SDK, and related services such as identity management. Research IT is collaborating with the Stanford Byers Center for Biodesign to ensure the new open source mobile development framework CardinalKit is pre-integrated with the mHealth Platform, making it easier, faster, and cheaper than ever to build new mobile applications for research.


Research IT supported the launch of the neurocoach study, which is designed to increase at-home rehabilitation of stroke patients who suffer from hemiparesis. Hemiparesis (arm and/or leg weakness) affects up to 66% of stroke survivors and is a major cause of post-stroke disability. This launch is particularly timely during COVID-19 shelter-in-place. Study participants work with their clinical researcher and therapist to manage daily, tailored rehabilitation therapy, entirely through their iPhone.

CHOIR migrated to cloud

We completed the migration of the CHOIR learning healthcare system to Google Cloud. This release modernizes our CHOIR platform, used by nine different clinical areas at Stanford and serving over 35K patient assessments per year, to take advantage of cloud-native services such as Google's managed Kubernetes and Cloud SQL services. Most benefits of this new infrastructure are invisible to users (geographic redundancy, high availability, security improvements), but it also allows for rapid development of new features.

New capabilities introduced in CHOIR

CHOIR has incorporated the Red Hat SSO service. End users at the Hospital will notice reduced friction, as they will be able to log in with their existing credentials from SHC and SCH.

We introduced a pre/post-ketamine procedure survey for patients of the Adult Pain clinic. We also implemented a detailed version of the “Adverse Childhood Experience (ACE) Questionnaire”. ACEs have a profound impact on a number of pediatric and adult health issues, including mental health, pain, anxiety, depression, and opioid misuse.

Finally, we launched a new feature allowing doctors to access patient health status and outcomes information from CHOIR within the Epic chart. This feature embeds CHOIR as a SMART on FHIR application. It appears to the user as a tab in Epic Hyperspace and does not require a separate login. The application is launched from the patient chart window, so providers can view their patient's appointments and the corresponding survey results without leaving Epic. Current CHOIR on FHIR features include listing appointments for a patient, displaying completed survey results, and visualizing score history (PROMIS and other measures) in a chart-like tool.

CHOIR launched for a gastroenterology clinic

This is the first go-live of CHOIR within the Division of Gastroenterology & Hepatology. The Digestive Health Center uses CHOIR to send survey links to patients before their appointments, so that they can complete these assessments at home using their phone or computer, or at the clinic using a provided tablet. Once the survey is completed, the provider can see the patient’s health status even before the appointment.

CHOIR launched in a Psychiatry clinic

We launched CHOIR in the first Psychiatry clinic, the Psychosocial Treatment Clinic. This is our first go-live within the Department of Psychiatry and Behavioral Sciences.

NICU features for High Risk Infant Follow-up program

Research IT released two significant new features in support of the HRIF program: a) in the newly released Cardiac Portal, pediatric cardiac units can now refer directly into HRIF, bypassing the normal NICU-based workflow used by all other referrals; b) a new Electronic Data Submission (EDS) capability permits integration with the local EMR, allowing hospitals to automate their NICU referrals and new patient registration, which dramatically reduces the manual data entry burden on NICU staff.

The HRIF program is a California state-wide system referring at-risk infants into a three-year state-funded specialty care program. HRIF is a collaboration between Stanford’s Neonatal and Developmental Medicine Division and California Children's Services. At the core of HRIF is Research IT’s database, hosted and supported by the TDS Platform Services team. 

The database supported a new study focused on the smallest and most premature babies, and included ~50,000 infants who were in the NICUs of 143 California hospitals between 2008 and 2017.

Medication app in Epic

Heparin‐induced thrombocytopenia (HIT) and Drug Reaction with Eosinophilia and Systemic Symptoms (DRESS) are two immune-mediated adverse drug reactions that can be devastating if not properly identified. Diagnosis of both conditions can be difficult and requires assessing how clinical data, such as laboratory results, are related in time to when patients receive certain medications. The electronic health record (EHR) system at Stanford allows clinicians to review patient medications, but then requires navigation to a separate screen to review other relevant clinical data. To evaluate for the presence of drug reactions, clinicians have to manually compare whether certain medications coincide in time with abnormal laboratory results, while often also needing to use separate online reference tools to verify diagnostic criteria. This can be a time-consuming process that is subject to high levels of variation and error, leading to missed diagnoses that can affect patient care.

Stanford Emerging Apps Lab (SEAL) is a joint effort between Research IT and the Digital Healthcare Integration Team at Stanford Health Care to rapidly innovate, build, and implement lightweight digital apps that integrate into the EHR to improve clinical workflows. Susan Weber PhD, SEAL’s technical lead, and Dr. Ron Li, SEAL’s clinical informatics lead, collaborated with Dr. Bernice Kwong and Dr. Beth Martin to design an app that automatically organizes and visualizes medication and laboratory data in a more clinically meaningful way than is possible within the current EHR system.

The app can be launched from within a patient chart in the EHR and automatically pulls the medications and laboratory results in real time for that specific patient, which are then displayed together in a timeline. Clinicians can also use a filtering mechanism to customize which medications and laboratory results they want to see on the timeline.
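The timeline idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SEAL app's implementation: the `Event` structure, the `build_timeline` function, and all sample values are hypothetical and synthetic.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List, Optional, Set


@dataclass(frozen=True)
class Event:
    """One point on the timeline (hypothetical structure, not the SEAL data model)."""
    time: datetime
    kind: str   # "medication" or "lab"
    name: str
    value: str


def build_timeline(medications: Iterable[Event], labs: Iterable[Event],
                   keep: Optional[Set[str]] = None) -> List[Event]:
    """Merge both event streams chronologically, optionally keeping only selected names."""
    events = list(medications) + list(labs)
    if keep is not None:
        # The filter mimics a clinician choosing which items to display.
        events = [e for e in events if e.name in keep]
    return sorted(events, key=lambda e: e.time)


# Synthetic example data, invented for illustration only.
meds = [
    Event(datetime(2020, 3, 1, 8), "medication", "heparin", "5000 units"),
    Event(datetime(2020, 3, 3, 8), "medication", "vancomycin", "1 g"),
]
labs = [
    Event(datetime(2020, 3, 1, 6), "lab", "platelets", "250 K/uL"),
    Event(datetime(2020, 3, 6, 6), "lab", "platelets", "110 K/uL"),
]

timeline = build_timeline(meds, labs, keep={"heparin", "platelets"})
for e in timeline:
    print(e.time.isoformat(), e.kind, e.name, e.value)
```

Placing medications and laboratory results on one sorted list is what lets a reader see at a glance whether an abnormal result follows a drug exposure in time.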

Visible Ghosts of Isaan, a community health study in Thailand

Research IT's citizen science health equity platform, OurVoice, was used in the Visible Ghosts of Isaan project. Praveena Fernes, a Research Fellow, facilitated a study using the OurVoice platform to track and document community health and wellbeing in the Northern Thai region of Rasi Salai, an area impacted negatively by the building of an upstream dam. Her work and feedback provided us with insights for designing and building out new features in the platform to better enable remote collaboration, which coincides well with the new paradigm of remote work in a post-COVID-19 world. Feature augmentations include a one-view dashboard on the portal and more granular troubleshooting protocols for the app.

OMOP for COVID-19 network studies 

Our new research CDW, STARR-OMOP, launched a year ago, was used in a COVID-19 characterization study published in Nature Communications. The study, named CHARYBDIS (Characterizing Health Associated Risks, and Your Baseline Disease In SARS-COV-2), describes the baseline demographics, clinical characteristics, treatments, and outcomes of interest among individuals tested for SARS-CoV-2 and/or diagnosed with COVID-19, compared to influenza patients from 2017-2018. Data from some of the sites with low rates of COVID-19 infection at the time of manuscript submission, such as Stanford, are not reported in the paper, but Stanford summary data can be viewed on the interactive website. The analytic code is available in OHDSI's GitHub repository.


Stanford Medicine’s CATCH (Community Alliance to Test Coronavirus at Home) Study aims to track the spread of COVID-19 in the San Francisco Bay Area. Research IT's Solutions team helped build a new platform for the study in collaboration with TDS and other business partners.

COVID-19 wearable study

Our Solutions team helped the Stanford Healthcare Innovation Lab (SHIL) complete the first phase of one of the world’s largest studies on infectious disease using data from wearable health monitoring devices, demonstrating sensitive detection of COVID-19 using a smartwatch.


In a collaborative endeavor with Stanford Medicine faculty and Gauss, our Solutions team helped launch and complete the Californians Fighting Against Coronavirus Together Study (CA-FACTS). Gauss’s at-home platform is being used to power the CA-FACTS and CATCH studies. This serology study, conducted across multiple counties in California, employs antibody tests and seeks to understand the prevalence of IgG/IgM antibodies to COVID-19 in the population.

Track COVID Study

The study relied heavily on the Stanford REDCap platform. The purpose of this public health surveillance initiative is to understand how many people are currently infected with COVID-19, how likely it is that someone will become infected, and the body’s immune response to the coronavirus.

SnapDx Study

Our Solutions team is enabling the SnapDx study, an at-home screening kit for COVID-19 testing using saliva. Self-collected saliva shows sufficient sensitivity in both qPCR and RT-LAMP tests.

Stanford REDCap Website is now public facing

Previously behind Stanford Single Sign On authentication, the landing page is now publicly accessible. Features of the new Stanford REDCap website include a snapshot of platform features and a daily updated dashboard. Our goal with the new website is to improve findability of this resource by the Stanford community and improve submission of grants and publications. REDCap is now also listed on the UIT-managed list of approved services for High Risk and HIPAA data. Additionally, the REDCap training is now part of the STARS curriculum listing.

Visit Stanford REDCap website

STARR-OMOP launched

Research IT is building a new-generation, cloud-scale clinical data platform. Phase I of the new platform has launched and contains the STARR-OMOP dataset, the Electronic Health Records (EHR) data from Epic Clarity in the Observational Health Data Sciences and Informatics (OHDSI) Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM). This is Stanford's first fully de-identified EHR dataset (a High Risk dataset) that is made available to researchers in a secure computational environment prior to IRB approval (non-Human Subject Research).

In the STARR-OMOP dataset, aside from the standard encounter tables in OMOP CDM, we populate the clinical notes and note annotations. All data in STARR-OMOP-deid are de-identified, including the clinical notes. For text de-identification we use sophisticated NLP, Safe Harbor, and other approaches such as Hiding in Plain Sight (HIPS). Although the data is de-identified, it is characterized as a High Risk dataset by the Stanford University Privacy Office.
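As a rough illustration of the Hiding in Plain Sight idea, the sketch below replaces detected identifiers with realistic surrogates, so that any identifier the detector missed is hard to tell apart from the surrogates around it. This is not the STARR pipeline; the `hide_in_plain_sight` function, the surrogate list, and the sample note are hypothetical and synthetic.

```python
import random
from typing import List, Tuple

# Hypothetical surrogate pool; a real system would draw from much larger lists.
SURROGATE_NAMES = ["Alex Morgan", "Jamie Lee", "Sam Rivera"]


def hide_in_plain_sight(text: str, spans: List[Tuple[int, int]],
                        rng: random.Random) -> str:
    """Replace each detected (start, end) name span with a random surrogate name."""
    out = []
    cursor = 0
    for start, end in sorted(spans):
        out.append(text[cursor:start])           # text before the detected span
        out.append(rng.choice(SURROGATE_NAMES))  # surrogate instead of a "[NAME]" tag
        cursor = end
    out.append(text[cursor:])
    return "".join(out)


# Synthetic note. Assume an upstream detector found only the first name
# (characters 0-8) and missed the second one.
note = "Jane Doe was seen with her brother John Doe."
detected = [(0, 8)]

result = hide_in_plain_sight(note, detected, random.Random(0))
print(result)
```

Because the output contains a plausible surrogate next to the missed name, a reader cannot easily tell which of the two names is real, which is the protection this approach aims for on top of detection.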

For mapping clinical text to concepts, we use a pipeline developed by LePendu et al. that incorporates both negation detection and history detection. These contextual cues are based on NegEx and ConText and enable us to discern whether a term should not be attributed to the patient's current status (e.g., lack of valvular dysfunction, or sister has muscular dystrophy).
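To make the contextual cues concrete, here is a heavily simplified, NegEx-style sketch that checks the words preceding a term for trigger phrases. It is not the LePendu et al. pipeline or the real NegEx/ConText lexicons; the trigger lists, the `WINDOW` size, and the `classify_mention` function are assumptions made for illustration.

```python
import re
from typing import Dict, List

# Tiny illustrative trigger lists; real NegEx/ConText lexicons are far larger.
NEGATION_TRIGGERS = ["no", "denies", "without", "lack of"]
FAMILY_TRIGGERS = ["sister", "brother", "mother", "father", "family history of"]
WINDOW = 5  # number of words before the term that are scanned (assumption)


def classify_mention(sentence: str, term: str) -> Dict[str, bool]:
    """Flag whether `term` is negated or refers to someone other than the patient."""
    text = sentence.lower()
    idx = text.find(term.lower())
    if idx < 0:
        raise ValueError(f"term {term!r} not found in sentence")
    # Keep only the last WINDOW words that appear before the term.
    preceding = " ".join(re.findall(r"[a-z]+", text[:idx])[-WINDOW:])

    def has_trigger(triggers: List[str]) -> bool:
        return any(re.search(rf"\b{re.escape(t)}\b", preceding) for t in triggers)

    negated = has_trigger(NEGATION_TRIGGERS)
    other_person = has_trigger(FAMILY_TRIGGERS)
    return {
        "negated": negated,
        "other_person": other_person,
        "applies_to_patient": not (negated or other_person),
    }


print(classify_mention("Lack of valvular dysfunction.", "valvular dysfunction"))
print(classify_mention("Sister has muscular dystrophy.", "muscular dystrophy"))
print(classify_mention("Patient has valvular dysfunction.", "valvular dysfunction"))
```

With the two examples from the text, the first mention is flagged as negated and the second as belonging to a family member, so neither would be attributed to the patient's current status.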

The resulting database contains patient encounter data from ~2.67M patients. Over 60% of the patients have a diagnosis (ICD 9/10), over 40% have medication information (RxNorm), ~80% have lab information (LOINC), and over 95% of patients have clinical notes. 

The underlying technology for the STARR-OMOP-deid dataset is Google Cloud Platform (GCP) BigQuery (BQ). BQ is a highly performant analytical data warehousing technology. BQ supports ANSI-compliant SQL and a powerful Application Programming Interface (API). At launch, access to the STARR-OMOP-deid data is via BigQuery APIs. A future release will support the OHDSI cohort tool, ATLAS.
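The kind of SQL a researcher might write against OMOP tables can be sketched as follows. To stay self-contained, the example runs on an in-memory SQLite database standing in for BigQuery, with a handful of synthetic rows; the `person` and `condition_occurrence` table names follow the OMOP CDM, while the concept IDs and counts are invented and say nothing about STARR-OMOP itself.

```python
import sqlite3

# In-memory SQLite is a stand-in for BigQuery here (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY);
CREATE TABLE condition_occurrence (person_id INTEGER, condition_concept_id INTEGER);
-- Synthetic rows: five patients, three of whom have at least one condition.
INSERT INTO person VALUES (1), (2), (3), (4), (5);
INSERT INTO condition_occurrence VALUES (1, 1001), (1, 1002), (2, 1001), (4, 1003);
""")

# Share of patients with at least one diagnosis record.
QUERY = """
SELECT
  COUNT(DISTINCT p.person_id) AS n_patients,
  COUNT(DISTINCT c.person_id) AS n_with_condition
FROM person AS p
LEFT JOIN condition_occurrence AS c
  ON c.person_id = p.person_id
"""

n_patients, n_with_condition = conn.execute(QUERY).fetchone()
pct_with_condition = 100.0 * n_with_condition / n_patients
print(f"{n_with_condition} of {n_patients} patients "
      f"({pct_with_condition:.0f}%) have a condition record")
```

The query uses only standard joins and aggregates, so the same statement should carry over to other ANSI-compliant engines with little or no change once the table references point at the real dataset.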

The data platform also brings a secure infrastructure for Big Data analytics, the Stanford Nero platform. Researchers can request query access to the STARR-OMOP-deid BigQuery dataset, a High Risk dataset, on Nero. Nero is a HIPAA and High Risk compliant Big Data analytical platform, built in collaboration between Research IT and Stanford Research Computing Center, designed to support analysis of datasets such as STARR-OMOP-deid. Nero provides a powerful, modern data science environment based on Jupyter Notebooks for collaborative research. The Research IT team provides training material so users can get familiar with the Nero computing environment and the OMOP CDM using synthetic and STARR-OMOP-deid 1% datasets. There is an active Slack channel, starrdatausers, for the STARR on Nero community.