PHI Scrubbing

STARR Tools offers the option of conducting chart review on data files that have had PHI removed using the best-effort techniques described below. To create a chart review with PHI scrubbed, simply select the "Scrub PHI (best-effort)" option from the "PHI?" radio-button options when saving your cohort for chart review.

If you need to review PHI, you can gain access by submitting your research protocol to the IRB for approval. Be sure to add a Data Privacy Attestation in the "Confidentiality Protections" section.

HIPAA imposes severe penalties on the individuals and institution responsible for inadvertent disclosure of PHI. To reduce the risk of incurring these penalties, Stanford Hospital Compliance has directed the STARR Tools system to scrub PHI from data download files to the extent possible. Please carefully consider the risks before requesting an exemption from this policy.

Free Text - Hiding in Plain Sight

To make free text data as useful for research as possible while protecting patient privacy to the extent possible, we use a "hide in plain sight" approach to PHI scrubbing. This involves replacing patient names, MRNs, and other free text identifiers with plausible-looking replacement strings. The idea is that when the scrubbing process overlooks an actual patient identifier, that identifier will be indistinguishable from the many similar-looking replacement identifiers around it.

Names and MRNs - Suppression, or Hiding in Plain Sight

For cohorts requesting or requiring that patient names and/or Medical Record Numbers (MRNs) be removed, patient names and MRNs are suppressed from the list of patients in Chart Review, and replaced with plausible-looking alternative names or MRNs wherever they occur in free text, such as clinical notes and reports.
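
The free text replacement described above can be illustrated with a minimal Python sketch. It is not the STARR Tools implementation: the surrogate name pool, the secret value, and the functions surrogate_name, surrogate_mrn, and scrub_text are all hypothetical, and the sketch assumes the patient's real name and MRN are already known from structured data.

```python
import hashlib
import re

# Hypothetical surrogate pool; a real system would draw from a far larger list.
SURROGATE_NAMES = ["Alex Morgan", "Jamie Lee", "Taylor Brooks", "Casey Nguyen"]


def _digest(value: str, secret: str) -> int:
    """Map a value to a stable integer using a secret (assumed) salt."""
    raw = hashlib.sha256(f"{secret}:{value}".encode()).digest()
    return int.from_bytes(raw[:8], "big")


def surrogate_name(real_name: str, secret: str) -> str:
    """Pick the same replacement name every time for a given real name."""
    index = _digest(real_name.lower(), secret) % len(SURROGATE_NAMES)
    return SURROGATE_NAMES[index]


def surrogate_mrn(real_mrn: str, secret: str) -> str:
    """Build a replacement MRN with the same number of digits as the original."""
    width = len(real_mrn)
    candidate = str(_digest(real_mrn, secret) % 10**width).zfill(width)
    if candidate == real_mrn:  # never hand back the real MRN
        candidate = str((int(candidate) + 1) % 10**width).zfill(width)
    return candidate


def scrub_text(text: str, known_names, known_mrns, secret: str) -> str:
    """Replace known names and MRNs in free text with plausible surrogates."""
    for name in known_names:
        pattern = r"\b" + re.escape(name) + r"\b"
        text = re.sub(pattern, surrogate_name(name, secret), text, flags=re.IGNORECASE)
    for mrn in known_mrns:
        pattern = r"\b" + re.escape(mrn) + r"\b"
        text = re.sub(pattern, surrogate_mrn(mrn, secret), text)
    return text


note = "Jane Doe (MRN 12345678) was seen today. JANE DOE reports no pain."
print(scrub_text(note, ["Jane Doe"], ["12345678"], secret="demo-secret"))
```

Because the surrogates look like real names and MRNs, an identifier that this kind of matching misses (for example, a misspelled name) does not stand out from the replaced ones.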

Dates - Shifting, or Age at Event

Dates can either be systematically shifted from their original value, or replaced by the patient's age at the event along with the year the event took place.

When date shifting is selected as the scrubbing technique for dates, all dates for a given patient are shifted by the same amount, in order to preserve the exact timeline for that patient. Different shift values are used for different patients.
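
As a rough illustration of per-patient date shifting, the Python sketch below derives one shift per patient and applies it to every date for that patient. How STARR Tools actually chooses shift values is not stated here, so the hash-based derivation, the MAX_SHIFT_DAYS range, the secret, and the function names are all assumptions made for the example.

```python
import hashlib
from datetime import datetime, timedelta

MAX_SHIFT_DAYS = 365  # assumed range; the real range is not documented here


def shift_days_for(patient_id: str, secret: str) -> int:
    """Derive a stable shift between 1 and MAX_SHIFT_DAYS days for one patient."""
    raw = hashlib.sha256(f"{secret}:{patient_id}".encode()).digest()
    return int.from_bytes(raw[:4], "big") % MAX_SHIFT_DAYS + 1


def shift_date(timestamp: datetime, patient_id: str, secret: str) -> datetime:
    """Move a timestamp back by the patient's shift, keeping the time of day."""
    return timestamp - timedelta(days=shift_days_for(patient_id, secret))


admit = datetime(2020, 3, 1, 9, 30)
discharge = datetime(2020, 3, 15, 14, 0)
shifted_admit = shift_date(admit, "patient-001", "demo-secret")
shifted_discharge = shift_date(discharge, "patient-001", "demo-secret")

# The interval between the two events is unchanged by the shift.
print(shifted_discharge - shifted_admit == discharge - admit)
```

Since both dates move by the same number of days, the length of stay and the ordering of events are identical before and after shifting.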

When "age at event" is selected, the date of service or encounter is converted into the patient's age in years, represented as a floating-point number with enough precision to place the event within the patient's timeline to the minute.
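
The conversion to age at event can be sketched in Python as follows. The 365.25-day year, the six-decimal rounding, and the function names are assumptions for illustration, not the documented STARR Tools formula. One minute is roughly 0.0000019 years, so six decimal places are enough to tell apart events one minute apart.

```python
from datetime import datetime, timedelta

DAYS_PER_YEAR = 365.25  # assumed average year length
SECONDS_PER_YEAR = DAYS_PER_YEAR * 86400


def age_at_event(birth: datetime, event: datetime) -> float:
    """Return the patient's age at the event as fractional years."""
    return (event - birth).total_seconds() / SECONDS_PER_YEAR


def scrub_event_date(birth: datetime, event: datetime):
    """Replace an event date with (age in years, calendar year of the event)."""
    return round(age_at_event(birth, event), 6), event.year


birth = datetime(1980, 6, 15)
visit = datetime(2021, 2, 3, 10, 45)
one_minute_later = visit + timedelta(minutes=1)

print(scrub_event_date(birth, visit))
# Events one minute apart still map to different ages.
print(scrub_event_date(birth, visit) != scrub_event_date(birth, one_minute_later))
```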

Other Identifiers - Coded Replacement

Other identifiers, such as accession numbers or the various numeric identifiers used in the EMR to denote specific encounters or orders, are systematically replaced with coded alternatives. This means that these identifiers still function correctly as "join keys" when used to join other elements of the chart from within the same dataset, but will not match up to real identifiers in the source medical record.
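
To show why coded identifiers still work as join keys, here is a minimal Python sketch. The keyed-hash scheme, the dataset_key, the field names, and the functions coded_id and code_rows are hypothetical; the actual coding method used by STARR Tools is not described here.

```python
import hashlib
import hmac


def coded_id(real_id: str, dataset_key: bytes) -> str:
    """Map a real identifier to a stable 16-character code for one dataset."""
    return hmac.new(dataset_key, real_id.encode(), hashlib.sha256).hexdigest()[:16]


def code_rows(rows, id_fields, dataset_key: bytes):
    """Return copies of the rows with every identifier field replaced by its code."""
    return [
        {k: coded_id(v, dataset_key) if k in id_fields else v for k, v in row.items()}
        for row in rows
    ]


key = b"demo-dataset-key"  # assumed per-dataset secret
encounters = code_rows(
    [{"encounter_id": "E1001", "dept": "Cardiology"}], {"encounter_id"}, key
)
orders = code_rows(
    [{"order_id": "O77", "encounter_id": "E1001", "test": "CBC"}],
    {"order_id", "encounter_id"},
    key,
)

# The coded encounter_id matches across tables, so the join still works.
joined = [
    {**enc, **order}
    for enc in encounters
    for order in orders
    if enc["encounter_id"] == order["encounter_id"]
]
print(joined)
```

The same real identifier always yields the same code within the dataset, while the code itself cannot be looked up in the source medical record.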

Further Reading

For a more in-depth discussion of this subject, including a detailed explanation of anonymization in DICOM (radiology) images, please see this whitepaper.