Data

We offer a diverse collection of high-value datasets and partner with several data sources around the world.

PHS Data

Effective population health research requires rich and diverse data with opportunities for linkage and long-term follow-up. Stanford PHS offers Stanford researchers and affiliates access to a growing portfolio of population-level data. These diverse data, which combine precision with broad representation of the population, are a catalyst for transdisciplinary research and enable researchers to study a myriad of vital outcomes.

An overview of the datasets available from PHS is below.

DATASET	DATASET TYPE	POPULATION	SMALLEST GEO UNIT	SAMPLE SIZE	DATE RANGE	TIME TO ACCESS	STRENGTH
American Family Cohort (AFC)	EMR - Primary Care	US	Census Block	8 million	2010 - 2024	1 month	linkable by individual
MarketScan	Claims - Commercially Insured	US	Metropolitan Area	149 million	2006 - 2022	7 days	prices, variability in insurance type
Medicare 20% sample	Claims - Medicare	US	9-digit ZIP	11 million	2006 - 2020	6 - 9 months	representative of Americans over 65; rich, longitudinal
Medicaid 100%	Claims - Medicaid	US	5-digit ZIP	Over 100 million	2011 - 2019	6 - 9 months	representative of Americans enrolled in Medicaid
SEER and CA Cancer Registry - CMS linked data	Cancer registry linked to CMS data (linkage performed by SEER and CA Cancer Registry)	US	5-digit ZIP	Varies	Varies	3 - 6 months	linked diagnosis/treatment data
Aarhus Danish Registers	National cohort, surveys, administrative data, biologic samples	Denmark	Census Block	5 million	1968 - 2020	No direct access. 3 - 6 months	rich, longitudinal, individual linkages

Data Portal

The Data Core at the Stanford Center for Population Health Sciences offers researchers:

A central hub to efficiently access, link, visualize, and analyze data from a wide variety of sources; and
A library of data assets to facilitate transdisciplinary population health science projects and collaboration.

Powered by Redivis, our Data Portal enables you to explore data and learn about tools to manipulate millions of records. Once you have identified data of interest, you will need to meet several requirements that ensure responsible use of sensitive data. You can read more about these requirements in the access section of each dataset on the PHS Data Portal.

Visit the Data Portal

Getting started

On our Data Portal, you can apply for membership and access, explore datasets, and use the Redivis tool to acquire your analytical sample. After cutting your analytical sample and learning about the data, you will run your analyses on our secure PHS server or Nero. Please consult our PHS data docs resource to get started.

PHS data docs

Data Requests

You can file a data request if you are interested in datasets we do not yet offer. Please use our Data Request Dashboard to submit your request.

Data Request Dashboard

New Data Portal Training

Learn how to utilize Redivis, a data platform used to store and query data on the PHS Data Portal, for every stage of your analytical workflow. This presentation showcases common methodologies in working with large claims datasets, including scalable cohort generation and analytical workflows in R, Python, Stata and SAS. The session concludes with an exploration of using modern ML techniques to classify patient notes and other unstructured data.

Watch presentation

Access presentation slides

Explore a sample project on Redivis

Getting help

We offer several sources of support.

PHS data docs

Read all about working with PHS data from start to finish in this step-by-step guide, which also includes more information about our systems and FAQs.

Slack user channel

Your second line of support is more interactive and great for quick questions that you can't resolve from the PHS data docs alone. You can also search the channels for your question, as it may have been asked before.

Office hours

Your third line of support is to schedule a meeting with us. We are happy to sit down with you for more complicated questions and issues that are best resolved in conversation.

Contact us

We are happy to help with your data questions and to hear your suggestions. You can contact us at phsdatacore@stanford.edu.
