Building Data to Prevent Chronic Disease

December 10, 2025 – What if every primary care visit in the United States could help predict, and even prevent, the next major health challenge? This vision of proactive health, with its focus on leveraging data to prevent disease and promote well-being, is driving the new American Family Cohort (AFC) Health initiative funded by the Advanced Research Projects Agency for Health (ARPA-H). The initiative, a collaboration between Stanford University and the American Board of Family Medicine (ABFM), aims to transform the way primary care data is collected, validated, and shared, creating a more complete and actionable picture of health in communities across the country.

Primary care is often described as the backbone of the U.S. health system, the place where early signs of illness first surface and where long-term prevention should take root. It is where most people have their trusted health care relationships and where more than a third of all health care visits are made. Yet, the data that guides national health decisions often comes disproportionately from large urban hospitals or academic medical centers, which do not reflect the experiences of many Americans. Millions of people receive care in small, rural practices, and these individuals are often overlooked in discussions about the best ways to prevent disease. Without their stories, policymakers and health systems are frequently left with an incomplete map of the nation’s health.

The American Family Cohort seeks to change this by developing the first comprehensive and privacy-protected national primary care electronic health record resource. The project will bring together clinical information with social, environmental, and behavioral factors that shape health and well-being.

“As part of thinking broadly about preventing disease, we want to bring in information on all of the factors that impact people’s health and make it easier for them to be healthy. We will integrate factors like housing stability, food access, and neighborhood air quality - all things that impact health and may reveal not only why people are sick but what might prevent the illness in the first place,” said David Rehkopf, ScD, Stanford Professor of Epidemiology and Director of the Stanford Center for Population Health Sciences.

To build the dataset, researchers at Stanford will collaborate with ABFM to expand the existing PRIME Registry, the largest national Qualified Clinical Data Registry for primary care. Established by ABFM, the PRIME Registry already includes more than 1,300 primary care practices across all 50 states. The team will validate the information using both quantitative methods and qualitative insights from patients and clinicians, ensuring that the data reflects real experiences rather than just what appears in medical charts.

Protecting patient privacy is a critical pillar of the initiative. Currently, the PRIME Registry has systems and processes in place to protect patient privacy, only providing approved users with access to de-identified real-world data under strict oversight. As part of this initiative, with guidance from a Data Community Advisory Committee that includes privacy experts and primary care physicians, efforts to protect patient privacy will be expanded.

The initiative will also leverage technical innovations to enhance patient privacy. Relying on advances in generative AI, the Stanford Center for Population Health Sciences will create synthetic versions of the data that closely mirror the real data. Investigators can use the synthetic data to assess the dataset's structure and content, determining early on whether it can be used to answer their study questions.

“Synthetic data contains no information about real people but follows some of the same patterns as real-world data. It is the ultimate approach for preserving patient privacy,” explains Rehkopf.

The project's potential impact is substantial, particularly in the prevention of chronic diseases.

“Primary care clinicians are very good at helping patients prevent and navigate chronic disease, but AFC allows the collective data of millions of patients to inform how we identify patients at the greatest risk of developing chronic disease or of having worse outcomes and better understand how to intervene earlier and prevent those from happening,” said Bob Phillips, MD, MSPH, Executive Director of the Center for Professionalism & Value in Health Care. “These data can also help us understand the personal and environmental factors that increase risk of disease or poorer outcomes to understand where to intervene with the greatest effect.”

This initiative represents a significant step toward a more proactive health system. Data collected in everyday clinic visits can fuel new insights, guide AI tools, inform policy decisions, and support healthier communities. By ensuring that patients from small towns, urban centers, and all points in between are represented, the project aims to build a foundation for a healthier future across the United States.