Researchers get more than $23 million to launch centers for big-data research

Two new Stanford-based centers aim to help scientists effectively manage and use large, complex data sets.

By Bruce Goldman

The National Institutes of Health has awarded two grants, totaling more than $23 million, to establish two centers of excellence at Stanford for big-data research.

Engaging interdisciplinary teams of scientists, both centers will focus on teasing out meaning from enormous amounts of biomedical data that often languish in isolated locations and go underused by researchers.

The grants are among 12 awarded to launch centers of excellence at 10 institutions nationwide under an NIH initiative known as Big Data to Knowledge, or BD2K. About $11 million over four years will go to a center led by Mark Musen, MD, PhD, professor of biomedical informatics research. A separate center headed by Scott Delp, PhD, professor of bioengineering and of mechanical engineering, will receive roughly $12 million over four years.

The BD2K centers are geared toward helping scientists effectively manage and use large, complex data sets. Organizing and consolidating diverse but siloed data from laboratory studies, clinical notes and wearable devices will allow scientists to compare, contrast and combine study results to draw more accurate conclusions that can be used to develop superior medical therapies and understand human behaviors.

The two grants come as Stanford Medicine continues its efforts to advance the use of big data in biomedicine, most recently with the launch of the Biomedical Data Science Initiative. “With our engines of basic and translational research, our computational expertise and our history of tackling difficult problems, Stanford Medicine is poised to harness the power of big data to improve health around the world,” said Lloyd Minor, MD, dean of the School of Medicine. “We are most grateful to the NIH for its support of two new centers of excellence at Stanford and its recognition of the innovative work our faculty is doing to accelerate discovery from large data sets and provide new insights into health and disease.”

Under Musen’s direction, the Center for Expanded Data Annotation and Retrieval, or CEDAR, will address the need to standardize descriptions of diverse biomedical laboratory studies and create metadata templates for detailing the content and context of those studies. Metadata consists of descriptions of how, when and by whom a particular set of data was collected; what the study was about; how the data are formatted; and what previous or subsequent studies along similar lines have been undertaken.

“It’s often impossible for investigators to reproduce one another’s experiments,” Musen said. “Often, it’s even impossible to find the experimental data that another investigator collected, or to have clear insight into how the experiment was conducted.”

Musen said the current big data situation in biomedicine is like that of “a library with no central catalog or, at best, a catalog with lots of missing entries that makes it hard or impossible to know where a resource might be located, what language it’s written in or what the resource is even about.”

CEDAR will develop information technologies that will make the generation of complete metadata more manageable and hence make big data more accessible.
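To make the idea of a metadata template concrete, here is a minimal sketch in Python of a record covering the kinds of description listed above: how, when and by whom data were collected, what the study was about, how the data are formatted, and which related studies exist. The class name, field names and sample values are hypothetical and chosen only for illustration; they are not CEDAR's actual templates or software.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class StudyMetadata:
    """Hypothetical metadata record; field names are illustrative only."""
    collected_by: str = ""   # by whom the data were collected
    collected_on: str = ""   # when the data were collected (e.g. an ISO 8601 date)
    method: str = ""         # how the data were collected
    topic: str = ""          # what the study was about
    data_format: str = ""    # how the data are formatted
    related_studies: List[str] = field(default_factory=list)  # earlier or later related studies


def missing_fields(record: StudyMetadata) -> List[str]:
    """Return the names of fields left empty, in declaration order."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]


# Made-up sample values, used only to exercise the completeness check.
record = StudyMetadata(
    collected_by="Example Lab",
    collected_on="2014-01-01",
    topic="gait analysis",
    data_format="CSV",
)
print(missing_fields(record))  # ['method', 'related_studies']
```

A check like `missing_fields` is one simple way a tool might flag incomplete entries, the "missing catalog entries" in Musen's library analogy, before a data set is shared.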

The Mobilize Center

Under Delp’s direction, the National Center for Mobility Data Integration to Insight, known as the Mobilize Center, will aim to overcome the limitations scientists face in using the vast amounts of data that have been collected by hundreds of research labs studying the details of motion in both healthy people and people with movement disorders, as well as motion-related data accruing from millions of wearable sensors and smartphone accelerometers.

“Mobility is essential for human health,” Delp said. “Regular physical activity helps prevent heart disease, stroke and diabetes. It improves brain function among older adults, relieves symptoms of depression, and promotes weight loss.”

Unfortunately, he said, a wide range of conditions impose limits on mobility, at a great cost to public health and personal well-being.

“The proliferation of devices monitoring human activity, including mobile phones and an ever-growing array of wearable sensors, is generating unprecedented quantities of data describing human movement, behaviors and health,” Delp said. “With the insights gained from subjecting these massive amounts of data to the Mobilize Center’s state-of-the-art analytical techniques, we hope to enhance mobility across a massive segment of the population. We will explore ways to reduce running injuries and improve walking in children with cerebral palsy, and to more accurately identify people at risk for conditions like arthritis that significantly reduce mobility. Our center can also help fight the obesity epidemic by putting scientifically validated mobility and fitness tools into the pockets of the population.”

About Stanford Medicine

Stanford Medicine is an integrated academic health system comprising the Stanford School of Medicine and adult and pediatric health care delivery systems. Together, they harness the full potential of biomedicine through collaborative research, education and clinical care for patients. For more information, please visit
