Core Rigor and Reproducibility Courses

Introductory Level

This course focuses on the skills and processes that computational approaches require to ensure that all data, code, and analyses are captured in a reproducible workflow that can be confirmed and replicated by you in the future, by other members of your team, and by reviewers and other researchers. We will cover how to meet NIH and NSF data management requirements, write data management plans (DMPs), satisfy FAIR principles, use version control, work with GitHub and other repositories, and create a reproducible dataset. Prerequisites: Basic knowledge of R. Recommended courses (not required): EPI 202 or 261/262, STATS 60, or MS&E 125.

BIOS259 equips students with essential skills for computationally reproducible research. It covers topics such as data management, version control, and containerization. Its hands-on approach fosters practical experience crucial for computational research.

Intermediate Level

Introduction to foundations of rigorous, reproducible research in experimental biology and clinical research. Provides conceptual framework for linking hypotheses to experimental design, quantitative measurement, statistical analysis and assessment of uncertainty. Course combines lecture presentation and discussion of core concepts from statistics and reproducibility with hands-on exposure to best practices for reproducible workflows spanning design, data collection, annotation, analysis and presentation of results. Brief discussion of social, legal, and ethical issues with reproducibility in scientific practice, along with NIH grant requirements. Course provides foundations for future learning in these areas. Examples drawn from multiple areas of experimental biology and clinical research. Target audience: Students in BIOS 200 (Foundations in Experimental Biology), in Biosciences graduate programs or T32 training programs. Prerequisites: None

This course provides an introduction to key principles of rigorous and reproducible population health and clinical research. The course consists of three modules. The first covers ethical, regulatory, and legal aspects of research integrity, such as authorship, collaboration, conflicts of interest, and data sharing agreements. The second module focuses on design and reporting considerations for rigor and reproducibility, such as threats to validity, proper interpretation of statistical measures, and reporting guidelines. The third module provides technical training in collaborative workflows and reproducible programming practices using GitHub and R. Content is designed for health policy, biomedical data science, and epidemiology graduate students supported by NIH training grants with reproducibility training requirements. Students in such programs should consult with their program director to ensure that this course will fulfill the specific requirements of their program.

Additional Courses

This course covers the designs and methods that should be used to evaluate technologies to diagnose patients, predict prognosis or other health events, or screen for disease. These technologies can include devices, statistical prediction rules, biomarkers, gene panels, algorithms, imaging, or any information used to predict a future or a previously unknown health state. Specific topics to be covered include the phases of test development, how to frame a proper evaluation question, measures of test accuracy, Bayes' theorem, internal and external validation, prediction evaluation criteria, decision analysis, net utility, ROC curves, c-statistics, net reclassification index, decision curves, and reporting standards.

Audience: Graduate students with an understanding of introductory biostatistics and of epidemiologic and clinical research study design. Undergraduates may enroll with consent of the instructor.

The course will consist of readings and discussion of foundational papers and book sections in the domains of statistical and scientific inference. Topics to be covered include philosophy of science, interpretations of probability, Bayesian and frequentist approaches to statistical inference and current controversies about the proper use of p-values and research reproducibility.

Audience: Second-year master's students or PhD students with at least 1 year of preceding graduate training.

This introductory course is a practicum in which students will learn the basics of R and use the programming language to analyze health datasets by applying classical statistical methods. Familiarity with basic descriptive and inferential statistics is required. Class sessions will include some lecture content and hands-on coding by each student on their own computer. Students will practice using R with open-source and simulated datasets. The primary goal of the course is to equip students with a fundamental understanding of R's capabilities, experience using R with practice datasets, and the ability to extend their facility with R as their needs dictate.

Audience: Graduate students. Undergraduates may enroll with consent of the instructor.

This introductory course provides a hands-on introduction to basic data management and analysis techniques using SAS. Data management topics include an introduction to SAS and SAS syntax, importing data, creating and reading SAS datasets, data cleaning and validation, creating new variables, and combining datasets. Analysis techniques include basic descriptive statistics (e.g., means, frequencies) and bivariate procedures for continuous and categorical variables (e.g., t-tests, chi-square tests).

Audience: Graduate students. Undergraduates may enroll with consent of the instructor.

In this program, run as part of the Stanford Cardiovascular Institute's (CVI) Summer Research Program, small teams of summer undergraduate and medical students work under the guidance of early-career scientists to conduct a metascience study in the field of cardiovascular research. Teams use open and reproducible methods to survey published cardiovascular articles and assess how research is being conducted. Each team ultimately creates a database of screened articles and a public protocol, or pre-registration, for its study. This program provides students with a deeper understanding of cardiovascular literature and of how to evaluate and deploy rigorous and open scholarship practices in their own research.

Audience: Undergraduate and medical students participating in the CVI Summer Research Program, and graduate students and postdocs who wish to lead teams. Please contact Adrienne Mueller if you are interested in leading a MAvERICS team.

If you are interested in any of the above courses, please visit to register.