Open to keen biologists from EMBL & Stanford

Modern Statistics for Modern Biology Mini-Course

Heidelberg, 30 March 2020

Susan Holmes, Professor of Statistics at Stanford University, and Wolfgang Huber, Group Leader and Senior Scientist at EMBL, are taking their teaching online! Their hugely popular book, Modern Statistics for Modern Biology, is already a go-to resource for biologists who want to learn the basic principles of data analysis.

Course Components:

  1. Pre-recorded lectures
  2. Live teaching labs and discussions
  3. Lab reports and practical exercises

Dates and Location:

Live teaching labs: 1.5 hours every Monday and Thursday at 5:30pm (CEST) / 8:30am (PDT)

Location: via Zoom (register now to receive your individual link).

Registration:

Please register your interest using your Stanford or EMBL email address. Once you have registered, you will be sent a link to access the lectures and practical exercises to complete before the labs (you will not receive this information if you do not have a Stanford or EMBL email address).

You will be asked to register for the labs via Zoom, and you will receive a personalized link to attend the live sessions. Please do not share this link with others, as it is unique to you.

Places for the live sessions are limited. Due to high demand, we ask that you register only if you intend to commit to the full program (i.e. 10 online labs over the next 6 weeks).

Prerequisites:

A working knowledge of R is essential. Online courses that teach R include Software Carpentry's introduction to R, EMBL Bio-IT's Introduction to R, and the edX course Data Science: R Basics from Harvard University, taught by Rafael Irizarry.

"The whole idea of our approach is to make tools to put in the hands of the biologists. We’re saying, ‘we want to teach you.’ Nothing should be a black box."

- Susan Holmes, Professor of Statistics, Stanford University

Syllabus

  1. Generative probabilistic models for biological data, Introduction to Bioconductor
  2. Statistical analysis of data; simulations, Monte Carlo and maximum likelihood
  3. Graphics
  4. Mixture models; bootstrapping
  5. Cluster analyses: finding latent groupings
  6. Hypothesis testing
  7. RNA-seq and linear models
  8. RNA-seq revisited: single cell, Gamma Poisson distribution, shrinkage
  9. Multivariate analyses, PCA, SVD, etc.
  10. Multi-domain, multi-table, multi-omics data
  11. Networks, graphs and phylogenetic trees
  12. Image data
  13. Microbial ecology; abundance testing
  14. Supervised Learning
  15. Design and analysis: good practice and good use of computational tools

Lab Schedule (all sessions take place at 5:30pm CEST / 8:30am PDT):

  • Monday 6 April: Introduction (covering: book chapter introduction; lecture 1)
  • Thursday 9 April: Simulations (covering: book chapters 1 & 2; lectures 1 & 2)
  • Thursday 16 April: Graphics (covering: book chapter 3; lecture 3)
  • Monday 20 April: Mixture Models and Variance Stabilization (covering: book chapter 4; lecture 4)
  • Thursday 23 April: Clustering (covering: book chapter 5; lecture 5)
  • Monday 27 April: Testing and RNA-Seq (covering: book chapters 6 & 8; lectures 6 to 8)
  • Thursday 30 April: Multivariate Analysis (covering: book chapter 7; lectures 9 & 10)
  • Monday 4 May: Networks and Microbiome Data (covering: book chapter 10; lecture 11)
  • Thursday 7 May: Image Data (covering: book chapter 11; lecture 12)
  • Monday 11 May: Machine Learning (covering: book chapter 12; lecture 14)

Read more about research in the Holmes lab at Stanford or the Huber group at EMBL.

 

Wolfgang Huber

Susan Holmes