Module 1: Probability and Statistical Inference
Instructors: Jim Hughes and David Yanez
Module description: This module covers the laws of probability and the binomial, multinomial, and normal distributions. It covers descriptive statistics and methods of inference, including maximum likelihood, confidence intervals and simple Bayes methods. Classical hypothesis testing topics are covered, including type I and II errors, two-sample tests, chi-square tests and contingency table analysis, and exact and permutation tests. Resampling methods, such as the bootstrap and jackknife, are covered as well. This module serves as a foundation for almost all of the later modules. Co-taught with Summer Institute in Statistical Genetics (SISG Module 1).

Module 2: Mathematical Models of Infectious Diseases
Instructors: Pejman Rohani and John Drake
Module description: This module covers the principles of deterministic mathematical models of infectious diseases. The focus is on dynamic models, in particular susceptible-infected-recovered (SIR) models and variants such as SI, SIRS, and SEIR models. Topics include the different types of heterogeneities in transmission (resulting from age-structure, behaviour or seasonality), exact stochastic birth-death models and fitting of simple models to data. The module will cover modeling longitudinal data and temporal patterns of infectious disease dynamics. Programming will be done in R. Background Reading: Matt J. Keeling & Pejman Rohani. Modeling Infectious Diseases in Humans and Animals. 2008. Princeton University Press.
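As a rough illustration of the kind of deterministic SIR model this module describes, the sketch below integrates the standard equations dS/dt = -beta*S*I/N, dI/dt = beta*S*I/N - gamma*I, dR/dt = gamma*I with a fixed-step forward Euler scheme. It is written in Python rather than the R used in class, and the function name `simulate_sir`, the parameter values, and the step size are assumptions chosen for illustration, not course material.

```python
def simulate_sir(beta, gamma, s0, i0, r0, days, dt=0.01):
    """Integrate the deterministic SIR model with forward Euler steps.

    beta  : transmission rate per day (assumed value, for illustration)
    gamma : recovery rate per day (assumed value, for illustration)
    Returns a list of (t, S, I, R) tuples, one per time step.
    """
    n = s0 + i0 + r0  # total population, constant in this model
    s, i, r = float(s0), float(i0), float(r0)
    steps = int(round(days / dt))
    trajectory = [(0.0, s, i, r)]
    for k in range(1, steps + 1):
        new_infections = beta * s * i / n * dt  # flow S -> I
        new_recoveries = gamma * i * dt         # flow I -> R
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        trajectory.append((k * dt, s, i, r))
    return trajectory


if __name__ == "__main__":
    # Hypothetical values: beta = 0.5/day and gamma = 0.25/day give R0 = 2.
    traj = simulate_sir(beta=0.5, gamma=0.25, s0=999, i0=1, r0=0, days=200)
    peak_infected = max(i for _, _, i, _ in traj)
    final_recovered = traj[-1][3]
    print(f"peak infected: {peak_infected:.1f}")
    print(f"final recovered: {final_recovered:.1f}")
```

Forward Euler is used only to keep the sketch dependency-free; a smaller `dt` or a proper ODE solver would be the usual choice for anything beyond a demonstration.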

Module 3: Causal Inference
Instructors: Michael Hudgens and Thomas Richardson
Module description: This module provides an introduction to the potential outcomes approach to causal inference. Topics such as potential outcomes, underlying assumptions about the assignment mechanism and no interference between units will be covered. Approaches for observational studies, such as marginal structural models, will be covered. Applications of causal inference in infectious diseases will include the relaxation of the no interference assumption, selection bias in postinfection outcomes, and surrogates of protection. Assumes material in Module 1.

Module 4: Introduction to R
Instructors: Ken Rice and Timothy Thornton
Module description: This module introduces the R statistical environment, assuming no prior knowledge. It provides a foundation for the use of R for computation in later modules. In addition to discussing basic data management tasks in R, such as reading in data and producing summaries through R scripts, we will also introduce R’s graphics functions, its powerful package system, and simple methods of looping. Examples and exercises will use data drawn from biological and medical applications, including infectious diseases and genetics. Hands-on use of R is a major component of this module; users require a laptop and will use it in all sessions. Co-taught with Summer Institute in Statistical Genetics (SISG Module 6).

Module 5: MCMC I for Infectious Diseases
Instructors: Kari Auranen, Elizabeth Halloran, and Vladimir Minin
Module description: This module is an introduction to Markov chain Monte Carlo (MCMC) methods. The first half of the course includes a general introduction to Bayesian statistics, Monte Carlo, and MCMC. Some relevant facts from Markov chain theory are reviewed. Algorithms include Gibbs sampling and Metropolis-Hastings. A practical introduction to convergence diagnostics is included. Motivating practical examples progress from generic toy problems to infectious disease applications, which include chain-binomial and general epidemic models. Programming will be in R. Prior familiarity with R would be helpful, but is not required. Individuals already familiar with MCMC methods and R programming should consider MCMC II. Assumes the material in Module 1.


Module 6: Infectious Diseases, Immunology and Within-Host Models
Instructors: Andreas Handel and Paul Thomas
Module description: This module provides an introduction to infectious diseases, the main components of the immune system, and mathematical modeling. Using pathogens such as HIV, TB, malaria, influenza and others, this module will introduce basic immunological concepts and explain how to use mathematical models to study aspects of within-host infection dynamics. The focus will be on simple compartmental deterministic models. The use of those models to analyze the dynamics of pathogens and of innate and adaptive immune responses, and to design and evaluate intervention strategies such as vaccines and drug treatments, is covered. Hands-on exercises using the programming language R will show how to construct and implement models. Prior knowledge of R is helpful but not required. Suggested background reading: How the Immune System Works, by Sompayrac, L.M., Wiley-Blackwell, 3rd edition, 2008. This module would be helpful for those taking Module 8.


Module 7: Stochastic Epidemic Models with Inference
Instructors: Tom Britton and Ira Longini
Module description: Topics include statistical methods and stochastic models for the spread of an infectious disease. The course will cover large sample properties of the model and how to obtain estimates of relevant model parameters from epidemic data. The module will show how to obtain estimates of the critical vaccination coverage, that is, the fraction of the population that must be vaccinated to avoid future epidemics. The course will initially cover this for a simple epidemic model. Various extensions allowing for different types of individuals, social structures in the community, such as households, and vaccine responses will be covered. Class exercises will include problem solving and writing computer programs in R. Assumes the material in Module 1.
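For the simplest case, a homogeneously mixing population and a vaccine that gives complete protection, the critical vaccination coverage has the well-known closed form v_c = 1 - 1/R0, where R0 is the basic reproduction number. The Python sketch below evaluates that formula. The function name and the example R0 values are illustrative assumptions, and the extensions mentioned in the description (households, different types of individuals, imperfect vaccine responses) are not handled.

```python
def critical_vaccination_coverage(r0):
    """Return v_c = 1 - 1/R0 for a simple homogeneous-mixing epidemic model.

    Assumes a perfectly protective vaccine. If R0 <= 1 no major epidemic
    is possible, so the required coverage is reported as 0.
    """
    if r0 <= 0:
        raise ValueError("R0 must be positive")
    return max(0.0, 1.0 - 1.0 / r0)


if __name__ == "__main__":
    # Hypothetical R0 values, chosen only to show how coverage grows with R0.
    for r0 in (0.8, 2.0, 4.0):
        print(r0, critical_vaccination_coverage(r0))
```

In practice R0 is itself estimated from epidemic data, so an estimate of v_c inherits that uncertainty.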

Module 8: Evaluating Immune Correlates of Protection
Instructors: Ivan Chan and Peter Gilbert
Module description: In vaccine development, identification of immune responses that correlate with clinical efficacy is a very important goal. Once established, good correlates of protection (in lieu of efficacy outcomes) can be used as primary endpoints in clinical trials, which will speed up vaccine development and significantly reduce the associated cost. Topics include methods for evaluating immunological correlates of risk and surrogates of protection in vaccine efficacy trials, and complementary sieve analysis of the breakthrough pathogen sequences in infected trial participants. Approaches include regression models, causal inference methods, and high-dimensional data analysis of pathogen sequences. Class exercises include analysis (with R) of vaccine efficacy trial data-sets. Assumes the material in Module 1. Module 4 would be helpful, but not required.

Module 9: Network Theory in Infectious Diseases
Instructors: Thomas Hladish and Joel Miller
Module description: Topics include an introduction to concepts of graph theory, such as nodes, edges, shortest paths, and the giant component, among others. Methods to estimate characteristics from a complete graph will be covered. Methods for modeling the spread of infectious disease on a contact network will be taught. Programming is done in Python, which will be covered in the course. Assumes material in Module 1 and either 2, 5, or 6.

Module 10: MCMC II for Infectious Diseases
Instructors: Theo Kypraios and Philip O’Neill
Module description: This module continues on from Module 5 (MCMC I) by looking in detail at practical implementation issues for MCMC methods when applied to data from infectious disease outbreaks. The main focus will be on inference for the SIR (susceptible-infected-removed) model. Topics include parameterisation, methods for improving convergence, assessing MCMC output, and data augmentation methods. Programming will be carried out in R. The course assumes all the material in Module 7. The material from Module 2 or 5 would be helpful, but not required.


Module 11: Stochastic Simulation Methods
Instructors: Dennis Chao and Alessandro Vespignani
Module description: This module provides an introduction to the use of stochastic models in studying epidemics. The principles of infectious disease spread in populations will be covered, including population structure, natural history of the infectious agent, and assumed interventions. These topics will be illustrated using the Reed-Frost model and FluTE, an individual-based model of influenza epidemics. The effect of stochasticity in spatial transmission models will be covered, including stochastic metapopulation and patch models, reaction-diffusion systems in heterogeneous networks, and stochastic effective and mechanistic mobility couplings. Data-driven stochastic approaches to the large-scale spreading of infectious diseases, integrating high-resolution mobility and population data, will be presented. The tutorial will be presented using the EpiC and GLEaM computational suites. All software packages used in this module are open source. GIS software will be demonstrated. Familiarity with statistical packages (such as R or similar) would be helpful.
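To give a flavor of the Reed-Frost model mentioned above, the Python sketch below simulates one outbreak as a chain binomial: in each generation, every susceptible independently escapes infection from each current infective with probability 1 - p, so the number of new cases is binomial with success probability 1 - (1 - p)^I. The function name `simulate_reed_frost`, the parameter values, and the seed are assumptions for illustration only and are not taken from the course software.

```python
import random


def simulate_reed_frost(s0, i0, p, rng=None):
    """Simulate one Reed-Frost (chain-binomial) outbreak.

    s0, i0 : initial numbers of susceptibles and infectives
    p      : per-pair probability of infectious contact in one generation
    Returns a list of (susceptibles, infectives) pairs, one per generation.
    """
    rng = rng or random.Random()
    s, i = s0, i0
    history = [(s, i)]
    while i > 0:
        # Probability that a given susceptible is infected by at least
        # one of the i current infectives.
        p_infection = 1.0 - (1.0 - p) ** i
        # Binomial draw written as a sum of Bernoulli trials.
        new_cases = sum(1 for _ in range(s) if rng.random() < p_infection)
        s -= new_cases
        i = new_cases  # infectives are removed after one generation
        history.append((s, i))
    return history


if __name__ == "__main__":
    # Hypothetical setting: 99 susceptibles, 1 infective, p = 0.02.
    outbreak = simulate_reed_frost(99, 1, 0.02, rng=random.Random(1))
    final_size = 99 - outbreak[-1][0]
    print("generations:", len(outbreak) - 1, "final size:", final_size)
```

Running the function many times with different seeds gives an empirical distribution of final outbreak sizes, which is the usual way stochasticity in such models is explored.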

Module 12: Spatial Statistics in Epidemiology and Public Health
Instructors: Jon Wakefield and Lance Waller
Module description: Spatial methods are now used in many disciplines and play an important role in epidemiology and public health. This module gives an introduction to spatial methods. In particular, we will present methods for assessment of clustering, cluster detection, spatial regression, and disease mapping. Methods will be described for both point data, in which cases and non-cases (or a sample thereof) have an associated point location, and count data, in which the numbers of cases and non-cases in a set of geographical areas are available. An introduction to Geographic Information Systems (GIS) will be provided. The important extension to space-time analysis will be described, which is crucial for the analysis of infectious disease data with a spatial component. Many examples will be presented, with analysis carried out in the R programming environment. Reference: Waller, L. and Gotway, C. (2004). Applied Spatial Statistics for Public Health Data. New York, John Wiley and Sons. Prerequisites: Assumes the material in Module 1. Some prior knowledge of R would be helpful.

Module 13: Introduction to Metagenomic Data Analysis
Instructors: Alexander V. Alekseyenko and Paul J. McMurdie
Module description: This course is concerned with analysis of microbial community data generated by next-generation sequencing technologies. These high-throughput methods allow for deep surveying of microorganisms inhabiting their biological hosts. We will cover the steps for preprocessing Roche 454 sequencing data that are necessary to produce abundance tables. We will then examine methodology for associating microbial abundance data with experimental factors and outcomes. Programming will be done in R, and an overview of available programs for preprocessing the data will be provided. Prerequisites: Module 1: Probability and Statistical Inference. Co-listed with Summer Institute in Statistical Genetics.

Module 14: Evolutionary Dynamics and Molecular Epidemiology of Viruses
Instructors: Philippe Lemey and Marc A. Suchard
Module description: This module covers the use of phylogenetic and bioinformatic tools to analyze pathogen genetic variation and to gain insight into the processes that shape pathogen diversity. The module focuses on phylogenies and how these relate to population genetic processes in infectious diseases. In particular, the module will cover Bayesian Evolutionary Analysis by Sampling Trees (BEAST). This software will be used in class exercises that are mainly focused on estimating epidemic time scales, reconstructing changes in viral population sizes through time, and inferring the spatial diffusion of viruses. Evolutionary processes including recombination and selection will also be considered. Assumes material in Module 1. Co-listed with Summer Institute in Statistical Genetics.