Introduction to Computational Molecular Biology: Molecular Evolution

Course Number: 
Course Type: 
Currently Offered: 
Instructor (non-MCB Faculty): 
Bill Noble
Course Description: 

Together with Genome 540, this course forms a two-quarter introduction to computational methods for analyzing biological data. It surveys a variety of subfields in computational biology, including several types of sequence analysis (phylogenetic footprinting, searching for non-coding RNAs, motif discovery) as well as microarray analysis, proteomics, systems biology, computational cell biology, and computational structural biology.

Prerequisites: 
Genome 540. Students must be able to write computer programs for data analysis. Some prior exposure to probability, statistics, and molecular biology is highly desirable.
Required Text: 
No textbook is required for this class.
Course requirements, examinations and grading: 

The entire course grade is based on the homework assignments, which are due weekly. There are no tests or exams.

  • The homework assignments involve writing programs for data analysis, and running them on a computer that you have access to (we cannot provide computers). We don't require a specific language.
  • Late homework will be accepted, but penalized. Specifically, each assignment is worth 100 points, from which 10 points will be deducted for each day (or fraction thereof) that you turn it in late. The maximum deduction for being late is 60 points (even if you are more than 6 days late).
  • It is OK to run your program on someone else's input data file and compare outputs to see if you get the same results. However, it is not OK to share programs or to get someone else to debug your program. A key part of the course is being able to write and debug your own programs for data analysis.
  • Homework assignments should be turned in using the Catalyst Tools Dropbox.
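
The late-homework rule above can be sketched as a short calculation. This is only an illustration of the stated arithmetic, not an official grading script; the function and constant names are hypothetical, and it assumes lateness is measured in days as a possibly fractional number.

```python
import math

# Values taken from the late-homework policy; the names are illustrative only.
POINTS_PER_ASSIGNMENT = 100
PENALTY_PER_DAY = 10
MAX_PENALTY = 60


def late_penalty(days_late: float) -> int:
    """Points deducted for an assignment turned in `days_late` days late.

    Any fraction of a day counts as a full day, and the deduction
    is capped at MAX_PENALTY.
    """
    if days_late <= 0:
        return 0  # on time: no deduction
    return min(MAX_PENALTY, PENALTY_PER_DAY * math.ceil(days_late))


def max_score(days_late: float) -> int:
    """Highest score still attainable after the late deduction."""
    return POINTS_PER_ASSIGNMENT - late_penalty(days_late)
```

For example, an assignment turned in a day and a half late would lose 20 points, and one turned in two weeks late would lose the capped 60 points, leaving at most 40.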

Course Web site:

Sample Schedule:

Week | Tues | Thurs
  1 | Predicting protein function from heterogeneous data (Part 1) | Predicting protein function from heterogeneous data (Part 2)
  2 | Motif discovery | Protein identification from tandem mass spectra
  3 | Complex biological networks | Complex biological networks
  4 | Comparative sequence analysis and phylogenetic footprinting | Comparative sequence analysis and phylogenetic footprinting
  5 | Phylogenetic inference: what can it tell us about the origins of new influenza strains such as H7N9? | What type of evolutionary process is being assumed in phylogenetic inference? The concept of evolutionary distance and the molecular clock.
  6 | Likelihood calculations on a tree. Bayesian phylogenetics. | Bayesian phylogenetics continued. Introduction to BEAST. Inference of ancestral states (if time is available).
  7 | Reconstructing the transcriptional regulatory network I | Reconstructing the transcriptional regulatory network II
  8 | Statistical genetics I (Genome-wide association studies) | Statistical genetics II (Haplotype reconstruction)
  9 | Protein structure | Protein sequence analysis
 10 | Molecular modeling | Nucleic acid structure
Areas of Interest: 
Molecular Structure & Computational Biology
Methods Area: 
Gene Expression, Cell Cycle & Chromosome Biology
Genetics, Genomics & Evolution