Detecting Positive Selection, Thomas lab

Introduction

Positive selection is the process by which new advantageous genetic variants sweep through a population. Although positive selection, also known as Darwinian selection, is the main mechanism that Darwin envisioned as giving rise to evolution, specific molecular genetic examples are very difficult to detect. Pioneering work by Yang and Nielsen has provided a much more powerful methodology for detecting positive selection at the sequence level. To understand

The neutral model

Based largely on a brilliant series of papers by Kimura in the 1960s and 70s, the neutral model of evolution has become the standard against which positive selection must be detected. In the neutral model, the vast majority of mutations fall into two groups. The first group, for which the model is named, consists of selectively neutral (or nearly neutral) mutations, which become fixed in a species by genetic drift. These changes account for almost all the observable nucleotide changes between two species. The second group consists of selectively deleterious mutations, which arise continuously and are eliminated over time by natural selection. Because these mutations are eventually eliminated from a species, they are rarely observed when comparing the genomes of two species. On the other hand, they are the basis of a substantial fraction of the population diversity within a species. Because they cause mutant phenotypes, these mutations are well known to functional geneticists: they account for nearly all of the mutant strains and human diseases that are studied throughout biology and human health.

Positive selection

Though advantageous mutations are of great interest, they are difficult to detect and analyze because neutral and deleterious mutations greatly outnumber them. Two major classes of methods are currently in use to detect positive selection: population methods, based on analyzing the nature and frequency of allele diversity within a species, and codon analysis methods, based on comparing patterns of synonymous and nonsynonymous changes in protein coding sequences. Unfortunately, a nearly complete lack of population sequence information in nematodes (at least for now) limits our analysis to the latter methods.

A simple primer on codon-based methods for detecting selection

The essence of this method is easy to state but very difficult to implement. Protein codons have fortuitous properties that make it uniquely feasible to detect patterns of neutral, deleterious, and advantageous mutations. The simplest version of those patterns can be seen by considering the codon for a single amino acid in a hypothetically large number of related species (the same codon position in the same, orthologous, gene). This codon in each of these related species is identical by descent from a single codon present in their last common ancestor. To simplify, suppose we know that the ancestral codon is ACT, which codes for threonine. There are nine possible single nucleotide changes that can occur in this codon (three possible changes at each of the three positions). Three of these nine changes give rise to another codon that codes for threonine (any change at the third position). We will consider these to be selectively neutral since they don't change the encoded protein, which is where the large majority of natural selection acts (codon bias is a wrinkle on this rule that I won't cover here). The other six changes alter the encoded amino acid to isoleucine, asparagine, serine, proline, or alanine, depending on the specific mutation (serine arises from two different changes, one at the first position and one at the second). In accord with the neutral model of evolution, we will consider as default that all of these changes are selectively neutral or deleterious.
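
The counting in the ACT example can be checked with a short script. The following is only a minimal sketch in Python, assuming the standard genetic code; the names CODON_TABLE and single_nucleotide_changes are hypothetical helpers made up for this illustration, not part of any selection-detection package.

```python
from itertools import product

# Assumption: the standard genetic code. Codons are ordered TTT, TTC, TTA,
# TTG, TCT, ... (bases in the order T, C, A, G; first base varies slowest).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first base T
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def single_nucleotide_changes(codon):
    """Yield (mutant_codon, amino_acid, is_synonymous) for each one-base change.

    Hypothetical helper for illustration only.
    """
    ancestral_aa = CODON_TABLE[codon]
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue  # not a change
            mutant = codon[:pos] + base + codon[pos + 1:]
            aa = CODON_TABLE[mutant]
            yield mutant, aa, aa == ancestral_aa


changes = list(single_nucleotide_changes("ACT"))
synonymous = [mutant for mutant, aa, syn in changes if syn]
nonsynonymous = {mutant: aa for mutant, aa, syn in changes if not syn}

print("total single nucleotide changes:", len(changes))
print("synonymous (still threonine):", synonymous)
print("nonsynonymous (codon -> one-letter amino acid):", nonsynonymous)
```

Calling the same helper on other codons shows that the split between synonymous and nonsynonymous changes depends on the codon; ATG (methionine), for example, has no synonymous single nucleotide changes at all.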

Under development.
