We've been handling mutations and alleles as abstract concepts; this lecture will look at the molecular biology behind the abstraction.
The ratio of synonymous (silent) to non-synonymous (coding) mutations is an indicator of the type of selection. First we have to allow for the fact that there are more possible coding changes than silent changes. The solution is to calculate silent changes per silent position and coding changes per coding position.
Call the rate of silent substitutions per silent position DS, and the rate of coding substitutions per coding position DN. We expect DS to be much greater than DN (equivalently, for the ratio DS/DN to be greater than 1) in most coding genes, since most coding changes are harmful and are rapidly removed by selection. In a gene which is not under selection at all (for example, a pseudogene) the ratio will be close to one.
Gene regions where DS/DN is less than one are interesting. This means that coding substitutions are actually more likely to be fixed than silent ones. This is seen in genes which are selected for diversity, such as the outer coat of influenza virus, or antibody genes in mammals. Often only one part of the gene (such as the active site) has excess coding substitutions, while other parts have mainly silent substitutions.
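The per-position bookkeeping described above can be sketched in a few lines of Python. This is a minimal illustration, not a standard tool: the function names silent_sites and ds_dn are made up for this example, the standard nuclear genetic code is assumed, changes to a stop codon are counted as coding changes, codons that differ at more than one position are simply skipped, and no correction is made for multiple substitutions at the same position.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def silent_sites(codon):
    """Number of silent positions in one codon (between 0 and 3).

    Each position contributes the fraction of its three possible
    single-base changes that leave the amino acid unchanged.
    """
    total = 0.0
    for pos in range(3):
        for base in BASES:
            if base == codon[pos]:
                continue
            mutant = codon[:pos] + base + codon[pos + 1:]
            if CODE[mutant] == CODE[codon]:
                total += 1 / 3
    return total


def ds_dn(seq_a, seq_b):
    """Return (DS, DN) for two aligned, in-frame coding sequences.

    Hypothetical helper: assumes equal lengths that are a multiple of 3.
    """
    silent_pos = coding_pos = 0.0
    silent_diff = coding_diff = 0
    for i in range(0, len(seq_a) - 2, 3):
        a, b = seq_a[i:i + 3], seq_b[i:i + 3]
        s = (silent_sites(a) + silent_sites(b)) / 2  # average over both codons
        silent_pos += s
        coding_pos += 3 - s
        if sum(x != y for x, y in zip(a, b)) == 1:
            if CODE[a] == CODE[b]:
                silent_diff += 1
            else:
                coding_diff += 1
        # Codons differing at 2 or 3 positions are ignored in this sketch.
    ds = silent_diff / silent_pos if silent_pos else float("nan")
    dn = coding_diff / coding_pos if coding_pos else float("nan")
    return ds, dn


ds, dn = ds_dn("GGTATGTTA", "GGCATGTCA")
print(ds, dn, ds / dn)
```

In this toy pair there is one silent difference (GGT to GGC) and one coding difference (TTA to TCA), but far fewer silent positions than coding positions, so DS comes out larger than DN even though the raw counts are equal. That is the reason for dividing by the number of positions of each kind.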
The genetic code itself must have evolved. It is not quite universal: mitochondria and chloroplasts, and some bacteria, have slight variants. Probably only a creature with a tiny genome can afford to change its code. The organism may go through a phase where it does not use a specific codon at all, and then re-introduce that codon with a new meaning.
Amino acids with similar chemical properties are somewhat clustered in the code. It may have evolved to reduce the rate of serious errors, but this is controversial.
The problems caused by under- or over-expression often arise from imbalance. It would be okay to have ten times more gene A product if only you had ten times more gene B product as well, but an imbalance is harmful. This is thought to be why duplication of a whole chromosome is usually disadvantageous.
Regulatory mutations come in two kinds. Some affect the regions of a gene that allow it to be controlled, particularly the promoter and enhancers. These are usually recessive if they remove expression in a particular tissue, time, or situation, and dominant if they add expression. They affect only the gene copy to which they are attached.
One way to get this type of regulatory mutation is a rearrangement of the genome that puts one gene next to another gene's control sequences. Lysozyme, for example, might have moved next to a stomach-enzyme gene and thus become controlled by stomach-specific sequences.
The second kind of regulatory mutation affects a regulating gene. Such mutations are often dominant if they increase the gene's ability to regulate, and recessive if they eliminate it. For example, lacI is a regulatory gene that controls the expression of other lac genes. If lacI is damaged and stops working, the lac genes are recessively uncontrolled. If lacI is mutated so that it binds too tightly and cannot be removed, this is a dominant mutation. (Bacteria are normally haploid, but we can assess dominance/recessiveness by artificially adding another copy of the gene.) This type of regulatory mutation affects both copies of the target gene, not just the one it is linked to.
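The dominance logic for lacI can be written out as a toy model. This is only a sketch: the function name and the allele labels "wild", "null" and "super" are invented for the example, and the inducer (the signal that normally removes the repressor) is an assumption added to make the "cannot be removed" case visible.

```python
def lac_genes_expressed(laci_alleles, inducer_present):
    """Return True if the lac genes are expressed in a cell with these lacI copies.

    Hypothetical allele labels:
      "wild"  - normal repressor, released from the DNA when inducer is present
      "null"  - damaged lacI, makes no working repressor
      "super" - repressor that binds too tightly and is never released
    The repressor is a diffusible product, so one working copy of lacI
    acts on every copy of the lac genes in the cell.
    """
    for allele in laci_alleles:
        if allele == "wild" and not inducer_present:
            return False  # normal repressor is bound, genes are off
        if allele == "super":
            return False  # tight-binding repressor is always bound
    return True  # no repressor is bound, genes are on


# Partial diploids: a second lacI copy added to a normally haploid cell.
print(lac_genes_expressed(["null"], inducer_present=False))           # uncontrolled
print(lac_genes_expressed(["null", "wild"], inducer_present=False))   # control restored
print(lac_genes_expressed(["super", "wild"], inducer_present=True))   # stays off
```

Adding a wild-type copy rescues the "null" cell, which is what makes that mutation recessive, while the "super" allele keeps the genes off even alongside a wild-type copy, which is what makes it dominant.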
One relatively common way to evolve a new function is to combine parts of two working genes.
Why are introns present in the first place? Many eukaryotic genes do not function correctly without their introns. Processed pseudogenes are genes which were created by reverse-transcribing processed mRNA back into DNA and integrating it back into the genome. They lack introns and have poly-A tails like mRNA. Usually they are inactive. Apparently if the intron-removal process is skipped, the mRNA does not get processed correctly.
However, most bacteria and some eukaryotes (such as yeast) manage fine without introns.
It has been suggested that introns are useful because they break genes into functional pieces which evolution can mix and match to make new genes. This is controversial because the creation of new genes is quite a slow process and it's not clear that selection in favor of new-gene-creation would be strong enough to maintain introns. Even if a creature with no introns cannot easily evolve new gene functions, this wouldn't seem to eliminate it from the gene pool quickly. Lack of introns could become fixed by drift before selection had a chance to act.
Also, bacteria have few introns and yet manage to evolve very well!
Bacteria that are (we think) fairly closely related can have quite different codon bias. It is believed that if a gene is transferred among bacterial species it will slowly develop the codon bias of its new host. This has been used to try to estimate the rate of gene transfer among bacteria; high-expression genes with deviant codon bias are assumed to be recent imports. This method suggests that 1% or more of the E. coli genome may be foreign DNA.
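To make "deviant codon bias" concrete, here is a rough sketch of how one might compare a gene's codon usage with its host's. The function names codon_usage and usage_distance are invented for this example, and the distance measure (mean absolute difference in usage) is just one simple choice, not the measure used in the published gene-transfer estimates. The standard genetic code and in-frame sequences are assumed.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def codon_usage(seq):
    """Fraction of times each codon is used among codons for the same amino acid."""
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    per_amino_acid = defaultdict(int)
    for codon, n in counts.items():
        per_amino_acid[CODE[codon]] += n
    return {codon: n / per_amino_acid[CODE[codon]] for codon, n in counts.items()}


def usage_distance(gene, host):
    """Mean absolute difference in codon usage (hypothetical, illustrative measure)."""
    g, h = codon_usage(gene), codon_usage(host)
    codons = set(g) | set(h)
    return sum(abs(g.get(c, 0.0) - h.get(c, 0.0)) for c in codons) / len(codons)


# Toy sequences: both encode only leucine, but with different preferred codons.
host_like = "CTGCTGCTGTTA"
import_like = "TTATTATTACTG"
print(codon_usage(host_like))
print(usage_distance(host_like, host_like), usage_distance(import_like, host_like))
```

A gene whose distance from the host's overall usage is unusually large would be a candidate recent import under the reasoning above, though real analyses need many more codons than this toy example to say anything reliable.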
The proportion of A/T versus G/C also varies. Mitochondria are very A/T rich. Bacteria and archaea adapted to live in near-boiling water are very G/C rich. G/C bases bind more tightly to each other than A/T; perhaps A/T richness speeds up replication, and G/C richness protects against heat-induced DNA unwinding. It is not clear whether the bias is due to differences in mutation rate, differences in fixation rate, or both.
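The base-composition measure discussed here is easy to compute. A minimal sketch, with the function name gc_fraction chosen for this example; ambiguous bases such as N are ignored.

```python
def gc_fraction(seq):
    """Fraction of G and C among the A, C, G and T bases of a DNA sequence."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    return gc / (gc + at)


print(gc_fraction("ATGCATTTAA"))  # an A/T-rich toy sequence
print(gc_fraction("GGCCGCATGC"))  # a G/C-rich toy sequence
```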