Random Mutation and Natural Selection
A common argument from the anti-evolution crowd is that random mutation plus natural selection cannot result in the complexity of life we observe on earth. This, of course, ignores all of the other evolutionary forces that work on natural populations and represents a general ignorance of the modern theory of evolution. Furthermore, some people are unclear as to what biologists mean when referring to random mutations. In this post I explain the statistical definition of random, some different classes of mutations, and the random nature of genetic mutations.
1. What it means to be random
In order to understand the statistical meaning of random one must first have some familiarity with probability. Probability is just a mathematical way of describing how likely a certain event is to occur relative to all alternative events. For example, there are two possible outcomes when flipping a coin: heads and tails. Assuming the coin is fair (i.e., heads and tails are equally likely), we have a 50% chance of flipping heads and a 50% chance of flipping tails. Another way to describe this is that the probability of heads is 0.5 and the probability of tails is 0.5. Notice how the probability of heads plus the probability of tails is equal to one (0.5 + 0.5 = 1); we refer to this as a complete set as it includes all possible outcomes, and the sum of the probabilities of all those outcomes is equal to one.
The result of a coin flip is random NOT because we have no idea how it will turn out, but because we cannot say for certain how it will turn out. We have some idea what will happen – half of the time we’ll flip heads, and half of the time we’ll flip tails. The coin flip is said to be random because we cannot say for certain what will happen, but we can determine the probability of each result. Random is another way of saying “not directed” (i.e., there is nothing determining absolutely the result of a particular trial or run).
Now, imagine that the coin is not fair and the probability of heads is 0.6 and the probability of tails is 0.4 (i.e., 60% of the time you will flip heads and 40% of the time you will flip tails). Flipping the coin will still be a random process, but you will flip heads more often than tails. A common misconception is that all results are equally likely in a random process; this is not the case. A random process only implies that every possible outcome has some assigned probability, and that probability is the only thing that influences whether or not a particular event occurs.
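To make the fair versus unfair coin concrete, here is a minimal simulation sketch. The function names (flip_coin, heads_fraction), the number of flips, and the fixed seed are my own choices for illustration; nothing here is specific to biology. The point is that both coins are random, but the long-run fraction of heads tracks whatever probability we assign.

```python
import random


def flip_coin(p_heads, rng):
    """Return 'H' with probability p_heads, otherwise 'T'."""
    return "H" if rng.random() < p_heads else "T"


def heads_fraction(p_heads, n_flips, seed=0):
    """Flip a (possibly unfair) coin n_flips times and return the fraction of heads.

    The seed is fixed only so that the sketch gives repeatable results.
    """
    rng = random.Random(seed)
    heads = sum(flip_coin(p_heads, rng) == "H" for _ in range(n_flips))
    return heads / n_flips


if __name__ == "__main__":
    # A fair coin and the unfair coin from the text (probability of heads = 0.6).
    for p in (0.5, 0.6):
        print(f"P(heads) = {p}: observed fraction of heads = {heads_fraction(p, 100_000):.3f}")
```

No single flip can be predicted in either case, yet the observed fraction of heads should land close to 0.5 for the fair coin and close to 0.6 for the unfair one.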
2. Genetic Mutations
Before I get into the random nature of genetic mutations, I think it’s necessary to describe some different types of genetic mutations. Readers familiar with molecular genetics can probably skip this section. If I become unclear at any point, I suggest referring to this excellent summary of mutations.
Your genome is made up of DNA which is composed of four different nucleotides: adenine (A), thymine (T), guanine (G), and cytosine (C). These nucleotides are arrayed in a linear fashion, much like the words on this page. We can write out the order of nucleotides in a gene like we write out letters of a word (e.g., ACGTACCGT). If we change a particular letter to a different letter (an A changing to a T, for example) we say that a “substitution” has occurred.
The adenine (A) molecule is similar in structure to the guanine (G) molecule (they are referred to as purines), and the thymine (T) molecule is similar to the cytosine (C) molecule (T and C are pyrimidines). Purines are more likely to change to another purine (A => G or G => A) than they are to change to a pyrimidine (A => T, A => C, G => T, or G => C). The same is true for pyrimidines – they are more likely to mutate to another pyrimidine than to a purine. Purine to purine (A <=> G) and pyrimidine to pyrimidine (C <=> T) mutations are referred to as transitions, whereas purine to pyrimidine and pyrimidine to purine mutations are transversions.
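If the jargon is hard to keep straight, the rule above can be written as a few lines of code. This is only an illustrative sketch; the function name classify_substitution and the "none" label for an unchanged nucleotide are my own inventions, not standard terminology.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(old, new):
    """Classify a single-nucleotide change as a transition or a transversion.

    Returns "none" when the nucleotide is unchanged (a label chosen for this sketch).
    """
    old, new = old.upper(), new.upper()
    valid = PURINES | PYRIMIDINES
    if old not in valid or new not in valid:
        raise ValueError(f"expected nucleotides from ACGT, got {old!r} and {new!r}")
    if old == new:
        return "none"
    # Same chemical class (purine <=> purine or pyrimidine <=> pyrimidine) is a transition.
    same_class = (old in PURINES) == (new in PURINES)
    return "transition" if same_class else "transversion"


if __name__ == "__main__":
    for old, new in [("A", "G"), ("C", "T"), ("A", "T"), ("G", "C")]:
        print(f"{old} => {new}: {classify_substitution(old, new)}")
```

Of the twelve possible substitutions, four are transitions and eight are transversions, which is why it is notable that transitions are nonetheless the more likely kind of change.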
Now that I’ve got you totally mixed up with the jargon of substitutions, I’d like to add a couple more types of mutations to your lexicon. We can add or remove a letter or multiple letters from a particular sequence, resulting in an insertion or deletion event (indel). These events can be small (only one or a few nucleotides) or fairly large (an entire gene can be duplicated into another part of the genome, resulting in an insertion on the order of thousands of nucleotides). Large deletions are thought to be extremely deleterious, but the fitness cost of smaller indels is still unclear and depends greatly on whether they are located in coding or noncoding sequence. Gene duplications resulting in large insertions allow existing biochemical pathways to evolve new functions and new pathways to appear.
Finally, we will consider a few types of large scale chromosomal mutations. These types of mutations affect the structure of a chromosome or the makeup of the genome in a major way. Chromosomal duplications result in the duplication of a single chromosome, whereas genome duplications result in the duplication of an entire genome. Chromosomal inversions result in a large segment of a chromosome reversing in order. Chromosomal fusions occur when two chromosomes join to form a single chromosome, and fissions are when one chromosome splits into two. A fusion event combined with a fission event is referred to as a chromosomal translocation – a large part of a chromosome is removed and attached to another chromosome. Inversions, fusions, fissions, and translocations do not result in any new information in the genome, but they restructure the existing information which could have important evolutionary implications.
3. The Random Nature of Genetic Mutations
Once you are comfortable with random sampling and probability as well as the nature of genetic mutations, it’s clear what biologists mean when they say, “Mutations are random.” We will start by following a single nucleotide from parent to offspring, and then move on to looking at the entire genome.
Let’s assume the probability of a substitution at a particular nucleotide is 10^-9 (a very small number). We will only consider two possible outcomes: substitution (mutation) and no mutation. If you’ve followed me up to this point, you can see that this is analogous to the coin flipping example. We do not know if a particular nucleotide will or will not mutate in one generation, but we do know how likely a mutation event is. Whether or not this nucleotide mutates is a random process, with the probability of one in a billion (10^-9) that it does mutate. One out of a billion times that nucleotide will mutate in the process of going from parent to offspring.
This line of thinking can be extended to an entire genome, made up of millions of nucleotides. Each nucleotide has the probability of 10^-9 that it will undergo a substitution event in one generation. We can also assign probabilities to other mutational events (indels, duplications, inversions, etc.) that can be estimated from natural populations or laboratory experiments. We can use these probabilities to calculate the expected number of mutations in the entire genome going from one generation to the next.
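The calculation described above is simple enough to sketch in code. I am assuming here that every nucleotide mutates independently with the same probability, which is a simplification, and the genome size of three million nucleotides is a made-up figure for illustration. Under those assumptions, the expected number of substitutions is just the genome size times the per-nucleotide probability.

```python
import math


def expected_substitutions(genome_size, mu):
    """Expected number of substitutions per generation: genome size times per-site probability."""
    return genome_size * mu


def prob_at_least_one(genome_size, mu):
    """Probability that at least one site mutates: 1 - (1 - mu) ** genome_size.

    Computed with log1p/expm1 so that very small values of mu do not lose precision.
    Assumes every site mutates independently with the same probability mu.
    """
    return -math.expm1(genome_size * math.log1p(-mu))


if __name__ == "__main__":
    mu = 1e-9                # per-nucleotide substitution probability used in the text
    genome_size = 3_000_000  # hypothetical genome of three million nucleotides
    print("expected substitutions per generation:", expected_substitutions(genome_size, mu))
    print("probability of at least one substitution:", prob_at_least_one(genome_size, mu))
```

Note that neither number says which nucleotide will mutate. Like the coin, each site is still a random trial; the probabilities only describe what to expect across many sites and many generations.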
It’s important to understand that when biologists say the mutational process is random, we mean that it is not directed. There is nothing determining definitively that a mutation will occur at a particular nucleotide. Mutations provide the raw material on which natural selection acts. Natural selection is a deterministic process; a beneficial mutation will always reach fixation in an ideal (effectively infinite) population (i.e., natural selection will cause it to replace all the other alleles), and a deleterious mutation will always be lost. We have no way of saying for sure whether or not a particular nucleotide will mutate because mutation is a random process – we can only assign a probability that it will mutate.