Detecting Natural Selection (Part 5)
Allele and Genotype Frequencies in Populations
This is the sixth of multiple postings I plan to write about detecting natural selection using molecular data (i.e., DNA sequences). The first post contained a brief introduction and can be found here. The second post described the organization of the genome, and the third described the organization of genes. The fourth post described codon-based models for detecting selection, and the fifth detailed how relative rates can be used to detect changes in selective pressure.
The previous analytical techniques we have discussed all deal with comparing sequences from different species (or homologous sequences that resulted from gene duplication). From here on out we will be discussing how variation within a species can be used to detect natural selection at the sequence level. To do this we must first address what we expect when there is no selection. This post deals with expected allele and genotype frequencies and changes in allele frequencies between generations (something I have written about before). Subsequent posts will use more recently developed analyses that allow us to detect selection by sampling allele frequencies in a population.
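Consider a single locus with two alleles, A and a, at frequencies p and q in the population, so that p + q = 1. If mating is random, gametes combine in proportion to their frequencies. The genotypes produced by each combination of gametes are:
|   | A  | a  |
| A | AA | Aa |
| a | Aa | aa |
The corresponding frequencies are:
|   | p  | q  |
| p | p² | pq |
| q | pq | q² |
Because the heterozygote can arise in two ways, the expected genotype frequencies are:
- Freq(AA) = p²
- Freq(Aa) = 2pq
- Freq(aa) = q²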
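The expected genotype frequencies can be sketched in a few lines of Python. This is only an illustration: the function name `genotype_frequencies` and the example value p = 0.7 are assumptions chosen for the example, not something taken from the post.

```python
def genotype_frequencies(p):
    """Expected genotype frequencies under random mating.

    p is the frequency of allele A; the frequency of allele a is q = 1 - p.
    (Hypothetical helper written for illustration.)
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    q = 1.0 - p
    return {
        "AA": p * p,      # both gametes carry A
        "Aa": 2 * p * q,  # A from one parent and a from the other, two ways
        "aa": q * q,      # both gametes carry a
    }


# Example with an arbitrary allele frequency of 0.7 for A.
freqs = genotype_frequencies(0.7)
for genotype, freq in freqs.items():
    print(genotype, round(freq, 4))
```

The three frequencies always sum to 1, because p² + 2pq + q² = (p + q)².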
We can then use these formulas to determine the allele frequencies in the second generation. Let p' and q' be the allele frequencies of A and a in the second generation, such that:
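- p' = p² + pq
- q' = q² + pq
Each heterozygote carries one copy of each allele, so only half of the 2pq heterozygote frequency counts toward each allele. Factoring p out of the first expression gives:
p' = p(p + q)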
As mentioned above, p + q = 1, so p' = p, and we have proof that random mating alone does not alter allele frequencies (the frequency of allele a does not change either, because q' = 1 - p', which is equivalent to q' = q).
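The same result can be checked numerically. The sketch below is a hypothetical illustration (the function name `next_generation_p` and the starting frequency 0.3 are assumptions): it builds the genotype frequencies expected under random mating, counts the A alleles they carry, and repeats this over a few generations.

```python
def next_generation_p(p):
    """Frequency of allele A after one round of random mating, no selection."""
    q = 1.0 - p
    freq_AA = p * p
    freq_Aa = 2 * p * q
    # Every allele in an AA individual is A; half the alleles in Aa are A.
    return freq_AA + 0.5 * freq_Aa


p = 0.3  # arbitrary starting frequency of allele A
for generation in range(1, 6):
    p_next = next_generation_p(p)
    # Up to floating-point rounding, the frequency is the same each generation.
    print(generation, round(p_next, 12))
    p = p_next
```

Whatever starting value is used, the allele frequency stays where it began, which is the algebraic result p' = p(p + q) = p.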
Natural selection, however, can lead to changes in allele frequencies between generations. I detailed how to determine the expected allele frequencies after selection given the allele frequencies before selection and the fitness of the different genotypes in my post on mean fitness and genetic load. We can also derive the marginal fitness of the alleles (remember, fitness is a measure of the number of progeny left per individual carrying a particular genotype, and in diploid organisms the genotype consists of two alleles), and we get the following results:
- W_A = p·W_AA + q·W_Aa
- W_a = q·W_aa + p·W_Aa
where W_A and W_a are the marginal fitnesses of alleles A and a, and W_AA, W_Aa, and W_aa are the fitnesses of the genotypes (the number of progeny left by an individual carrying that genotype). As you can see, by measuring changes in genotype frequencies from generation to generation we can estimate the fitness of each genotype, and by measuring changes in allele frequencies we can estimate the marginal fitness of each allele.
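As a rough sketch of how the marginal fitnesses connect to allele frequency change, the Python below computes W_A and W_a and then applies the standard one-locus recursion p' = p·W_A / W̄, where W̄ = p·W_A + q·W_a is the mean fitness. That recursion is not written out in this post, so treat it as an assumption brought in from standard population genetics; the function names and the fitness values (1.0, 0.9, 0.8) are hypothetical.

```python
def marginal_fitnesses(p, w_AA, w_Aa, w_aa):
    """Marginal fitnesses of alleles A and a, given the frequency p of A."""
    q = 1.0 - p
    w_A = p * w_AA + q * w_Aa  # A is paired with A (prob. p) or a (prob. q)
    w_a = q * w_aa + p * w_Aa  # a is paired with a (prob. q) or A (prob. p)
    return w_A, w_a


def allele_frequency_after_selection(p, w_AA, w_Aa, w_aa):
    """Frequency of A after one generation of selection (standard recursion)."""
    q = 1.0 - p
    w_A, w_a = marginal_fitnesses(p, w_AA, w_Aa, w_aa)
    mean_fitness = p * w_A + q * w_a  # equals p²·W_AA + 2pq·W_Aa + q²·W_aa
    return p * w_A / mean_fitness


# Hypothetical genotype fitnesses in which allele A is favored.
w_A, w_a = marginal_fitnesses(0.5, 1.0, 0.9, 0.8)
p_next = allele_frequency_after_selection(0.5, 1.0, 0.9, 0.8)
print(round(w_A, 4), round(w_a, 4), round(p_next, 4))
```

When all three genotype fitnesses are equal, the two marginal fitnesses are equal as well and the function returns p unchanged, matching the no-selection case above.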
These results provide the theoretical framework for all of population genetics, but they are rarely used to detect selection because more powerful techniques have been developed for molecular data. I still felt it necessary to lay out some of these concepts for you so that you can appreciate what will follow: detecting natural selection using nucleotide sequence polymorphism.
1 Comments:
Non-random mating (such as inbreeding) will only affect genotype frequencies (inbreeding will lead to an excess of homozygotes); this is detectable by comparing the observed genotype frequencies with those expected based on the observed allele frequencies. Natural selection will usually lead to changes in allele frequencies from one generation to the next.