Detecting Natural Selection (Part 1)
The Organization of the Genome
This is the second of multiple postings I plan to write about detecting natural selection using molecular data (i.e., DNA sequences). The first posting contained a brief introduction and can be found here.
Before we can discuss how DNA sequences are used to identify evidence of natural selection, we must have an understanding of how the genome is organized. The genetic material found in nearly every one of your cells (and in the cells of every living organism) is DNA, which is made up of two sugar-phosphate backbones connected by nitrogenous bases. The two strands are wrapped around each other in what is known as a double helix. The nitrogenous bases come in four flavors: adenine (A), guanine (G), thymine (T), and cytosine (C). A nitrogenous base together with its sugar and phosphate is known as a nucleotide. In the double helix, the nucleotides on opposite strands pair up: A is always paired with T, and G is always paired with C.
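The pairing rule is simple enough to write down as a few lines of Python. This is only a minimal sketch: the function name and the example sequence are made up for illustration, and it reads the partner strand base by base without worrying about which direction each strand runs.

```python
# Pairing rule from the paragraph above: A pairs with T, G pairs with C.
PAIRING = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand: str) -> str:
    """Return the partner strand, base by base, for a DNA sequence."""
    # Raises KeyError if the input contains anything other than A, C, G, T.
    return "".join(PAIRING[base] for base in strand.upper())


example = "ATGGCA"  # hypothetical sequence, not taken from a real genome
partner = complement(example)
print(example)
print(partner)
```

Because the pairing is one-to-one, taking the complement twice gets you back to the strand you started with. In other words, each strand carries enough information to rebuild the other.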
One molecule of DNA goes on for millions of nucleotides, and is known as a chromosome. Along with the DNA, chromosomes also have proteins bound to them that help them wrap up into neat little packages. Humans have 46 chromosomes in nearly every one of their cells (egg and sperm cells carry only 23). Each chromosome has a mate, or homolog, that contains nearly all of the same information, so we divide the 46 chromosomes into 23 pairs. For each of the 23 pairs, one copy comes from your mother and one from your father. One of these pairs is a special set known as the sex chromosomes (i.e., X and Y). Females have two copies of the X chromosome (one from their mother and one from their father), whereas males have one X (from mom) and one Y (from dad). The National Center for Biotechnology Information has created a nifty genome browser that lets you explore the contents of each chromosome.
Some of the sequences of nucleotides contain information that leads to the production of proteins. These proteins carry out essential functions within the cell (such as the production of more proteins), allow for communication between cells, and regulate cell division, among the many other tasks they perform. The sequences containing the information to produce a protein are known as coding sequences. (Note: some people refer to them as genes, but genes may also include sequences that encode RNAs that are never translated into proteins.) Each chromosome contains hundreds or even thousands of coding sequences interspersed throughout the length of the DNA molecule. In between these coding sequences are non-coding sequences of nucleotides. These non-coding sequences may contain information that determines when and how the coding sequences are expressed as proteins.
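If it helps to think of this layout as data, here is a rough Python sketch of a stretch of chromosome carved into coding and non-coding regions. Everything in it is hypothetical: the `Region` class, the `annotate` function, and the coordinates are invented for illustration, and it assumes the coding sequences do not overlap one another.

```python
from dataclasses import dataclass


@dataclass
class Region:
    start: int  # first position in the region (0-based, inclusive)
    end: int    # one past the last position (exclusive)
    kind: str   # "coding" or "non-coding"


def annotate(length, coding):
    """Label everything between the given coding intervals as non-coding.

    Assumes the coding intervals are (start, end) pairs that do not overlap.
    """
    regions = []
    position = 0
    for start, end in sorted(coding):
        if start > position:
            regions.append(Region(position, start, "non-coding"))
        regions.append(Region(start, end, "coding"))
        position = end
    if position < length:
        regions.append(Region(position, length, "non-coding"))
    return regions


# A made-up 100-nucleotide stretch with two coding sequences.
layout = annotate(100, [(10, 40), (60, 90)])
for region in layout:
    print(region.start, region.end, region.kind)
```

The two coding sequences end up separated, and flanked, by non-coding stretches. Real chromosomes follow the same pattern on a vastly larger scale.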
Here’s a little drawing showing two coding sequences separated by a non-coding sequence.
That’s it for now. Next time we’ll talk about the organization of coding sequences and their regulatory regions. I promise we’ll be discussing natural selection soon.