Are Deletions Deleterious? Part 1
Large-scale deletions have been used by geneticists for a long time. A Drosophila geneticist can order a line (a small population of nearly identical flies characterized by some genetic similarity) carrying a deletion for practically any segment on any chromosome (this is only true for D. melanogaster). These so-called “deficiency” stocks are useful because they can expose recessive mutations in a particular region of the chromosome (by creating hemizygous individuals). They are also used to map the genomic location of recessive mutations, in a process known as deficiency mapping. Each of a set of overlapping deletions either exposes the mutant phenotype or does not, allowing the geneticist to determine the section of the chromosome containing the mutated gene.
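The logic of deficiency mapping can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: each deletion is a hypothetical (start, end) interval on one chromosome, the mutation behaves as a single point, and a deletion exposes the mutant phenotype exactly when it removes that point. The function name and the coordinates are made up for the example.

```python
def candidate_region(deletions):
    """Narrow down where a recessive mutation can lie.

    deletions: list of (start, end, exposes_mutant) tuples, using
    half-open intervals [start, end) in arbitrary map units.
    Returns a list of (start, end) segments that are covered by every
    exposing deletion and by no non-exposing deletion.
    """
    exposing = [(s, e) for s, e, exposes in deletions if exposes]
    if not exposing:
        return []  # nothing exposes the mutation, so nothing to intersect

    # The mutation must lie inside every deletion that exposes it.
    lo = max(s for s, _ in exposing)
    hi = min(e for _, e in exposing)
    if lo >= hi:
        return []  # the exposing deletions do not share any segment

    # The mutation cannot lie inside a deletion that fails to expose it.
    segments = [(lo, hi)]
    for s, e, exposes in deletions:
        if exposes:
            continue
        remaining = []
        for a, b in segments:
            if e <= a or s >= b:      # no overlap, keep the segment whole
                remaining.append((a, b))
                continue
            if a < s:                 # piece to the left of the deletion
                remaining.append((a, s))
            if e < b:                 # piece to the right of the deletion
                remaining.append((e, b))
        segments = remaining
    return segments


# Hypothetical example: two deficiencies expose the mutation, one does not.
stocks = [(0, 100, True), (50, 150, True), (40, 70, False)]
print(candidate_region(stocks))
```

With these made-up stocks, the two exposing deletions overlap between 50 and 100, and the non-exposing deletion rules out everything up to 70, leaving the segment from 70 to 100 as the place to look.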
Large-scale deletions in Drosophila are assumed to be deleterious because they eliminate multiple genes within the deleted region; flies homozygous for these deficiencies probably do not survive to adulthood, and if they do, they most likely suffer major fitness costs. What about smaller deletions? We can assume that the deletion of a coding region (either partial or complete) is probably not appreciated by the organism if the deleted sequence encodes some important protein. Also, if the deletion leads to a frameshift, the encoded protein will be dramatically changed. Many non-coding sequences contain important regulatory elements, the deletion of which is probably ill-advised if they control the expression of a vital gene. But many eukaryotic genomes (including all of the mammalian genomes studied) have huge chunks of non-coding DNA, much of it probably not involved in regulating gene expression.
If we examine a genome, we can identify deletions and determine whether they are more common in coding or non-coding regions. With the influx of whole genome sequences, much has been made of copy number variation in humans. Variation in copy number between two genomes can be due to a duplication in one genome or a deletion in the other. If, however, one genome has a single copy of a sequence and the other genome has no copies, a deletion in the second genome is the most likely explanation. Three papers published in Nature Genetics report the identification of polymorphic deletions in human populations and some analysis of their distribution (for a review of these articles, go here).
David Altshuler’s group (along with the HapMap Consortium) presents a single nucleotide polymorphism (SNP) based approach for identifying deletions. Their design is quite elegant and is based on the expected relationship of allele and genotype frequencies at neutral loci. The observed genotype frequencies of a particular SNP in a population can be used to determine the allele frequencies. We can then use the allele frequencies to calculate the expected genotype frequencies under Hardy-Weinberg equilibrium. If the observed genotype frequencies deviate from the expected frequencies, we have too few individuals with a particular genotype and too many with some other genotype. In the case of polymorphic deletions, an individual that is hemizygous (has only one copy of a locus) will appear to be homozygous when a SNP within that deletion is genotyped. If there is an excess of apparent homozygotes (based on the SNP data) clustered in a particular region, we have reason to suspect that a deletion of that region is segregating in the population.
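To make the Hardy-Weinberg step concrete, here is a rough Python sketch of the arithmetic for a single SNP with two alleles, A and B. It is not the pipeline used in the paper, only the textbook calculation described above; the function names and the genotype counts are invented for illustration.

```python
def hwe_expected(n_aa, n_ab, n_bb):
    """Expected genotype counts under Hardy-Weinberg equilibrium.

    Takes the observed counts of AA, AB and BB individuals, estimates
    the allele frequencies p (A) and q (B) from them, and returns the
    expected counts p^2 * n, 2pq * n and q^2 * n.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # each AA carries two A alleles, each AB one
    q = 1 - p
    return {"AA": p * p * n, "AB": 2 * p * q * n, "BB": q * q * n}


def homozygote_excess(n_aa, n_ab, n_bb):
    """Observed minus expected number of apparent homozygotes."""
    expected = hwe_expected(n_aa, n_ab, n_bb)
    return (n_aa + n_bb) - (expected["AA"] + expected["BB"])


# Made-up counts for 100 individuals at one SNP.
print(hwe_expected(25, 50, 25))        # a SNP that fits the expectation
print(homozygote_excess(40, 20, 40))   # a SNP with too many apparent homozygotes
```

A positive excess at one SNP proves little on its own, since genotyping error or population structure can produce the same pattern. The signal becomes interesting when several neighboring SNPs all show it, which is the clustering described above.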
They identified 541 deletions (507 of which had not been identified previously) ranging in size from 1 to 745 kilobases (with an average of 7.0 kb). They confirmed five of the larger deletions using in situ hybridization to chromosomes, and they tested 60 using PCR (51/60 were confirmed). In total, 266 genes were either partially or entirely located within deleted regions. They used the expression of these genes to determine whether an individual carried two intact copies (wild type, normal level of expression), was homozygous for the deletion (no expression), or was hemizygous (half the expected level of expression), and they found that the deletions were inherited according to Mendelian expectations.
Many deletions are associated with diseases and cancers, but others are probably not very deleterious, as they are at appreciable frequencies in human populations. While we can expect that deletions that fail to remove coding sequences may be common, it seems surprising to find that some gene deletions are also found in many individuals (additionally, they appear quite old, as they are found in different populations). This raises the question: are all of our genes necessary? Do we have certain expendable coding sequences that we can live without? Maybe they were important at some time in our evolutionary history, but they are no longer needed. Whatever the case may be, there is no single human genome in terms of sequence or structure, only a rough sketch that we all follow, with our own little quirks and idiosyncrasies (this is true for all other species as well). This, of course, leads us to question how much of our phenotypic uniqueness is due to genetic differences, something I am neither qualified nor prepared to discuss.
In a subsequent post I will describe the other two papers.
Conrad DF, Andrews TD, Carter NP, Hurles ME, Pritchard JK. 2006. A high-resolution survey of deletion polymorphism in the human genome. Nat Genet. 38:75-81
Hinds DA, Kloek AP, Jen M, Chen X, Frazer KA. 2006. Common deletions and SNPs are in linkage disequilibrium in the human genome. Nat Genet. 38:82-85
McCarroll SA, Hadnott TN, Perry GH, Sabeti PC, Zody MC, Barrett JC, Dallaire S, Gabriel SB, Lee C, Daly MJ, Altshuler DM, & The International HapMap Consortium. 2006. Common deletion polymorphisms in the human genome. Nat Genet. 38:86-92