Percent Identity of Genomic DNA Sequences, of Introns and Exons, and of Amino Acid Sequences
Percent identity is a quantitative measure of the similarity between two sequences (DNA, amino acid, or otherwise). Closely related species are expected to show a higher percent identity for a given sequence than more distantly related species, so percent identity reflects relatedness to a degree. The percent identity of genomic DNA, intron and exon, and amino acid sequences between humans and other species varies by species, with chimpanzees showing the highest percent identity with humans in each category.
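As a rough illustration of the measurement described above, the sketch below computes percent identity for two sequences that are assumed to be already aligned and of equal length. The function name `percent_identity` and the two short sequences are hypothetical, made up for illustration; published estimates rely on genome-scale alignments and more careful handling of ambiguous positions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Return the percentage of aligned positions carrying the same residue.

    Assumes both inputs come from the same alignment, so position i in one
    sequence corresponds to position i in the other.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    # Count positions where the two sequences agree.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y)
    return 100.0 * matches / len(aligned_a)


# Two made-up 20-base sequences that differ at a single position.
seq_one = "ACGTACGTACGTACGTACGT"
seq_two = "ACGTACGTACGAACGTACGT"
print(percent_identity(seq_one, seq_two))  # 19 of 20 positions match -> 95.0
```

The same function works for amino acid sequences, since it only compares characters position by position.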
Genomic DNA sequence: most estimates put the full genomic percent identity between humans and chimpanzees at 98-99%, although estimates as low as 95% have been put forth when insertions and deletions are included, and a recent study comparing the completed genomes of the two species found 96% identity. Given that many of these studies used a small sample of each species, it is plausible that percent identity is underestimated because of individual polymorphisms present in each population. The differences between species are not distributed evenly across the genome: the Y chromosome, chromosomal ends, and CpG dinucleotide repeats show higher divergence than other regions. These estimates of identity are higher than those for more distantly related species (93% for Old World monkeys, 89% for New World monkeys) but lower than those for inter-individual variation within a species.
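The spread between the 98-99% and 95% figures depends partly on whether insertions and deletions are counted. The sketch below shows, on a made-up alignment, how the same pair of sequences yields two different values depending on that choice. The function names and the convention of scoring each gapped column (marked `-`) as one mismatch are assumptions for illustration, not the method of any study cited here.

```python
GAP = "-"


def identity_excluding_gaps(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where both sequences have a base."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only columns without a gap in either sequence.
    ungapped = [(x, y) for x, y in zip(aligned_a, aligned_b)
                if x != GAP and y != GAP]
    if not ungapped:
        raise ValueError("no ungapped columns to compare")
    matches = sum(1 for x, y in ungapped if x == y)
    return 100.0 * matches / len(ungapped)


def identity_counting_gaps(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over all columns, treating gapped columns as mismatches."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    # A column counts as a match only if both bases are present and equal.
    matches = sum(1 for x, y in zip(aligned_a, aligned_b)
                  if x == y and x != GAP)
    return 100.0 * matches / len(aligned_a)


# Made-up alignment: one substitution plus a two-base insertion/deletion.
aligned_one = "ACGTACGT--ACGTACGTAC"
aligned_two = "ACGTACGTTTACGTACGAAC"

print(round(identity_excluding_gaps(aligned_one, aligned_two), 2))  # substitutions only
print(round(identity_counting_gaps(aligned_one, aligned_two), 2))   # substitutions and indels
```

In this toy case 17 of the 18 ungapped columns match, but only 17 of all 20 columns do, so counting indels lowers the reported identity even though the underlying sequences are unchanged.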
Amino acid sequence: the percent identity between humans and chimpanzees is higher for amino acid sequence than for DNA sequence, with estimates over 99%, and it has been proposed that 29% of encoded proteins are identical between the species. Within particular gene families, though, the similarity may be much lower; human genes with transcription factor activity, for example, have been shown to have nearly 50% more amino acid changes than the corresponding genes in chimpanzees.
Introns and exons: estimates of human-chimpanzee percent identity for introns and exons are 97% and 99%, respectively; other estimates give 98.3% identity in noncoding regions and >99.5% identity in coding regions. The higher similarity in coding regions and exons is consistent with the stronger selective constraint on these protein-coding sequences. These values drop to ~90% and ~77% identity in comparisons of the human and mouse genomes, consistent with the older divergence between these lineages.
Zhang Y, Song G, Vinar T, et al. (2008). Reconstructing the evolutionary history of complex human gene clusters. Proceedings of the 12th Annual International Conference on Research in Computational Molecular Biology.
Alekseyenko AV, Kim N and Lee CJ (2007). Global analysis of exon creation versus loss and the role of alternative splicing in 17 vertebrate genomes. RNA 13(5):661-670.
The Chimpanzee Sequencing and Analysis Consortium (2005). Initial sequence of the chimpanzee genome and comparison with the human genome. Nature 437:69-87.
Britten RJ (2002). Divergence between samples of chimpanzee and human DNA sequences is 5% counting indels. PNAS 99:13633-13635.
Gagneux P and Varki A (2001). Genetic differences between humans and great apes. Molecular Phylogenetics and Evolution 18(1):2-13.

