Percent Identity of Genomic DNA and Amino Acid Sequences
Percent identity refers to a quantitative measurement of the similarity between two sequences (DNA, amino acid or otherwise). Closely related species are expected to have a higher percent identity for a given sequence than would more distantly related species, and thus percent identity to a degree reflects relatedness. The percent identity of genomic DNA sequence, intron and exon sequence, and amino acid sequence between humans and other species varies by species type, with chimpanzee having the highest percent identity with humans of all species in each category.
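As a minimal sketch of the measurement, assuming the two sequences have already been aligned to the same length with no gaps, percent identity is the share of positions at which they match. The function name `percent_identity` and the toy sequences below are illustrative only and are not taken from any of the studies discussed here.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions at which two equal-length sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    # Compare position by position, ignoring letter case.
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


# Hypothetical 20-base sequences differing at a single position.
seq_one = "ATGCGTACGTTAGCCTAGGA"
seq_two = "ATGCGTACGTTAGCTTAGGA"
print(percent_identity(seq_one, seq_two))  # 95.0
```

The same function works for amino acid sequences, since it only compares characters; real comparisons first require an alignment step, which is not shown here.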
Genomic DNA sequence: most estimates of percent identity between humans and chimpanzees put the full genomic percent identity at 98-99%, although estimates as low as 95% have been put forth when including insertions and deletions, and a recent study comparing the completed genomes of the two found a 96% identity. Given that many of these studies used a small sample size of each species, it is plausible that the percent identity is underestimated due to individual polymorphisms present in each population. The differences found between species are not distributed evenly across the genome, and chromosome Y, chromosomal ends and CpG dinucleotide repeats show higher divergence than other regions. These estimates of identity are higher than for more distantly related species (93% for old world monkeys, 89% for new world monkeys), but lower than those for intraspecies inter-individual variation.
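Much of the spread between these estimates comes down to how insertions and deletions are counted. The sketch below assumes a pairwise alignment in which `-` marks a gap, and computes identity under two conventions: one that skips gapped columns and scores substitutions only, and one that counts every gapped column as a difference. The function `identity_with_and_without_indels` and the toy alignment are hypothetical, and counting per gapped column (rather than per indel event) is itself an assumption; published studies may use other conventions.

```python
def identity_with_and_without_indels(aln_a: str, aln_b: str, gap: str = "-"):
    """Return (substitution-only identity, indel-inclusive identity) in percent."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    ungapped_columns = 0
    total_columns = 0
    for a, b in zip(aln_a, aln_b):
        if a == gap and b == gap:
            continue  # a column gapped in both rows carries no information
        total_columns += 1
        if a == gap or b == gap:
            continue  # indel column: counted in the total, never a match
        ungapped_columns += 1
        if a == b:
            matches += 1
    if ungapped_columns == 0:
        raise ValueError("alignment has no ungapped columns")
    return (100.0 * matches / ungapped_columns,
            100.0 * matches / total_columns)


# Hypothetical alignment: 25 columns, a 5-base indel, and one substitution.
aln_one = "ACGTACGTAC-----GTACGTACGT"
aln_two = "ACGTACGTACTTGCAGTACGAACGT"
print(identity_with_and_without_indels(aln_one, aln_two))  # (95.0, 76.0)
```

In this toy case the same pair of sequences scores 95% or 76% depending only on the counting convention, which illustrates why substitution-only and indel-inclusive figures are not directly comparable.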
Amino acid sequence: the percent identity between humans and chimpanzees in amino acid sequence is higher than that for DNA sequence, with estimates over 99%, and it has been proposed that 29% of encoded proteins are identical between the species. When looking at amino acid sequences of particular gene families, though, the similarity may be much lower; human genes with transcription factor activity, for example, have been shown to have nearly 50% more amino acid changes than such genes in chimpanzees.
Introns and exons: estimates of human-chimpanzee percent identity for introns and exons are 97% and 99%, respectively; other estimates give 98.3 percent identity in noncoding regions and >99.5 percent identity in coding regions. The higher similarity in coding regions/exons is consistent with the increased evolutionary selective constraint that would be placed on these protein-coding sequences. These values drop to ~90 percent identity and ~77 percent identity when looking at human and mouse genomes, consistent with the older divergence point between these lineages.
Novel DNA sequence, amino acid sequence, and sequence of introns and exons
Reconstructing the Evolutionary History of Complex Human Gene Clusters, Research in Computational Molecular Biology: 12th Annual International Conference, RECOMB 2008, Singapore, March 30 - April 2, 2008. Proceedings, Berlin, Heidelberg, p.29-49, (2008)
Global analysis of exon creation versus loss and the role of alternative splicing in 17 vertebrate genomes, RNA, 05/2007, Volume 13, Issue 5, p.661-70, (2007)
Initial sequence of the chimpanzee genome and comparison with the human genome, Nature, Sep 1, Volume 437, Number 7055, p.69-87, (2005)
Divergence between samples of chimpanzee and human DNA sequences is 5%, counting indels, Proc Natl Acad Sci U S A, 2002 Oct 15, Volume 99, Issue 21, p.13633-5, (2002)
Genetic differences between humans and great apes, Mol Phylogenet Evol, Jan, Volume 18, Number 1, p.2-13, (2001)