APOC1 (apolipoprotein C-I)

Human Uniqueness Compared to "Great Apes": 
Absolute Difference
MOCA Domain: 
Genetics
MOCA Topic Authors: 

What Proteogenomic Analyses Have Revealed about the apoC-I Genes of Great Apes

D. L. Puppione and N.T. Vu, Molecular Biology Institute, UCLA, Los Angeles, CA 90024


Water-insoluble lipids circulate in the blood packaged into spheroidal complexes of lipids and proteins called lipoproteins. The proteins, known as apolipoproteins, consist of a series of amphipathic helices with polar amino acids on one side and apolar ones on the other. This enables the proteins to reside on the surface of the lipoproteins, where one side of the helices interacts with the aqueous environment of the blood and the other with the hydrophobic lipid core. In addition to their solubilizing properties, most of the apolipoproteins have distinct functions. The various genes for the soluble apolipoproteins are thought to have arisen from a single ancestral gene through a series of duplications, insertions and deletions. With the exception of apoA-IV and apoC-IV, whose genes contain three exons, the genes for the other apolipoproteins have four exons, the first being non-coding. Although the apolipoprotein genes are very similar among great apes, one of the two human genes for apoC-I is an exception.
On chromosome 19 of great apes there are two apoC-I genes; one encodes a positively charged 57-amino-acid protein and the other a negatively charged 57-amino-acid protein (1). Both genes are located in a cluster that also contains the genes for three other apolipoproteins. In this approximately 45 kb cluster, the five genes, all with the same transcriptional orientation, are in this order: apoE, apoC-IB, apoC-IA, apoC-IV and apoC-II. In other primates, as well as other mammals, that lack the gene for apoC-IA, the cluster contains only four genes. The apolipoproteins encoded by both apoC-I genes are the smallest of the nine soluble apolipoproteins, at approximately 6.6 kDa.
Both humans and chimpanzees have an apoC-IB gene that encodes a basic apolipoprotein with a pI of 7.93. Limited genomic data indicate that this is also true for the other great apes. However, comparing the genes for apoC-IA reveals that humans lack an active gene because of a stop codon, TAG, in the third exon, in the segment encoding the penultimate amino acid of the signal sequence (2). In the other great apes, there is no stop codon and the apoC-IA gene encodes an acidic apolipoprotein. At about the time that it was first reported, the pseudogene was thought to have been formed during gene duplication combined with the insertion of retroelements, primarily Alu sequences, anywhere from 45 to 35 mya (3, 4). The virtual protein encoded by the pseudogene has a pI of 5.87. In spite of the charge differences between human apoC-I and the virtual protein, alignment of the sequences reveals that, of the 57 amino acids, 38 are invariant and 9 are conserved. Because only the basic form exists in humans, the suffix B is not used for the human protein.
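The degree of conservation quoted above can be restated as percentages. The following is a minimal sketch that uses only the counts given in the text (57 residues, 38 invariant, 9 conserved); the function and variable names are illustrative and do not come from the cited studies.

```python
# Counts taken from the text: length of the mature protein and the
# numbers of invariant and conserved positions in the alignment.
TOTAL_RESIDUES = 57
INVARIANT = 38   # identical residue in both sequences
CONSERVED = 9    # conservative substitution


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


identity = percent(INVARIANT, TOTAL_RESIDUES)
similarity = percent(INVARIANT + CONSERVED, TOTAL_RESIDUES)

print(f"identity: {identity}%")      # invariant positions only
print(f"similarity: {similarity}%")  # invariant plus conserved positions
```

On these counts, roughly two thirds of the positions are identical and a little over four fifths are identical or conserved.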
The following is a summary of what is currently known about the two forms of apoC-I and their respective genes in the other great apes (1). Unfortunately, there are major gaps in the intronic and exonic regions in the apolipoprotein gene cluster on chromosome 19 of the great apes. Partial genomic data for apoC-IB exist only for the chimpanzee. Mass spectrometry data have shown that chimpanzee and bonobo apoC-IB have the same molecular masses. There are no genomic data currently for bonobo apolipoprotein genes. The chimpanzee mass data, however, were in agreement with the calculated molecular weight based on the primary sequence derived from genomic data. In the case of the orangutan, the second and third exons of apoC-IB do exist, but not the final one. There are currently extensive gaps in this region of gorilla chromosome 19 and exons of the apoC-IB gene have still to be identified. Molecular mass measurements of both forms of apoC-I will soon be carried out on the orangutan and the gorilla.
There is slightly more information for apoC-IA. Instead of the stop codon, TAG, in the third exon, the other great apes have CAG, one of the codons for glutamine. The molecular mass values of apoC-IA are the same for the chimpanzee and the bonobo. Top-down sequencing of bonobo apoC-IA by mass spectrometry revealed a primary sequence identical to that of the chimpanzee apolipoprotein derived from genomic data. A primary sequence can also be derived from the gorilla and orangutan genes. The chimpanzee, gorilla and orangutan genes all encode acidic apolipoproteins, with pI values of 4.82, 4.65 and 4.79, respectively.
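The difference between the great ape codon and the human stop codon is a single nucleotide. The sketch below illustrates this with a deliberately tiny lookup table containing only the two codons named in the text; the helper names are hypothetical, and the assignments (CAG for glutamine, TAG as a stop codon) are those of the standard genetic code.

```python
# Only the two codons discussed in the text, per the standard genetic code.
CODON_TABLE = {"CAG": "Gln", "TAG": "Stop"}


def translate_codon(codon: str) -> str:
    """Translate one of the two codons covered by this toy table."""
    return CODON_TABLE[codon.upper()]


def point_differences(a: str, b: str) -> list[tuple[int, str, str]]:
    """List (1-based position, base in a, base in b) wherever a and b differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


ape_codon = "CAG"    # codon in the apoC-IA gene of the other great apes
human_codon = "TAG"  # codon at the same site in the human pseudogene

print(translate_codon(ape_codon), "->", translate_codon(human_codon))
print(point_differences(ape_codon, human_codon))
```

The only difference is at the first codon position, C in the other great apes and T in humans.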
Using the genome browser of the University of California at Santa Cruz to locate the existing coordinates on chromosome 19, it is possible to estimate the size of the coding regions of the two apoC-I genes, measured from the start of the second exon to the end of the fourth exon. For the upstream gene of humans and chimpanzees, this region is approximately 4.3 kb. For the downstream gene, it is 4 kb in humans and chimpanzees and 4.4 kb in the orangutan. In the Ensembl database, however, the size of this coding region for orangutan apoC-I is listed as 17 kb. On examination, it is clear that this results from combining Exons 2 and 3 of the upstream gene, along with the intervening sequences, with Exon 4 of the downstream gene. In so doing, Exons 2 and 3 of the downstream gene, with starting coordinates of 46168619 and 46157794 respectively, well within the intervening sequences, are ignored. As a result, the orangutan gene for apoC-I as listed is about four times larger than either the upstream or the downstream gene. Perhaps this resulted from comparing the orangutan sequences to the apoC-I gene in the human reference sequence and ignoring the pseudogene.
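The "four times larger" comparison follows directly from the sizes quoted above. A minimal sketch of the arithmetic, using only the approximate kilobase values from the text (the dictionary keys are illustrative labels, not database identifiers):

```python
# Approximate sizes, in kb, of the region from the start of Exon 2 to the
# end of Exon 4, as quoted in the text.
ENSEMBL_ORANGUTAN_KB = 17.0
REFERENCE_SIZES_KB = {
    "upstream gene (human, chimpanzee)": 4.3,
    "downstream gene (orangutan)": 4.4,
}

# Ratio of the Ensembl-listed size to each browser-derived size.
ratios = {
    label: round(ENSEMBL_ORANGUTAN_KB / size_kb, 2)
    for label, size_kb in REFERENCE_SIZES_KB.items()
}

for label, ratio in ratios.items():
    print(f"Ensembl entry is {ratio}x the size of the {label}")
```

Both ratios fall just under four, consistent with the listed gene spanning parts of two adjacent genes rather than one.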
If the exons of both apoC-I genes of all the great apes are compared, one finds that the sequences of Exon 2 are at least 97% identical and those of Exon 4 are 93% identical. A high percentage identity is also seen for Exon 3 when the human apoC-I gene is compared with the apoC-IB genes of the chimpanzee and the orangutan, 99% and 96% respectively. Similarly, Exon 3 of the pseudogene has a high percentage identity to this exon in the apoC-IA genes of the other great apes. However, Exon 3 of the human apoC-I gene is only 83% identical to this exon in the pseudogene. It is these Exon 3 differences that result in the pseudogene and the apoC-IA genes of the other great apes encoding acidic apolipoproteins, which have five fewer positively charged amino acids (four lysines and an arginine) than the basic form of apoC-I. It remains to be seen whether the charge difference between the two forms of apoC-I affects their functions. Their respective functions could very well be different.
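The two kinds of comparison made above, percentage identity between aligned sequences and counts of positively charged residues, can be sketched as follows. The two 10-residue peptides are hypothetical toy sequences invented for illustration; they are not apoC-I sequences, and the function names are likewise assumptions.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of positions with identical residues in two
    pre-aligned, equal-length sequences (no gap handling)."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def positive_residues(seq: str) -> int:
    """Count lysine (K) and arginine (R) residues."""
    return seq.count("K") + seq.count("R")


# Hypothetical toy peptides, NOT real apoC-I sequences.
basic_toy = "AKLKRKEKLA"   # four lysines and one arginine
acidic_toy = "AELQDSEQLA"  # no lysine or arginine

print(f"identity: {percent_identity(basic_toy, acidic_toy):.1f}%")
print("positive residues:", positive_residues(basic_toy),
      "vs", positive_residues(acidic_toy))
```

The toy pair was chosen so that the basic peptide carries five more positively charged residues than the acidic one, mirroring the difference described in the text. Real exon or protein comparisons would first require an alignment, which this sketch does not perform.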
Although it was previously reported that the duplication of the apoC-I gene began between 45 and 35 mya, the studies reviewed above indicate that functioning genes for both apoC-IB and apoC-IA appeared around the time the orangutan diverged from the human lineage, between 16 and 21 mya. As for the conversion of the downstream gene to a pseudogene, it would have occurred after the divergence of the bonobo and the chimpanzee from the human lineage about 6 mya.
As stated above, there are significant gaps in even this relatively small region of chromosome 19. This is unfortunate because determining the function of apoC-IA, a protein synthesized in all great apes except humans, might rely in part on genomes completely free of sequence gaps. Such gaps most likely exist in other regions of the genomes of great apes. Genomic studies have already shown that intronic regions contain both splicing enhancers and silencers as well as sites for the interaction of transcription factors (5). Until these gaps are filled, questions regarding the evolution of genes and the differences in gene expression between humans and the other great apes will probably remain unanswered.

References:

1. Puppione, DL et al., Detection of two distinct forms of apoC-I in great apes. Comp. Biochem. Physiol. Part D, 5 (2010) 73–79.

2. Lauer, SJ et al., Two copies of the human apolipoprotein C-I gene are linked closely to the apolipoprotein E gene. J. Biol. Chem. 263 (1988) 7277–7286.

3. Luo, C et al., Structure and expression of dog apolipoprotein A-I, E, and C-I mRNAs: implications for the evolution and functional constraints of apolipoprotein structure. J. Lipid Res. 30 (1989) 1735–1746.

4. Pastorcic, M et al., Baboon apolipoprotein C-I: cDNA and gene structure and evolution. Genomics 13 (1992) 368–374.

5. Mercado, PA et al., Depletion of TDP 43 overrides the need for exonic and intronic splicing enhancers in the human apoA-II gene. Nucleic Acids Res. 33 (2005) 6000–6010.
 

Genetics Topic Attributes

Gene symbols follow the HUGO Gene Nomenclature Committee standard.

Gene Symbol:
apolipoprotein C-IA and apolipoprotein C-IB (apolipoprotein C-IA and apolipoprotein C-IB)
Type of Human-Specific Changes:
Gene Conversion