Principles of long noncoding RNA evolution derived from direct comparison of transcriptomes in 17 species.

Publication Type: Journal Article
Authors: Hezroni, Hadas; Koppstein, David; Schwartz, Matthew G; Avrutin, Alexandra; Bartel, David P; Ulitsky, Igor
Year of Publication: 2015
Journal: Cell Rep
Volume: 11
Issue: 7
Pagination: 1110-1122
Date Published: 2015 May 19
Publication Language: eng
ISSN: 2211-1247
Keywords: Base Sequence; Conserved Sequence; Evolution, Molecular; Humans; RNA, Long Noncoding; Sequence Analysis, RNA; Transcriptome

The inability to predict long noncoding RNAs from genomic sequence has impeded the use of comparative genomics for studying their biology. Here, we develop methods that use RNA sequencing (RNA-seq) data to annotate the transcriptomes of 16 vertebrates and the echinoid sea urchin, uncovering thousands of previously unannotated genes, most of which produce long intervening noncoding RNAs (lincRNAs). Although in each species, >70% of lincRNAs cannot be traced to homologs in species that diverged >50 million years ago, thousands of human lincRNAs have homologs with similar expression patterns in other species. These homologs share short, 5'-biased patches of sequence conservation nested in exonic architectures that have been extensively rewired, in part by transposable element exonization. Thus, over a thousand human lincRNAs are likely to have conserved functions in mammals, and hundreds beyond mammals, but those functions require only short patches of specific sequences and can tolerate major changes in gene architecture.

DOI: 10.1016/j.celrep.2015.04.023