Long interspersed nuclear element
Long interspersed nuclear elements (LINEs)[1] (also known as long interspersed nucleotide elements[2] or long interspersed elements[3]) are a group of non-LTR (long terminal repeat) retrotransposons that are widespread in the genome of many eukaryotes.[4][5] LINEs contain an internal Pol II promoter to initiate transcription into mRNA, and encode one or two proteins, ORF1 and ORF2.[6] The functional domains present within ORF1 vary greatly among LINEs, but often exhibit RNA/DNA binding activity. ORF2 is essential to successful retrotransposition, and encodes a protein with both reverse transcriptase and endonuclease activity.[7]
LINEs are the most abundant transposable element within the human genome,[8] with approximately 20.7% of the sequences identified as being derived from LINEs. The only active lineage of LINE found within humans belongs to the LINE-1 class, and is referred to as L1Hs.[9] The human genome contains an estimated 100,000 truncated and 4,000 full-length LINE-1 elements.[10] Due to the accumulation of random mutations, the sequence of many LINEs has degenerated to the extent that they are no longer transcribed or translated. Comparisons of LINE DNA sequences can be used to date transposon insertions in the genome.
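One simple way to turn such a sequence comparison into an age estimate is to measure how far an inserted copy has diverged from its family consensus and divide that distance by a substitution rate. The following is a minimal sketch under stated assumptions: the two sequences are already aligned and of equal length, substitutions follow the Jukes-Cantor model, and the neutral substitution rate is supplied by the caller. The function names are illustrative and are not taken from the cited sources.

```python
import math


def jukes_cantor_distance(seq_a: str, seq_b: str) -> float:
    """Jukes-Cantor corrected distance between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Compare only columns where both sequences carry an unambiguous base.
    pairs = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in "ACGT" and b in "ACGT"
    ]
    if not pairs:
        raise ValueError("no comparable positions")
    p = sum(a != b for a, b in pairs) / len(pairs)  # observed mismatch fraction
    if p >= 0.75:
        raise ValueError("sequences too divergent for the Jukes-Cantor model")
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def estimate_insertion_age(copy_seq: str, consensus_seq: str,
                           rate_per_site_per_year: float) -> float:
    """Approximate years since insertion: corrected distance / assumed rate.

    The rate is an assumption supplied by the caller, not a measured value.
    """
    return jukes_cantor_distance(copy_seq, consensus_seq) / rate_per_site_per_year
```

The estimate depends on the chosen rate and on how well the consensus approximates the sequence at the time of insertion, so it is best read as an order-of-magnitude figure.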
History of discovery
The first description of an approximately 6.4 kb long LINE-derived sequence was published by J. Adams et al. in 1980.[11]
Classification of LINEs
Based on structural features and the phylogeny of the essential protein ORF2p, LINEs can be separated into six main groups, referred to as R2, RanI, L1, RTE, I and Jockey. These groups can further be subdivided into at least 28 clades.[12]
In plant genomes, so far only LINEs of the L1 and RTE clades have been reported.[13][14][15] Whereas L1 elements diversify into several subclades, RTE-type LINEs are highly conserved, often constituting a single family.[16][17]
In fungi, Tad, L1, CRE, Deceiver and Inkcap-like elements have been identified,[18] with Tad-like elements appearing exclusively in fungal genomes.[19]
All LINEs encode at least one protein, ORF2, which contains a reverse transcriptase (RT) domain and an endonuclease (EN) domain; the EN domain is either an N-terminal APE or a C-terminal RLE, or rarely both. A ribonuclease H domain is occasionally present. Except for the evolutionarily ancient R2 and RTE superfamilies, LINEs usually encode another protein named ORF1, which may contain a Gag-knuckle, an L1-like RRM (InterPro: IPR035300), and/or an esterase. LINE elements are relatively rare compared to LTR-retrotransposons in plants, fungi or insects, but are dominant in vertebrates and especially in mammals, where they represent around 20% of the genome.[12]: fig. 1
L1 elements
The LINE-1/L1-element is one of the elements that are still active in the human genome today. It is found in all therian mammals[20][21] except megabats.[22]
Other elements
Remnants of L2 and L3 elements are found in the human genome.[23] It is estimated that L2 and L3 elements were active ~200-300 million years ago. Due to the age of L2 elements found within therian genomes, they lack flanking target site duplications.[24] The L2 (and L3) elements are in the same group as the CR1 clade, Jockey.[25]
Incidence
In human
In the first human genome draft, the fraction of the human genome made up of LINE elements was given as 21% and their copy number as 850,000. Of these, L1, L2 and L3 elements made up 516,000, 315,000 and 37,000 copies, respectively. The non-autonomous SINE elements, which depend on L1 elements for their proliferation, make up 13% of the human genome and have a copy number of around 1.5 million.[23] They probably originated from the RTE family of LINEs.[26] Recent estimates show that the typical human genome contains on average 100 L1 elements with potential for mobilization; however, there is a fair amount of variation, and some individuals may carry a larger number of active L1 elements, making them more prone to L1-induced mutagenesis.[27]
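As a quick arithmetic check on the copy numbers quoted above, the per-family counts can be summed and expressed as shares of all LINE copies. This is only a small illustrative sketch; the dictionary and function names are made up for the example, and the counts are the draft-genome figures given in the text.

```python
# Copy numbers for the three LINE families as quoted from the first genome draft.
LINE_COPY_NUMBERS = {"L1": 516_000, "L2": 315_000, "L3": 37_000}


def family_shares(counts: dict[str, int]) -> tuple[int, dict[str, float]]:
    """Return the total copy number and each family's share of that total."""
    total = sum(counts.values())
    return total, {name: n / total for name, n in counts.items()}


total, shares = family_shares(LINE_COPY_NUMBERS)
print(total)  # → 868000
for name, share in shares.items():
    print(f"{name}: {share:.1%}")  # → L1: 59.4%, L2: 36.3%, L3: 4.3%
```

The three family counts sum to 868,000, which is close to the overall figure of 850,000 quoted for the draft.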
Increased L1 copy numbers have also been found in the brains of people with schizophrenia, indicating that LINE elements may play a role in some neuronal diseases.[28] Long-read sequencing approaches can significantly improve the detection of L1Hs insertions present in individual genomes compared to standard short-read sequencing approaches.[29]
Propagation
LINE elements propagate by a so-called target primed reverse transcription mechanism (TPRT), which was first described for the R2 element from the silkworm Bombyx mori.
ORF2 (and ORF1 when present) proteins primarily associate in cis with their encoding mRNA, forming a ribonucleoprotein (RNP) complex, likely composed of two ORF2s and an unknown number of ORF1 trimers.[30] The complex is transported back into the nucleus, where the ORF2 endonuclease domain opens the DNA (at TTAAAA hexanucleotide motifs in mammals[31]). Thus, a 3'OH group is freed for the reverse transcriptase to prime reverse transcription of the LINE RNA transcript. Following reverse transcription, the target strand is cleaved and the newly created cDNA is integrated.[32]
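To illustrate the sequence preference mentioned above, the sketch below scans a DNA string for the TTAAAA hexanucleotide on either strand. It is a plain exact-match search under simplifying assumptions: it says nothing about which phosphodiester bond is nicked or about how degenerate the motif may be in practice, and the function names are illustrative.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_motif_sites(seq: str, motif: str = "TTAAAA") -> list[tuple[int, str]]:
    """List (0-based start, strand) for exact motif matches.

    "+" means the motif reads 5'->3' on the given strand; "-" means it
    reads 5'->3' on the complementary strand.
    """
    seq = seq.upper()
    rc_motif = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        elif window == rc_motif:
            hits.append((i, "-"))
    return hits


print(find_motif_sites("GGTTAAAACC"))  # → [(2, '+')]
print(find_motif_sites("GGTTTTAACC"))  # → [(2, '-')]
```

A hit only marks a candidate site; the presence of the motif does not imply that an insertion has occurred or will occur there.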
New insertions create short target site duplications (TSDs). The majority of new inserts are severely 5'-truncated (average insert size of 900 bp in humans) and often inverted (Szak et al., 2002). Because they lack their 5' UTR, most new inserts are non-functional.
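Because a target site duplication appears as the same short sequence directly before and directly after an insert, it can be looked for by comparing the end of the upstream flank with the start of the downstream flank. The sketch below does this with exact matching only; the function name and the length bounds are assumptions chosen for illustration, and the example flanks are invented.

```python
def find_tsd(upstream: str, downstream: str,
             min_len: int = 2, max_len: int = 30) -> str:
    """Return the longest sequence that ends `upstream` and starts `downstream`.

    `upstream` is the genomic sequence immediately 5' of the insert and
    `downstream` the sequence immediately 3' of it. An empty string is
    returned when no shared sequence of at least `min_len` bases exists.
    """
    upstream, downstream = upstream.upper(), downstream.upper()
    limit = min(max_len, len(upstream), len(downstream))
    # Try the longest candidate first so the first match is the longest one.
    for k in range(limit, min_len - 1, -1):
        if upstream[-k:] == downstream[:k]:
            return upstream[-k:]
    return ""


# Invented flanks sharing a 9-base duplication around a hypothetical insert.
print(find_tsd("GGCCAAGTTCATG", "AAGTTCATGTTCC"))  # → AAGTTCATG
```

Exact matching keeps the example short; a duplication that has since acquired a mutation in one copy would be missed or reported as shorter than it is.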
Regulation of LINE activity
It has been shown that host cells regulate L1 retrotransposition activity, for example through epigenetic silencing. Small interfering RNAs derived from L1 sequences can likewise suppress L1 retrotransposition through the RNA interference (RNAi) mechanism.[33]
In plant genomes, epigenetic modification of LINEs can lead to expression changes of nearby genes and even to phenotypic changes: In the oil palm genome, methylation of a Karma-type LINE underlies the somaclonal, 'mantled' variant of this plant, responsible for drastic yield loss.[34]
Restriction of LINE-1 elements by human APOBEC3C (A3C) has been reported; it results from an interaction between A3C and ORF1p that affects reverse transcriptase activity.[35]
Association with disease
A historic example of L1-conferred disease is Haemophilia A, which can be caused by L1 insertional mutagenesis.[36] There are nearly 100 examples of known diseases caused by retroelement insertions, including some types of cancer and neurological disorders.[37] Correlation between L1 mobilization and oncogenesis has been reported for epithelial cell cancer (carcinoma).[38] Hypomethylation of LINEs is associated with chromosomal instability and altered gene expression[39] and is found in various cancer cell types in various tissue types.[40][39] Hypomethylation of a specific L1 located in the MET oncogene is associated with bladder cancer tumorigenesis.[41] Shift work sleep disorder[42] is associated with increased cancer risk because light exposure at night reduces melatonin, a hormone that has been shown to reduce L1-induced genome instability.[43]
References
- Ewing AD, Kazazian HH (June 2011). "Whole-genome resequencing allows detection of many rare LINE-1 insertion alleles in humans". Genome Research. 21 (6): 985–990. doi:10.1101/gr.114777.110. PMC 3106331. PMID 20980553.
- Huang X, Su G, Wang Z, Shangguan S, Cui X, Zhu J, et al. (March 2014). "Hypomethylation of long interspersed nucleotide element-1 in peripheral mononuclear cells of juvenile systemic lupus erythematosus patients in China". International Journal of Rheumatic Diseases. 17 (3): 280–290. doi:10.1111/1756-185X.12239. PMID 24330152. S2CID 6530689.
- Rodić N, Burns KH (March 2013). "Long interspersed element-1 (LINE-1): passenger or driver in human neoplasms?". PLOS Genetics. 9 (3): e1003402. doi:10.1371/journal.pgen.1003402. PMC 3610623. PMID 23555307.
- Singer MF (March 1982). "SINEs and LINEs: highly repeated short and long interspersed sequences in mammalian genomes". Cell. 28 (3): 433–434. doi:10.1016/0092-8674(82)90194-5. PMID 6280868. S2CID 22129236.
- Jurka J (June 1998). "Repeats in genomic DNA: mining and meaning". Current Opinion in Structural Biology. 8 (3): 333–337. doi:10.1016/S0959-440X(98)80067-5. PMID 9666329.
- Feng Q, Moran JV, Kazazian HH, Boeke JD (November 1996). "Human L1 retrotransposon encodes a conserved endonuclease required for retrotransposition". Cell. 87 (5): 905–916. doi:10.1016/s0092-8674(00)81997-2. PMID 8945517. S2CID 17897241.
- Eickbush TH, Jamburuthugoda VK (June 2008). "The diversity of retrotransposons and the properties of their reverse transcriptases". Virus Research. 134 (1–2): 221–234. doi:10.1016/j.virusres.2007.12.010. PMC 2695964. PMID 18261821.
- Nurk S, Koren S, Rhie A, Rautiainen M, Bzikadze AV, Mikheenko A, et al. (April 2022). "The complete sequence of a human genome". Science. 376 (6588): 44–53. Bibcode:2022Sci...376...44N. doi:10.1126/science.abj6987. PMC 9186530. PMID 35357919.
- McMillan JP, Singer MF (December 1993). "Translation of the human LINE-1 element, L1Hs". Proceedings of the National Academy of Sciences of the United States of America. 90 (24): 11533–11537. Bibcode:1993PNAS...9011533M. doi:10.1073/pnas.90.24.11533. PMC 48018. PMID 8265584.
- Sheen FM, Sherry ST, Risch GM, Robichaux M, Nasidze I, Stoneking M, et al. (October 2000). "Reading between the LINEs: human genomic variation induced by LINE-1 retrotransposition". Genome Research. 10 (10): 1496–1508. doi:10.1101/gr.149400. PMC 310943. PMID 11042149.
- Adams JW, Kaufman RE, Kretschmer PJ, Harrison M, Nienhuis AW (December 1980). "A family of long reiterated DNA sequences, one copy of which is next to the human beta globin gene". Nucleic Acids Research. 8 (24): 6113–6128. doi:10.1093/nar/8.24.6113. PMC 328076. PMID 6258162.
- Kapitonov VV, Tempel S, Jurka J (December 2009). "Simple and fast classification of non-LTR retrotransposons based on phylogeny of their RT domain protein sequences". Gene. 448 (2): 207–213. doi:10.1016/j.gene.2009.07.019. PMC 2829327. PMID 19651192.
- Heitkam T, Schmidt T (September 2009). "BNR - a LINE family from Beta vulgaris - contains a RRM domain in open reading frame 1 and defines a L1 sub-clade present in diverse plant genomes". The Plant Journal. 59 (6): 872–882. doi:10.1111/j.1365-313x.2009.03923.x. PMID 19473321.
- Zupunski V, Gubensek F, Kordis D (October 2001). "Evolutionary dynamics and evolutionary history in the RTE clade of non-LTR retrotransposons". Molecular Biology and Evolution. 18 (10): 1849–1863. doi:10.1093/oxfordjournals.molbev.a003727. PMID 11557792.
- Komatsu M, Shimamoto K, Kyozuka J (August 2003). "Two-step regulation and continuous retrotransposition of the rice LINE-type retrotransposon Karma". The Plant Cell. 15 (8): 1934–1944. doi:10.1105/tpc.011809. PMC 167180. PMID 12897263.
- Heitkam T, Holtgräwe D, Dohm JC, Minoche AE, Himmelbauer H, Weisshaar B, Schmidt T (August 2014). "Profiling of extensively diversified plant LINEs reveals distinct plant-specific subclades". The Plant Journal. 79 (3): 385–397. doi:10.1111/tpj.12565. PMID 24862340.
- Smyshlyaev G, Voigt F, Blinov A, Barabas O, Novikova O (December 2013). "Acquisition of an Archaea-like ribonuclease H domain by plant L1 retrotransposons supports modular evolution". Proceedings of the National Academy of Sciences of the United States of America. 110 (50): 20140–20145. Bibcode:2013PNAS..11020140S. doi:10.1073/pnas.1310958110. PMC 3864347. PMID 24277848.
- Novikova O, Fet V, Blinov A (February 2009). "Non-LTR retrotransposons in fungi". Functional & Integrative Genomics. 9 (1): 27–42. doi:10.1007/s10142-008-0093-8. PMID 18677522. S2CID 23319640.
- Malik HS, Burke WD, Eickbush TH (June 1999). "The age and evolution of non-LTR retrotransposable elements". Molecular Biology and Evolution. 16 (6): 793–805. doi:10.1093/oxfordjournals.molbev.a026164. PMID 10368957.
- Warren WC, Hillier LW, Marshall Graves JA, Birney E, Ponting CP, Grützner F, et al. (May 2008). "Genome analysis of the platypus reveals unique signatures of evolution". Nature. 453 (7192): 175–183. Bibcode:2008Natur.453..175W. doi:10.1038/nature06936. PMC 2803040. PMID 18464734.
- Ivancevic AM, Kortschak RD, Bertozzi T, Adelson DL (July 2018). "Horizontal transfer of BovB and L1 retrotransposons in eukaryotes". Genome Biology. 19 (1): 85. doi:10.1186/s13059-018-1456-7. PMC 6036668. PMID 29983116.
- Smith JD, Gregory TR (June 2009). "The genome sizes of megabats (Chiroptera: Pteropodidae) are remarkably constrained". Biology Letters. 5 (3): 347–351. doi:10.1098/rsbl.2009.0016. PMC 2679926. PMID 19324635.
- Lander ES, Linton LM, Birren B, Nusbaum C, Zody MC, Baldwin J, et al. (February 2001). "Initial sequencing and analysis of the human genome". Nature. 409 (6822): 860–921. Bibcode:2001Natur.409..860L. doi:10.1038/35057062. hdl:2027.42/62798. PMID 11237011.
- Kapitonov VV, Pavlicek A, Jurka J (January 2006). "Anthology of Human Repetitive DNA". Encyclopedia of Molecular Cell Biology and Molecular Medicine. Wiley-VCH Verlag GmbH & Co. KGaA. doi:10.1002/3527600906.mcb.200300166. ISBN 9783527600908.
- Lovsin N, Gubensek F, Kordis D (December 2001). "Evolutionary dynamics in a novel L2 clade of non-LTR retrotransposons in Deuterostomia". Molecular Biology and Evolution. 18 (12): 2213–2224. doi:10.1093/oxfordjournals.molbev.a003768. PMID 11719571.
- Malik HS, Eickbush TH (September 1998). "The RTE class of non-LTR retrotransposons is widely distributed in animals and is the origin of many SINEs". Molecular Biology and Evolution. 15 (9): 1123–1134. doi:10.1093/oxfordjournals.molbev.a026020. PMID 9729877.
- Streva VA, Jordan VE, Linker S, Hedges DJ, Batzer MA, Deininger PL (March 2015). "Sequencing, identification and mapping of primed L1 elements (SIMPLE) reveals significant variation in full length L1 elements between individuals". BMC Genomics. 16 (1): 220. doi:10.1186/s12864-015-1374-y. PMC 4381410. PMID 25887476.
- Bundo M, Toyoshima M, Okada Y, Akamatsu W, Ueda J, Nemoto-Miyauchi T, et al. (January 2014). "Increased l1 retrotransposition in the neuronal genome in schizophrenia". Neuron. 81 (2): 306–313. doi:10.1016/j.neuron.2013.10.053. PMID 24389010.
- Zhou W, Emery SB, Flasch DA, Wang Y, Kwan KY, Kidd JM, Moran JV, Mills RE (February 2020). "Identification and characterization of occult human-specific LINE-1 insertions using long-read sequencing technology". Nucleic Acids Research. 48 (3): 1146–1163. doi:10.1093/nar/gkz1173. ISSN 1362-4962. PMC 7026601. PMID 31853540.
- Babushok DV, Ostertag EM, Courtney CE, Choi JM, Kazazian HH (February 2006). "L1 integration in a transgenic mouse model". Genome Research. 16 (2): 240–250. doi:10.1101/gr.4571606. PMC 1361720. PMID 16365384.
- Jurka J (March 1997). "Sequence patterns indicate an enzymatic involvement in integration of mammalian retroposons". Proceedings of the National Academy of Sciences of the United States of America. 94 (5): 1872–1877. Bibcode:1997PNAS...94.1872J. doi:10.1073/pnas.94.5.1872. PMC 20010. PMID 9050872.
- Luan DD, Korman MH, Jakubczak JL, Eickbush TH (February 1993). "Reverse transcription of R2Bm RNA is primed by a nick at the chromosomal target site: a mechanism for non-LTR retrotransposition". Cell. 72 (4): 595–605. doi:10.1016/0092-8674(93)90078-5. PMID 7679954. S2CID 42587840.
- Yang N, Kazazian HH (September 2006). "L1 retrotransposition is suppressed by endogenously encoded small interfering RNAs in human cultured cells". Nature Structural & Molecular Biology. 13 (9): 763–771. doi:10.1038/nsmb1141. PMID 16936727. S2CID 32601334.
- Ong-Abdullah M, Ordway JM, Jiang N, Ooi SE, Kok SY, Sarpan N, et al. (September 2015). "Loss of Karma transposon methylation underlies the mantled somaclonal variant of oil palm". Nature. 525 (7570): 533–537. Bibcode:2015Natur.525..533O. doi:10.1038/nature15365. PMC 4857894. PMID 26352475.
- Horn AV, Klawitter S, Held U, Berger A, Vasudevan AA, Bock A, et al. (January 2014). "Human LINE-1 restriction by APOBEC3C is deaminase independent and mediated by an ORF1p interaction that affects LINE reverse transcriptase activity". Nucleic Acids Research. 42 (1): 396–416. doi:10.1093/nar/gkt898. PMC 3874205. PMID 24101588.
- Kazazian HH, Wong C, Youssoufian H, Scott AF, Phillips DG, Antonarakis SE (March 1988). "Haemophilia A resulting from de novo insertion of L1 sequences represents a novel mechanism for mutation in man". Nature. 332 (6160): 164–166. Bibcode:1988Natur.332..164K. doi:10.1038/332164a0. PMID 2831458. S2CID 4259071.
- Solyom S, Kazazian HH (February 2012). "Mobile elements in the human genome: implications for disease". Genome Medicine. 4 (2): 12. doi:10.1186/gm311. PMC 3392758. PMID 22364178.
- Carreira PE, Richardson SR, Faulkner GJ (January 2014). "L1 retrotransposons, cancer stem cells and oncogenesis". The FEBS Journal. 281 (1): 63–73. doi:10.1111/febs.12601. PMC 4160015. PMID 24286172.
- Kitkumthorn N, Mutirangura A (August 2011). "Long interspersed nuclear element-1 hypomethylation in cancer: biology and clinical applications". Clinical Epigenetics. 2 (2): 315–330. doi:10.1007/s13148-011-0032-8. PMC 3365388. PMID 22704344.
- Estécio MR, Gharibyan V, Shen L, Ibrahim AE, Doshi K, He R, et al. (May 2007). "LINE-1 hypomethylation in cancer is highly variable and inversely correlated with microsatellite instability". PLOS ONE. 2 (5): e399. Bibcode:2007PLoSO...2..399E. doi:10.1371/journal.pone.0000399. PMC 1851990. PMID 17476321.
- Wolff EM, Byun HM, Han HF, Sharma S, Nichols PW, Siegmund KD, et al. (April 2010). "Hypomethylation of a LINE-1 promoter activates an alternate transcript of the MET oncogene in bladders with cancer". PLOS Genetics. 6 (4): e1000917. doi:10.1371/journal.pgen.1000917. PMC 2858672. PMID 20421991.
- Spadafora C (April 2015). "A LINE-1-encoded reverse transcriptase-dependent regulatory mechanism is active in embryogenesis and tumorigenesis". Annals of the New York Academy of Sciences. 1341 (1): 164–171. Bibcode:2015NYASA1341..164S. doi:10.1111/nyas.12637. PMID 25586649. S2CID 22881053.
- deHaro D, Kines KJ, Sokolowski M, Dauchy RT, Streva VA, Hill SM, et al. (July 2014). "Regulation of L1 expression and retrotransposition by melatonin and its receptor: implications for cancer risk associated with light exposure at night". Nucleic Acids Research. 42 (12): 7694–7707. doi:10.1093/nar/gku503. PMC 4081101. PMID 24914052.