
      Inpactor, Integrated and Parallel Analyzer and Classifier of LTR Retrotransposons and Its Application for Pineapple LTR Retrotransposons Diversity and Dynamics


          Abstract

          One particular class of transposable elements (TEs), the long terminal repeat (LTR) retrotransposons, comprises the most abundant mobile elements in plant genomes. Their copy number can vary from several hundred to a few million copies per genome, deeply affecting genome organization and function. Detailed classification of LTR retrotransposons is an essential step toward understanding their effect at the genome level, but it remains challenging in large genomes and requires optimized bioinformatics tools that can take advantage of supercomputers. Here, we propose a new tool, Inpactor: a parallel and scalable pipeline that uses High Performance Computing (HPC) techniques to classify LTR retrotransposons, identify autonomous and non-autonomous elements, build reverse transcriptase (RT)-based phylogenetic trees, and analyze insertion times. Inpactor was tested on the classification and annotation of LTR retrotransposons in pineapple, a recently sequenced genome. The pineapple genome assembly comprises 44% transposable elements, of which 23% were classified as LTR retrotransposons. Remarkably, 16.4% of the pineapple genome assembly corresponds to a single lineage of the Gypsy superfamily, Del, suggesting that this lineage has undergone a significant increase in copy number. As demonstrated for the pineapple genome, Inpactor provides comprehensive data on the classification and dynamics of LTR retrotransposons, allowing a fine-grained understanding of their contribution to genome structure and evolution. Inpactor is available at https://github.com/simonorozcoarias/Inpactor.
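          The abstract does not say how insertion times are computed. A widely used approach for LTR retrotransposons, assumed here for illustration and not taken from Inpactor's code, compares the two LTRs of one element: they are identical at insertion, so their divergence K under the Kimura two-parameter model gives an age T = K / (2r). The function names and the default rate r below are hypothetical placeholders; the rate should be chosen for the species under study.

```python
import math


def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    purines = {"A", "G"}
    pyrimidines = {"C", "T"}
    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        if {a, b} <= purines or {a, b} <= pyrimidines:
            transitions += 1  # A<->G or C<->T
        else:
            transversions += 1  # purine <-> pyrimidine
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    inner = (1 - 2 * p - q) * math.sqrt(1 - 2 * q)
    if inner <= 0:
        raise ValueError("sequences too divergent for the K2P model")
    return -0.5 * math.log(inner)


def insertion_time_years(ltr5: str, ltr3: str, rate: float = 1.3e-8) -> float:
    """Estimate insertion age as T = K / (2r).

    `rate` is substitutions per site per year; the default is an
    illustrative assumption, not a value stated in the abstract.
    """
    return k2p_distance(ltr5, ltr3) / (2 * rate)
```

          For example, `insertion_time_years(ltr5, ltr3)` on the aligned 5' and 3' LTRs of one element returns an age in years; identical LTRs give an age of zero, consistent with a very recent insertion.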


                Author and article information

                Journal
                Biology (Basel)
                MDPI
                2079-7737
                25 May 2018
                June 2018
                Volume: 7
                Issue: 2
                Article number: 32
                Affiliations
                [1] Department of Electronics and Automatization, Universidad Autónoma de Manizales, Manizales 170002, Colombia; simon.orozco.arias@gmail.com (S.O.-A.); rtabares@autonoma.edu.co (R.T.-S.)
                [2] FAFU and UIUC-SIB Joint Center for Genomics and Biotechnology, Fujian Agriculture and Forestry University, Fuzhou 350002, China; relaxljliu@sina.com (J.L.); rming@life.uiuc.edu (R.M.)
                [3] Department of Systems and Informatics, Universidad de Caldas, Manizales 170002, Colombia; diego.ceballos@ucaldas.edu.co
                [4] Department of Botany, Instituto de Biociências, Universidade Estadual Paulista, UNESP, Rio Claro, SP 13506-900, Brazil; doug@rc.unesp.br
                [5] Department of Biological Sciences, Universidad de Caldas, Manizales 170002, Colombia; neagef@gmail.com
                [6 ]Department of Plant Biology, University of Illinois at Urbana-Champaign, Champaign, IL 61801, USA
                [7 ]Institut de Recherche pour le Développement (IRD), CIRAD, Université de Montpellier, Montpellier 34394, France
                Author notes
                [*] Correspondence: romain.guyot@ird.fr
                [†] These authors contributed equally to this work.

                Author information
                https://orcid.org/0000-0001-5991-8770
                https://orcid.org/0000-0002-4978-5211
                https://orcid.org/0000-0002-1290-0853
                https://orcid.org/0000-0002-0864-8608
                https://orcid.org/0000-0002-7016-7485
                Article
                biology-07-00032
                10.3390/biology7020032
                6022998
                29799487
                b7c5d867-9973-4713-beb0-12d0215114f6
                © 2018 by the authors.

                Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license ( http://creativecommons.org/licenses/by/4.0/).

                History
                : 03 May 2018
                : 22 May 2018
                Categories
                Article

                inpactor,transposable elements,ltr retrotransposons,parallel programming,pineapple,hpc
