
      De Novo Genome Sequence Assembly of Dwarf Coconut (Cocos nucifera L. ‘Catigan Green Dwarf’) Provides Insights into Genomic Variation Between Coconut Types and Related Palm Species

      research-article


          Abstract

          We report the first whole genome sequence (WGS) assembly and annotation of a dwarf coconut variety, ‘Catigan Green Dwarf’ (CATD). The genome sequence was generated using the PacBio SMRT sequencing platform at 15X coverage of the expected genome size of 2.15 Gbp, which was corrected with assembled 50X Illumina paired-end MiSeq reads of the same genome. The draft genome was improved through Chicago sequencing to generate a scaffold assembly that results in a total genome size of 2.1 Gbp consisting of 7,998 scaffolds with N50 of 570,487 bp. The final assembly covers around 97.6% of the estimated genome size of coconut ‘CATD’ based on homozygous k-mer peak analysis. A total of 34,958 high-confidence gene models were predicted and functionally associated with various economically important traits, such as pest/disease resistance, drought tolerance, coconut oil biosynthesis, and putative transcription factors. The assembled genome was used to infer the evolutionary relationship within the palm family based on genomic variations and synteny of coding gene sequences. Data show that at least three rounds of whole genome duplication occurred and are commonly shared by these members of the Arecaceae family. A total of 7,139 unique SSR markers were designed to be used as a resource in marker-based breeding. In addition, we discovered 58,503 variants in coconut by aligning the Hainan Tall (HAT) WGS reads to the non-repetitive regions of the assembled CATD genome. The gene markers and genome-wide SSR markers established here will facilitate the development of varieties with resilience to climate change, resistance to pests and diseases, and improved oil yield and quality.
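
The scaffold N50 quoted in the abstract is a length-weighted summary statistic: it is the scaffold length at which scaffolds of that length or longer account for at least half of the total assembly length. As an illustration only, and not the authors' pipeline, a minimal Python sketch of that calculation might look like the following. The function name `n50` and the toy scaffold lengths are hypothetical and are not taken from the CATD assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    make up at least half of the summed length of all sequences.
    """
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence to the shortest, accumulating length.
    for length in sorted(lengths, reverse=True):
        running += length
        # Stop at the first sequence that brings the running sum to >= 50%.
        if 2 * running >= total:
            return length
    raise ValueError("lengths must contain at least one sequence")


# Toy example with made-up scaffold lengths (not real coconut data).
toy_scaffolds = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_scaffolds))  # 10 + 9 + 8 = 27, half of the total of 54, so N50 = 8
```

In practice the lengths would be read from the assembly FASTA or its index, and assembly-statistics tools report N50 alongside scaffold count and total length.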


                Author and article information

                Journal
                G3 (Bethesda)
                G3: Genes|Genomes|Genetics
                Genetics Society of America
                ISSN: 2160-1836
                5 June 2019
                August 2019
                Volume: 9
                Issue: 8
                Pages: 2377-2393
                Affiliations
                [*] Genetics Laboratory, Institute of Plant Breeding, College of Agriculture and Food Science, University of the Philippines Los Baños, College, Laguna, Philippines 4031
                [†] Philippine Genome Center, University of the Philippines System, Diliman, Quezon City, Philippines
                [‡] Boyce Thompson Institute, Ithaca, New York 14853
                [§] Institute of Crop Science, College of Agriculture and Food Science, University of the Philippines Los Baños, College, Laguna, Philippines 4031
                Author information
                http://orcid.org/0000-0002-0121-0048
                http://orcid.org/0000-0001-8232-7661
                http://orcid.org/0000-0001-8640-1750
                Article
                GGG_400215
                DOI: 10.1534/g3.119.400215
                PMCID: PMC6686914
                PMID: 31167834
                c4d4deb6-f5c1-4192-9e40-a865e410b76e
                Copyright © 2019 Lantican et al.

                This is an open-access article distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 31 March 2019
                Accepted: 31 May 2019
                Page count
                Figures: 5, Tables: 2, Equations: 0, References: 147, Pages: 17
                Categories
                Genome Report

                Genetics
                Cocos nucifera L., dwarf coconut, genome assembly, Illumina MiSeq sequencing, PacBio SMRT sequencing, Dovetail Chicago sequencing, hybrid assembly, SSR and SNP markers
