
      A draft genome assembly of the solar-powered sea slug Elysia chlorotica

      data-paper


          Abstract

          Elysia chlorotica, a sacoglossan sea slug found off the East Coast of the United States, is well-known for its ability to sequester chloroplasts from its algal prey and survive by photosynthesis for up to 12 months in the absence of food supply. Here we present a draft genome assembly of E. chlorotica that was generated using a hybrid assembly strategy with Illumina short reads and PacBio long reads. The genome assembly comprised 9,989 scaffolds, with a total length of 557 Mb and a scaffold N50 of 442 kb. BUSCO assessment indicated that 93.3% of the expected metazoan genes were completely present in the genome assembly. Annotation of the E. chlorotica genome assembly identified 176 Mb (32.6%) of repetitive sequences and a total of 24,980 protein-coding genes. We anticipate that the annotated draft genome assembly of the E. chlorotica sea slug will promote the investigation of sacoglossan genetics, evolution, and particularly, the genetic signatures accounting for the long-term functioning of algal chloroplasts in an animal.
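          The scaffold N50 quoted in the abstract is a standard contiguity statistic: the length L such that scaffolds of length L or longer together cover at least half of the total assembly length. A minimal Python sketch of that definition follows. The function name `n50` and the toy scaffold lengths are illustrative assumptions, not the authors' pipeline or data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the sequence at which the running total,
    taken from longest to shortest, first reaches half of the summed length.
    """
    lengths = list(lengths)
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare 2 * running with total to avoid fractional halves.
        if 2 * running >= total:
            return length
    raise ValueError("n50() needs at least one sequence length")


# Hypothetical scaffold lengths (arbitrary units), not from the E. chlorotica assembly.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))
```

          Applied to the real assembly, the input would be the lengths of all 9,989 scaffolds; a higher N50 for the same total length indicates a more contiguous assembly.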


                Author and article information

                Journal
                Scientific Data (Sci Data)
                Nature Publishing Group
                ISSN: 2052-4463
                Published: 19 February 2019
                Volume: 6
                Article number: 190022
                Affiliations
                [1] Department of Computer Science, City University of Hong Kong, Hong Kong 999077, China
                [2] BGI-Shenzhen, Shenzhen 518083, China
                [3] State Key Laboratory of Genetic Resources and Evolution, Kunming Institute of Zoology, Chinese Academy of Sciences, Kunming 650223, China
                [4] Center for Excellence in Animal Evolution and Genetics, Chinese Academy of Sciences, Kunming 650223, China
                [5] BGI Genomics, BGI-Shenzhen, Shenzhen 518083, China
                [6] Department of Biology, Ave Maria University, Ave Maria, Florida 34142, USA
                [7] Section for Evolutionary Genomics, Natural History Museum of Denmark, University of Copenhagen, Copenhagen 1350, Denmark
                [8] National Institute for Basic Biology, Okazaki 444-8585, Japan
                [9] Department of Integrative Biology, University of South Florida, Tampa, Florida 33620, USA
                [10] Graduate University for Advanced Studies (SOKENDAI), Okazaki 444-8585, Japan
                [11] Advanced Science Research Center, Kanazawa University, Kanazawa 920-0934, Japan
                [12] James D. Watson Institute of Genome Sciences, Hangzhou 310058, China
                [13] Department of Biology, University of Maryland, College Park, Maryland 20742, USA
                Author notes
                [a] S.L. (email: shuaicli@cityu.edu.hk)
                [b] S.K.P. (email: pierce@usf.edu)
                [c] J.W. (email: wangjian@genomics.cn)
                [*] These authors contributed equally to this work.

                J.W. and S.P. conceived the project. Q.L. and S.L. supervised the project. N.C. and J.S. collected the sea slug samples and performed DNA extraction. T.S., T.M., S.S., T.N. and M.H. carried out PacBio library construction and sequencing. H.C. performed genome assembly, repeat annotation, gene annotation, and gene function annotation. J.L., M.F. and A.A. conducted the assembly quality assessment and other analyses. H.Y., X.D. and N.L. contributed reagents/materials/analysis tools. Q.L. and A.A. drafted the manuscript. S.P. and N.C. revised the manuscript. All authors read and approved the final manuscript.

                Author information
                http://orcid.org/0000-0002-5945-5902
                http://orcid.org/0000-0001-7061-3337
                http://orcid.org/0000-0001-5258-8043
                http://orcid.org/0000-0003-4185-0135
                http://orcid.org/0000-0003-1279-7806
                http://orcid.org/0000-0001-7425-8758
                Article
                sdata201922
                10.1038/sdata.2019.22
                6380222
                30778257
                98e3000b-efba-4210-9e75-658ce1443a14
                Copyright © 2019, The Author(s)

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/ The Creative Commons Public Domain Dedication waiver http://creativecommons.org/publicdomain/zero/1.0/ applies to the metadata files made available in this article.

                History
                Received: 13 September 2018
                Accepted: 10 January 2019
                Categories
                Data Descriptor

                genome, dna sequencing, marine biology, genomics, evolution
