      DNA Data Bank of Japan (DDBJ) update report 2022

      research-article

          Abstract

          The Bioinformation and DNA Data Bank of Japan (DDBJ) Center (https://www.ddbj.nig.ac.jp) maintains database archives that cover a wide range of fields in life sciences. As a founding member of the International Nucleotide Sequence Database Collaboration (INSDC), our primary mission is to collect and distribute nucleotide sequence data, as well as their study and sample information, in collaboration with the National Center for Biotechnology Information in the United States and the European Bioinformatics Institute. In addition to INSDC resources, the Center operates databases for functional genomics (GEA: Genomic Expression Archive), metabolomics (MetaboBank), and human genetic and phenotypic data (JGA: Japanese Genotype–Phenotype Archive). These databases are built on the supercomputer of the National Institute of Genetics, whose remaining computational capacity is actively utilized by domestic researchers for large-scale biological data analyses. Here, we report our recent updates and the activities of our services.


                Author and article information

                Contributors
                Journal
                Nucleic Acids Res
                nar
                Nucleic Acids Research
                Oxford University Press
                0305-1048
                1362-4962
                06 January 2023
                24 November 2022
                24 November 2022
                Volume: 51
                Issue: D1
                Pages: D101-D105
                Affiliations
                Bioinformation and DDBJ Center, National Institute of Genetics, Mishima, Shizuoka 411-8540, Japan
                Author notes
                To whom correspondence should be addressed. Tel: +55 981 6859; Fax: +55 981 6889; Email: ytanizaw@nig.ac.jp
                Author information
                https://orcid.org/0000-0002-6294-3309
                https://orcid.org/0000-0001-7691-9812
                https://orcid.org/0000-0003-4138-1893
                https://orcid.org/0000-0002-6782-5715
                Article
                gkac1083
                DOI: 10.1093/nar/gkac1083
                PMCID: PMC9825463
                PMID: 36420889
                © The Author(s) 2022. Published by Oxford University Press on behalf of Nucleic Acids Research.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                : 22 November 2022
                : 24 October 2022
                : 07 October 2022
                Page count
                Pages: 5
                Funding
                Funded by: Ministry of Education, Culture, Sports, Science and Technology, DOI 10.13039/501100001700;
                Funded by: Japan Science and Technology Agency, DOI 10.13039/501100002241;
                Award ID: JPMJCR1501
                Funded by: Database Integration Coordination Program of NBDC for MetaboBank;
                Funded by: AMED, DOI 10.13039/100009619;
                Award ID: 20gm1010006h0004
                Categories
                AcademicSubjects/SCI00010
                Database Issue

                Genetics
