      Design of metalloproteins and novel protein folds using variational autoencoders

      research-article
      Scientific Reports
      Nature Publishing Group UK


          Abstract

          The design of novel proteins has many applications but remains an attritional process with success in isolated cases. Meanwhile, deep learning technologies have exploded in popularity in recent years and are increasingly applicable to biology due to the rise in available data. We attempt to link protein design and deep learning by using variational autoencoders to generate protein sequences conditioned on desired properties. Potential copper and calcium binding sites are added to non-metal binding proteins without human intervention and compared to a hidden Markov model. In another use case, a grammar of protein structures is developed and used to produce sequences for a novel protein topology. One candidate structure is found to be stable by molecular dynamics simulation. The ability of our model to confine the vast search space of protein sequences and to scale easily has the potential to assist in a variety of protein design tasks.
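          The abstract does not spell out the model architecture, so the following is only a minimal sketch of how conditioning a variational autoencoder on a desired property is commonly set up, not the authors' implementation. It assumes a flattened one-hot sequence encoding, a hypothetical two-element condition vector (for example, a copper-binding flag and a calcium-binding flag), and the standard reparameterization step; all function names are illustrative.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(sequence, length):
    """Flattened one-hot encoding, zero-padded (or truncated) to `length` residues."""
    vec = [0.0] * (length * len(AMINO_ACIDS))
    for pos, aa in enumerate(sequence[:length]):
        vec[pos * len(AMINO_ACIDS) + AMINO_ACIDS.index(aa)] = 1.0
    return vec


def conditioned_input(sequence, condition, length):
    """Encoder input: sequence encoding with the condition vector appended.

    `condition` is a hypothetical property vector, e.g. [copper_flag, calcium_flag].
    """
    return one_hot(sequence, length) + list(condition)


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1) (reparameterization trick)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def decoder_input(z, condition):
    """Decoder input: latent sample with the desired condition appended.

    At generation time the condition can differ from the one seen by the
    encoder, which is how a property might be 'added' to a sequence.
    """
    return list(z) + list(condition)


if __name__ == "__main__":
    x = conditioned_input("MKH", [1.0, 0.0], length=4)
    print(len(x))  # 4 residues * 20 letters + 2 condition values
    z = reparameterize([0.0, 0.0], [0.0, 0.0], random.Random(0))
    print(decoder_input(z, [0.0, 1.0]))
```

          In a trained model, `mu` and `log_var` would come from an encoder network and the decoder would map `decoder_input` back to per-position residue probabilities; both networks are omitted here.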

                Author and article information

                Contributors
                d.t.jones@ucl.ac.uk
                Journal
                Scientific Reports (Sci Rep)
                Nature Publishing Group UK (London)
                ISSN: 2045-2322
                Published: 1 November 2018
                Volume: 8
                Article number: 16189
                Affiliations
                [1] ISNI 0000000121901201, GRID grid.83440.3b, Department of Computer Science, University College London, Gower Street, London, WC1E 6BT, UK
                [2] ISNI 0000 0004 1795 1830, GRID grid.451388.3, Francis Crick Institute, 1 Midland Road, London, NW1 1AT, UK
                Article
                34533
                10.1038/s41598-018-34533-1
                6212568
                30385875
                707ad2b5-3ead-47f0-a858-caf5bc929b42
                © The Author(s) 2018

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

                History
                Received: 28 June 2018
                Accepted: 19 October 2018
                Funding
                Funded by: FundRef https://doi.org/10.13039/501100000781, EC | European Research Council (ERC);
                Award ID: 695558
                Funded by: FundRef https://doi.org/10.13039/501100000268, Biotechnology and Biological Sciences Research Council (BBSRC);
                Award ID: BB/M011712/1
                Categories
                Article
