      Is Open Access

      Identification of Macrocyclic Peptide Families from Combinatorial Libraries Containing Noncanonical Amino Acids Using Cheminformatics and Bioinformatics Inspired Clustering

      research-article


          Abstract

          In the past decade, macrocyclic peptides have gained increasing interest as a new therapeutic modality to tackle intracellular and extracellular therapeutic targets previously classified as “undruggable”. Several technological advances have made discovering macrocyclic peptides against these targets possible: (1) the inclusion of noncanonical amino acids (NCAAs) in mRNA display, (2) increased availability of next-generation sequencing (NGS), and (3) improvements in rapid peptide synthesis platforms. Because DNA sequencing is the functional output of this platform, this type of directed-evolution-based screening can produce large numbers of potential hit sequences. The current standard for selecting hit peptides for downstream follow-up relies on frequency counting and sorting of unique peptide sequences, which can generate false negatives for technical reasons such as low translation efficiency or other experimental factors. To overcome our inability to detect weakly enriched peptide sequences in our large data sets, we sought to develop a clustering method that would enable the identification of peptide families. Unfortunately, traditional clustering algorithms such as ClustalW cannot be used for this technology because these libraries incorporate NCAAs. Therefore, we developed a new atomistic clustering method with a Pairwise Aligned Peptide (PAP) chemical similarity metric to perform sequence alignments and identify macrocyclic peptide families. With this method, weakly enriched peptides, including isolated sequences (singletons), can now be clustered into families, providing a comprehensive analysis of NGS data from macrocycle discovery selections. Additionally, once a hit peptide with the desired activity has been identified, this clustering algorithm can be used to identify derivatives from the initial data set for structure–activity relationship (SAR) analysis without requiring additional selection experiments.


          Most cited references (24)


          CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice.

          The sensitivity of the commonly used progressive multiple sequence alignment method has been greatly improved for the alignment of divergent protein sequences. Firstly, individual weights are assigned to each sequence in a partial alignment in order to down-weight near-duplicate sequences and up-weight the most divergent ones. Secondly, amino acid substitution matrices are varied at different alignment stages according to the divergence of the sequences to be aligned. Thirdly, residue-specific gap penalties and locally reduced gap penalties in hydrophilic regions encourage new gaps in potential loop regions rather than regular secondary structure. Fourthly, positions in early alignments where gaps have been opened receive locally reduced gap penalties to encourage the opening up of new gaps at these positions. These modifications are incorporated into a new program, CLUSTAL W, which is freely available.

            Extended-connectivity fingerprints.

            Extended-connectivity fingerprints (ECFPs) are a novel class of topological fingerprints for molecular characterization. Historically, topological fingerprints were developed for substructure and similarity searching. ECFPs were developed specifically for structure-activity modeling. ECFPs are circular fingerprints with a number of useful qualities: they can be very rapidly calculated; they are not predefined and can represent an essentially infinite number of different molecular features (including stereochemical information); their features represent the presence of particular substructures, allowing easier interpretation of analysis results; and the ECFP algorithm can be tailored to generate different types of circular fingerprints, optimized for different uses. While the use of ECFPs has been widely adopted and validated, a description of their implementation has not previously been presented in the literature.
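The circular-fingerprint idea summarized above can be sketched in a few lines. This is a deliberately simplified illustration, not the published ECFP algorithm: it assumes that atoms are described only by element and degree, ignores bond orders and stereochemistry, and skips the duplicate-removal step. The function names and the hashing scheme are hypothetical.

```python
import hashlib


def _hash(*parts):
    """Deterministic 32-bit identifier for a tuple of values."""
    digest = hashlib.sha256("|".join(map(str, parts)).encode()).hexdigest()
    return int(digest[:8], 16)


def circular_fingerprint(atoms, bonds, radius=2):
    """Return a set of integer features for a molecular graph.

    atoms: list of element symbols, e.g. ["C", "C", "O"]
    bonds: list of (i, j) atom-index pairs
    """
    neighbors = {i: [] for i in range(len(atoms))}
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    # Layer 0: each atom's identifier comes from its own invariants.
    ids = [_hash(atoms[i], len(neighbors[i])) for i in range(len(atoms))]
    features = set(ids)

    # Each further layer folds in the neighbors' identifiers, so a feature
    # at layer k encodes the environment within k bonds of the atom.
    for layer in range(1, radius + 1):
        new_ids = []
        for i in range(len(atoms)):
            env = sorted(ids[j] for j in neighbors[i])
            new_ids.append(_hash(layer, ids[i], *env))
        ids = new_ids
        features.update(ids)
    return features


def tanimoto(a, b):
    """Tanimoto similarity between two feature sets."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)
```

For example, `circular_fingerprint(["C", "C", "O"], [(0, 1), (1, 2)])` yields the feature set for a heavy-atom ethanol graph. Because neighbor identifiers are sorted before hashing, the result does not depend on the order in which atoms are listed.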

              A general method applicable to the search for similarities in the amino acid sequence of two proteins


                Author and article information

                Journal
                ACS Chem Biol
                cb
                acbcct
                ACS Chemical Biology
                American Chemical Society
                1554-8929
                1554-8937
                23 May 2023
                16 June 2023
                Volume: 18
                Issue: 6
                Pages: 1425-1434
                Affiliations
                Discovery Chemistry, Genentech Inc., 1 DNA Way, South San Francisco, California 94080, United States
                Peptide Therapeutics, Genentech Inc., 1 DNA Way, South San Francisco, California 94080, United States
                Biological Chemistry, Genentech Inc., 1 DNA Way, South San Francisco, California 94080, United States
                Author information
                https://orcid.org/0000-0002-8958-4502
                https://orcid.org/0000-0003-0434-9237
                https://orcid.org/0000-0003-3993-660X
                Article
                DOI: 10.1021/acschembio.3c00159
                PMCID: PMC10278063
                PMID: 37220419
                5fd904f3-48d0-4e20-af11-c8be9188b9db
                © 2023 The Authors. Published by American Chemical Society

                Permits non-commercial access and re-use, provided that author attribution and integrity are maintained, but does not permit creation of adaptations or other derivative works (https://creativecommons.org/licenses/by-nc-nd/4.0/).

                History
                Received: 16 March 2023
                Accepted: 10 May 2023
                Categories
                Articles
                Custom metadata
                cb3c00159

                Biochemistry
