      Is Open Access

      Pairwise sequence similarity mapping with PaSiMap: Reclassification of immunoglobulin domains from titin as case study

      research-article


          Highlights

          • A novel multidimensional scaling pipeline for sequence analysis.

          • A simple way to distinguish between unique and shared sequence features.

          • Titin domains were reclassified, improving upon earlier analysis.

          Abstract

          Sequence comparison is critical for the functional assignment of newly identified protein genes. As uncharacterized protein sequences accumulate, there is an increasing need for sensitive tools for their classification. Here, we present a novel multidimensional scaling pipeline, PaSiMap, which creates a map of pairwise sequence similarities. Uniquely, PaSiMap distinguishes between unique and shared features, allowing for a distinct view of protein-sequence relationships. We demonstrate PaSiMap’s efficiency in detecting sequence groups and outliers using titin’s 169 immunoglobulin (Ig) domains. We show that Ig domain similarity is hierarchical, being firstly determined by chain location, then by the loop features of the Ig fold and, finally, by super-repeat position. The existence of a previously unidentified domain repeat in the distal, constitutive I-band is revealed. Prototypic Igs, plus notable outliers, are identified and thereby domain classification improved. This re-classification can now guide future molecular research. In summary, we demonstrate that PaSiMap is a sensitive tool for the classification of protein sequences, which adds a new perspective in the understanding of inter-protein relationships. PaSiMap is applicable to any biological system defined by a linear sequence, including polynucleotide chains.
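The idea of turning pairwise sequence similarities into a map can be illustrated with a minimal sketch. This is not the PaSiMap pipeline itself: it swaps in textbook classical multidimensional scaling (eigendecomposition of the double-centred squared-distance matrix) and a simple percent-identity score. The toy sequences, the function names `identity_similarity` and `classical_mds`, and the conversion `distance = 1 - similarity` are all assumptions made for illustration.

```python
import numpy as np


def identity_similarity(a: str, b: str) -> float:
    """Fraction of identical positions in two equal-length, pre-aligned sequences.

    Hypothetical stand-in for a real pairwise alignment score.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def classical_mds(dist: np.ndarray, n_dims: int = 2) -> np.ndarray:
    """Embed points from a symmetric distance matrix using classical MDS."""
    n = dist.shape[0]
    # Double-centre the squared distances to obtain a Gram (inner-product) matrix.
    centering = np.eye(n) - np.ones((n, n)) / n
    gram = -0.5 * centering @ (dist ** 2) @ centering
    # eigh returns eigenvalues in ascending order; keep the largest n_dims.
    eigvals, eigvecs = np.linalg.eigh(gram)
    order = np.argsort(eigvals)[::-1][:n_dims]
    # Clip small negative eigenvalues that arise from non-Euclidean input.
    top_vals = np.clip(eigvals[order], 0.0, None)
    return eigvecs[:, order] * np.sqrt(top_vals)


# Toy, made-up sequences: two groups of two near-identical members each.
toy_seqs = {
    "A1": "MKTAYIAKQR",
    "A2": "MKTAYIAKQK",
    "B1": "GSHWLPEEDN",
    "B2": "GSHWLPEEDQ",
}
names = list(toy_seqs)

# Pairwise similarity matrix, converted to distances (assumed: 1 - similarity).
sim = np.array(
    [[identity_similarity(toy_seqs[p], toy_seqs[q]) for q in names] for p in names]
)
dist = 1.0 - sim

coords = classical_mds(dist, n_dims=2)
for name, (x, y) in zip(names, coords):
    print(f"{name}: {x:+.3f} {y:+.3f}")
```

In the resulting map, members of the same toy group land close together and the two groups separate along the first axis, which is the kind of grouping and outlier detection the abstract describes. A real analysis would replace the identity score with proper pairwise alignment scores.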


                Author and article information

                Journal
                Title: Computational and Structural Biotechnology Journal
                Abbreviated title: Comput Struct Biotechnol J
                Publisher: Research Network of Computational and Structural Biotechnology
                ISSN: 2001-0370
                Publication date: 26 September 2022
                Volume: 20
                Pages: 5409-5419
                Affiliations
                Department of Biology, Universität Konstanz, Konstanz, Baden-Württemberg 78456, Germany
                Article
                PII: S2001-0370(22)00439-1
                DOI: 10.1016/j.csbj.2022.09.034
                PMCID: PMC9529554
                PMID: 36212532
                ScienceOpen record ID: 322b7840-93b8-47fb-9986-1fdde9731ece
                © 2022 The Authors

                This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

                History
                24 May 2022; 22 September 2022; 22 September 2022
                Categories
                Research Article

                Keywords: titin, multidimensional scaling, sequence analysis
