      Is Open Access

      AllerCatPro—prediction of protein allergenicity potential from the protein sequence

      research-article


          Abstract

          Motivation

          Due to the risk of inducing an immediate Type I (IgE-mediated) allergic response, proteins intended for use in consumer products must be investigated for their allergenic potential before introduction into the marketplace. The FAO/WHO guidelines for computational assessment of allergenic potential of proteins based on short peptide hits and linear sequence window identity thresholds misclassify many proteins as allergens.

          Results

          We developed AllerCatPro, which predicts the allergenic potential of proteins from the similarity of both their 3D protein structure and their amino acid sequence to a data set of known protein allergens. The data set comprises 4180 unique allergenic protein sequences derived from the union of the major databases: Food Allergy Research and Resource Program, Comprehensive Protein Allergen Resource, WHO/International Union of Immunological Societies, UniProtKB and Allergome. We extended the hexamer hit rule by removing peptides with a high probability of random occurrence, as measured by sequence entropy, and by requiring three or more hexamer hits, consistent with natural linear epitope patterns in known allergens. This is complemented by gluten-like repeat pattern detection. We also switched from a linear sequence window similarity to a B-cell epitope-like 3D surface similarity window, which became possible through extensive 3D structure modeling covering the majority (74%) of allergens. If no structure similarity is found, the decision workflow reverts to the old linear sequence window rule. The overall accuracy of AllerCatPro is 84%, compared with 51 to 73% for other current methods. Both the FAO/WHO rules and AllerCatPro achieve the highest sensitivity, but AllerCatPro provides a 37-fold increase in specificity.
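
          The extended hexamer rule described above can be illustrated with a minimal sketch. This is not the AllerCatPro implementation: the abstract does not state how sequence entropy is computed or which cutoff is used, so the Shannon entropy of the residue composition and the `MIN_ENTROPY_BITS` value below are assumptions, and all function names and sequences are hypothetical toy examples. Only the hexamer length and the "3 or more hits" requirement come from the text. The linear epitope pattern check, the gluten-like repeat detection and the 3D surface window are not modeled.

```python
import math
from collections import Counter

K = 6                    # hexamer length (from the abstract)
MIN_HITS = 3             # "3 or more hexamer hits" (from the abstract)
MIN_ENTROPY_BITS = 1.5   # ASSUMPTION: illustrative cutoff, not the published value


def shannon_entropy(peptide: str) -> float:
    """Shannon entropy (bits) of the residue composition of a peptide.

    ASSUMPTION: used here as a stand-in for the paper's sequence entropy.
    """
    n = len(peptide)
    counts = Counter(peptide)
    return sum((c / n) * math.log2(n / c) for c in counts.values())


def hexamers(seq: str) -> set:
    """All overlapping hexamers of a sequence."""
    return {seq[i:i + K] for i in range(len(seq) - K + 1)}


def build_index(allergens, min_entropy: float = MIN_ENTROPY_BITS) -> set:
    """Collect allergen hexamers, dropping low-entropy (repetitive) peptides
    that are likely to match unrelated proteins by chance."""
    index = set()
    for seq in allergens:
        index |= {p for p in hexamers(seq) if shannon_entropy(p) >= min_entropy}
    return index


def hexamer_hits(query: str, index: set) -> list:
    """Positions and peptides in the query that exactly match an indexed hexamer."""
    return [(i, query[i:i + K])
            for i in range(len(query) - K + 1)
            if query[i:i + K] in index]


def is_flagged(query: str, index: set, min_hits: int = MIN_HITS) -> bool:
    """Flag a query only if it has at least `min_hits` hexamer hits."""
    return len(hexamer_hits(query, index)) >= min_hits


# Toy data, invented for illustration only (not real allergen sequences).
TOY_ALLERGENS = ["MKTAYIAKQRQISFVK", "GGGGGGGGGG"]
TOY_INDEX = build_index(TOY_ALLERGENS)
```

          With these toy inputs, a query that shares a stretch of eight residues with the first toy allergen produces three overlapping hexamer hits and is flagged, while a poly-glycine query is not, because the repetitive hexamer never enters the index. Setting `min_entropy=0.0` disables the filter and shows how a single low-complexity peptide alone could trigger a hit under the unfiltered rule.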

          Availability and implementation

          https://allercatpro.bii.a-star.edu.sg/

          Supplementary information

          Supplementary data are available at Bioinformatics online.


                Author and article information

                Contributors
                Role: Associate Editor
                Journal
                Bioinformatics
                Oxford University Press
                1367-4803
                1367-4811
                01 September 2019
                18 January 2019
                Volume: 35
                Issue: 17
                Pages: 3020-3027
                Affiliations
                [1] Biomolecular Function Discovery Division, Bioinformatics Institute, Agency for Science, Technology and Research, Singapore
                [2] Department of Biological Sciences, National University of Singapore, Singapore
                [3] The Procter & Gamble Services Company, Strombeek-Bever, Belgium
                [4] The Procter and Gamble Company, Mason, OH, USA
                Author notes
                To whom correspondence should be addressed. E-mail: sebastianms@bii.a-star.edu.sg
                Present address: GF3 Consultancy, West Chester, OH, USA
                Article
                Publisher ID: btz029
                DOI: 10.1093/bioinformatics/btz029
                PMC ID: 6736023
                PubMed ID: 30657872
                ScienceOpen ID: 4bde6641-631f-4a9f-b5e7-bcd5a7606730
                © The Author(s) 2019. Published by Oxford University Press.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 30 August 2018
                Revised: 18 December 2018
                Accepted: 14 January 2019
                Page count
                Pages: 8
                Funding
                Funded by: Agency of Science, Technology and Research
                Funded by: A*STAR 10.13039/501100001348
                Funded by: Procter & Gamble
                Categories
                Original Papers
                Structural Bioinformatics

                Bioinformatics & Computational biology