      PROTEOFORMER 2.0: Further Developments in the Ribosome Profiling-assisted Proteogenomic Hunt for New Proteoforms*

      research-article


          Abstract

          The PROTEOFORMER pipeline feeds ribosome profiling-driven information into an MS/MS search space. The pipeline has been greatly expanded and updated since its first publication. These novelties are presented and validated with matching MS/MS data, leading to the endorsement of a set of new proteoforms on MS/MS level and to a collection of general considerations for the ribosome profiling-based proteogenomics community.

          Highlights

          • PROTEOFORMER adds ribosome profiling information to MS/MS search spaces.

          • The PROTEOFORMER pipeline is greatly expanded and updated since its first publication.

          • New features are demonstrated with matching ribosome profiling and MS/MS data.

          • Experiments lead to MS/MS-proven proteoforms and general proteogenomic notices.

          Abstract

          PROTEOFORMER is a pipeline that enables the automated processing of data derived from ribosome profiling (RIBO-seq, i.e. the sequencing of ribosome-protected mRNA fragments). Genome-wide ribosome occupancies are used to delineate data-specific translation product candidates, which can in turn improve mass spectrometry-based identification. Since its first publication, upgrades, new features and extensions have been added to the PROTEOFORMER pipeline. Some of the most important upgrades include P-site offset calculation during mapping, comprehensive data pre-exploration, the introduction of two alternative proteoform calling strategies and extended pipeline output features. These novelties are illustrated by analyzing ribosome profiling data of human HCT116 and Jurkat cells. The different proteoform calling strategies are used alongside one another and their results are finally combined with reference sequences from UniProt. Matching mass spectrometry data are searched against this extended search space with MaxQuant. Overall, besides annotated proteoforms, this pipeline leads to the identification and validation of different categories of new proteoforms, including translation products of up- and downstream open reading frames, 5′ and 3′ extended and truncated proteoforms, single amino acid variants, splice variants and translation products of so-called noncoding regions. Further, proof-of-concept is reported for the improvement of spectrum matching by including Prosit, a deep neural network strategy that adds extra fragmentation spectrum intensity features to the analysis. In the context of ribosome profiling-driven proteogenomics, it is shown that this allows the spectrum matches of newly identified proteoforms to be validated with elevated stringency. These updates and novel conclusions provide new insights and lessons for the ribosome profiling-based proteogenomic research field.
          More practical information on the pipeline, raw code, the user manual (README) and explanations of the different modes of availability can be found at the GitHub repository of PROTEOFORMER: https://github.com/Biobix/proteoformer.
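
          The step of combining several proteoform call sets with a UniProt reference into one search space can be illustrated with a minimal sketch. This is not the PROTEOFORMER implementation: the function names, the source labels ("uniprot", "callerA", "callerB") and the `sources=` header suffix are assumptions made for illustration only. The sketch merges FASTA entries and keeps a single entry per distinct protein sequence, so that a search engine does not score the same sequence twice.

```python
from collections import OrderedDict


def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def combine_search_space(sources):
    """Merge named FASTA sources, keeping one entry per distinct sequence.

    `sources` is an ordered list of (source_name, fasta_text) pairs.
    The first source that contains a sequence supplies its header; later
    sources with the same sequence are recorded as supporting evidence.
    (Hypothetical helper, not part of any published tool.)
    """
    merged = OrderedDict()  # sequence -> (header, [source names])
    for name, text in sources:
        for header, seq in parse_fasta(text):
            if seq in merged:
                if name not in merged[seq][1]:
                    merged[seq][1].append(name)
            else:
                merged[seq] = (header, [name])
    return merged


def to_fasta(merged):
    """Serialize the merged search space, tagging each header with its sources."""
    lines = []
    for seq, (header, names) in merged.items():
        lines.append(f">{header} sources={'+'.join(names)}")
        lines.append(seq)
    return "\n".join(lines) + "\n"
```

          Listing the reference database first means that annotated proteoforms keep their reference headers, while sequences reported only by the RIBO-seq-derived call sets remain identifiable as candidate new proteoforms. Exact sequence identity is the simplest possible redundancy criterion; a real pipeline may also need to handle subsequences and variant annotation.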

                Author and article information

                Journal
                Molecular & Cellular Proteomics (Mol. Cell Proteomics; MCP)
                The American Society for Biochemistry and Molecular Biology
                1535-9476
                1535-9484
                9 August 2019
                30 April 2019
                30 April 2019
                Volume: 18
                Issue: 8 Suppl 1
                Pages: S126-S140
                Affiliations
                [1]‡BioBix, Lab of Bioinformatics and Computational Genomics, Department of Mathematical Modeling, Statistics and Bioinformatics, Faculty of Bioscience Engineering, Ghent University, Ghent, Belgium
                [2]§VIB-UGent Center for Medical Biotechnology, Ghent, Belgium
                [3]¶Chair of Proteomics and Bioanalytics, Technical University of Munich, Munich, Germany
                [4]‖SAP SE, Potsdam, Germany
                [5]**Department of Biochemistry and Microbiology, Faculty of Sciences, Ghent University, Ghent, Belgium
                Author notes
                ‡‡ To whom correspondence may be addressed. Tel.: +32 9 264 99 22; E-mail: Steven.Verbruggen@UGent.be.
                §§ To whom correspondence may be addressed. E-mail: Gerben.Menschaert@UGent.be.

                Author contributions: S.V. and G.M. designed the research; S.V., E.N., and G.M. implemented new features for the PROTEOFORMER pipeline; S.V. analyzed ribosome profiling and proteomics data; S.G. and M.W. calculated extra PSM features with Prosit; P.V.D. performed proteome analyses; S.V., S.G., M.W., P.V.D., and G.M. wrote the paper; G.M. supervised the research and P.V.D., W.V.C., and B.K. advised on research. All authors read and approved the final manuscript.

                Author information
                https://orcid.org/0000-0001-9441-9539
                https://orcid.org/0000-0001-5530-0674
                https://orcid.org/0000-0002-9094-1677
                https://orcid.org/0000-0002-9224-3258
                https://orcid.org/0000-0001-9090-027X
                https://orcid.org/0000-0002-7575-2085
                Article
                Manuscript ID: RA118.001218
                DOI: 10.1074/mcp.RA118.001218
                PMCID: PMC6692777
                PMID: 31040227
                © 2019 Verbruggen et al.

                Published by The American Society for Biochemistry and Molecular Biology, Inc.

                Author's Choice—Final version open access under the terms of the Creative Commons CC-BY license.

                History
                : 16 November 2018
                : 30 April 2019
                Funding
                Funded by: Bijzonder Onderzoeksfonds (BOF), https://dx.doi.org/10.13039/501100007229; Award ID: 01D20615
                Funded by: Fonds Wetenschappelijk Onderzoek (FWO), https://dx.doi.org/10.13039/501100003130; Award ID: 12A7813N
                Funded by: EC | Horizon 2020 Framework Programme (H2020), https://dx.doi.org/10.13039/100010661; Award ID: 803972
                Categories
                Research

                Molecular biology
                proteogenomics, ribosomes*, tandem mass spectrometry, quality control and metrics, chromatography, mqc, prosit, proteoform, proteoformer, ribosome profiling
