      AUGUSTUS: ab initio prediction of alternative transcripts

      research-article


          Abstract

          AUGUSTUS is a software tool for gene prediction in eukaryotes based on a Generalized Hidden Markov Model, a probabilistic model of a sequence and its gene structure. Like most existing gene finders, the first version of AUGUSTUS returned one transcript per predicted gene and ignored the phenomenon of alternative splicing. Herein, we present a WWW server for an extended version of AUGUSTUS that is able to predict multiple splice variants. To our knowledge, this is the first ab initio gene finder that can predict multiple transcripts. In addition, we offer a motif searching facility, where user-defined regular expressions can be searched against putative proteins encoded by the predicted genes. The AUGUSTUS web interface and the downloadable open-source stand-alone program are freely available from http://augustus.gobics.de.
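The motif search described in the abstract can be pictured with a minimal sketch: a user-supplied regular expression is scanned against each putative protein sequence. This is an illustration only, not the AUGUSTUS server's actual implementation; the function name `find_motifs`, the transcript identifiers, the sequences, and the example pattern are all hypothetical.

```python
import re


def find_motifs(proteins, pattern):
    """Return (protein_id, start, end, matched_text) for each regex match.

    `proteins` maps a hypothetical transcript ID to its amino acid sequence.
    Positions are 0-based, with `end` exclusive (Python slice convention).
    """
    regex = re.compile(pattern)
    hits = []
    for protein_id, sequence in proteins.items():
        # finditer reports non-overlapping matches, left to right
        for match in regex.finditer(sequence):
            hits.append((protein_id, match.start(), match.end(), match.group()))
    return hits


# Made-up protein sequences for two splice variants of one predicted gene
proteins = {
    "g1.t1": "MKTAYNGSLLNATCPEK",
    "g1.t2": "MKTAYQGPLL",
}

# Example user-defined pattern: N, then any residue except P,
# then S or T, then any residue except P
pattern = r"N[^P][ST][^P]"

for hit in find_motifs(proteins, pattern):
    print(hit)
```

Because alternative transcripts of the same gene can encode different proteins, a motif may appear in one splice variant and be absent from another, as with the two made-up sequences here.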


          Most cited references (19)


          Gene prediction in eukaryotes with a generalized hidden Markov model that uses hints from external sources

          Background: In order to improve gene prediction, extrinsic evidence on the gene structure can be collected from various sources of information such as genome-genome comparisons and EST and protein alignments. However, such evidence is often incomplete and usually uncertain. The extrinsic evidence is usually not sufficient to recover the complete structure of all genes, and the available evidence is often unreliable. Therefore, extrinsic evidence is most valuable when it is balanced with sequence-intrinsic evidence. Results: We present a fairly general method for integration of external information. Our method is based on the evaluation of hints to potentially protein-coding regions by means of a Generalized Hidden Markov Model (GHMM) that takes both intrinsic and extrinsic information into account. We used this method to extend the ab initio gene prediction program AUGUSTUS to a versatile tool that we call AUGUSTUS+. In this study, we focus on hints derived from matches to an EST or protein database, but our approach can be used to include arbitrary user-defined hints. Our method is only moderately affected by the length of a database match. Further, it exploits the information that can be derived from the absence of such matches. As a special case, AUGUSTUS+ can predict genes under user-defined constraints, e.g. if the positions of certain exons are known. With hints from EST and protein databases, our new approach was able to predict 89% of the exons in human chromosome 22 correctly. Conclusion: Sensitive probabilistic modeling of extrinsic evidence such as sequence database matches can increase gene prediction accuracy. When a match of a sequence interval to an EST or protein sequence is used, it should be treated as compound information rather than as information about individual positions.

            A genomic view of alternative splicing.

            Recent genome-wide analyses of alternative splicing indicate that 40-60% of human genes have alternative splice forms, suggesting that alternative splicing is one of the most significant components of the functional complexity of the human genome. Here we review these recent results from bioinformatics studies, assess their reliability and consider the impact of alternative splicing on biological functions. Although the 'big picture' of alternative splicing that is emerging from genomics is exciting, there are many challenges. High-throughput experimental verification of alternative splice forms, functional characterization, and regulation of alternative splicing are key directions for research. We recommend a community-based effort to discover and characterize alternative splice forms comprehensively throughout the human genome.

              Integrating genomic homology into gene structure prediction.

              TWINSCAN is a new gene-structure prediction system that directly extends the probability model of GENSCAN, allowing it to exploit homology between two related genomes. Separate probability models are used for conservation in exons, introns, splice sites, and UTRs, reflecting the differences among their patterns of evolutionary conservation. TWINSCAN is specifically designed for the analysis of high-throughput genomic sequences containing an unknown number of genes. In experiments on high-throughput mouse sequences, using homologous sequences from the human genome, TWINSCAN shows notable improvement over GENSCAN in exon sensitivity and specificity and dramatic improvement in exact gene sensitivity and specificity. This improvement can be attributed entirely to modeling the patterns of evolutionary conservation in genomic sequence.

                Author and article information

                Journal
                Nucleic Acids Res
                Nucleic Acids Research
                Oxford University Press
                0305-1048
                1362-4962
                01 July 2006
                01 July 2006
                14 July 2006
                Volume: 34
                Issue: Web Server issue
                Pages: W435-W439
                Affiliations
                Institut für Mikrobiologie und Genetik, Abteilung Bioinformatik, Goldschmidtstrasse 1, 37077 Göttingen, Germany
                1 Institut für Informatik, Lotzestrasse 16-18, 37083 Göttingen, Germany
                2 Philip Morris USA, Research Center, Richmond, VA 23261, USA
                Author notes
                *To whom correspondence should be addressed. Tel: +1 831 459 5232; Fax: +1 832 459 1809; Email: mstanke@gwdg.de
                Article
                10.1093/nar/gkl200
                1538822
                16845043
                © 2006 The Author(s)

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                : 14 February 2006
                : 21 March 2006
                : 21 March 2006
                Categories
                Article

                Genetics