
      Equivalent Indels – Ambiguous Functional Classes and Redundancy in Databases

      research-article


          Abstract

          There is considerable interest in studying sequenced variations. However, while the positions of substitutions are uniquely identifiable by sequence alignment, the location of insertions and deletions still poses problems. Each insertion and deletion causes a change of sequence. Yet, due to low complexity or repetitive sequence structures, the same indel can sometimes be annotated in different ways. Two indels which differ in allele sequence and position can be one and the same, i.e. the alternative sequence of the whole chromosome is identical in both cases and, therefore, the two indels are biologically equivalent. In such a case, it is impossible to identify the exact position of an indel based on sequence alignment alone. Thus, variation entries in a mutation database are not necessarily uniquely defined. We prove the existence of a contiguous region around an indel in which all deletions of the same length are biologically identical. Databases often show only one of several possible locations for a given variation. Furthermore, different database entries can represent equivalent variation events. We identified 1,045,590 such problematic entries of insertions and deletions out of 5,860,408 indel entries in the current human database of Ensembl. Equivalent indels are found in sequence regions of different functions, such as exons, introns, or 5' and 3' UTRs. One and the same variation can therefore be assigned to several different functional classifications, of which only one is correct. We implemented an algorithm that determines, for each indel database entry, its complete set of equivalent indels, which is uniquely characterized by the indel itself and a given interval of the reference sequence.
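
          The contiguous region of equivalent deletions described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The function names `equivalent_deletions` and `apply_deletion`, the 0-based coordinates, and the toy reference sequence are assumptions made for illustration. The sketch shifts a deletion one base at a time to the left and to the right for as long as the resulting alternative sequence stays unchanged.

```python
def apply_deletion(ref: str, start: int, length: int) -> str:
    """Return the alternative sequence after deleting ref[start:start+length]."""
    return ref[:start] + ref[start + length:]


def equivalent_deletions(ref: str, start: int, length: int) -> range:
    """Return all 0-based start positions whose deletion of `length` bases
    yields the same alternative sequence as the deletion at `start`.

    Hypothetical helper: shifting by one base keeps the result identical
    exactly when the base entering the deleted window equals the base
    leaving it, so the equivalent starts form one contiguous interval.
    """
    if length <= 0 or start < 0 or start + length > len(ref):
        raise ValueError("deletion must lie inside the reference sequence")

    # Shift left while the base before the window equals the window's last base.
    lo = start
    while lo > 0 and ref[lo - 1] == ref[lo + length - 1]:
        lo -= 1

    # Shift right while the window's first base equals the base after the window.
    hi = start
    while hi + length < len(ref) and ref[hi] == ref[hi + length]:
        hi += 1

    return range(lo, hi + 1)


if __name__ == "__main__":
    ref = "GATCACACAG"  # toy sequence with a CA repeat (assumed example)
    starts = equivalent_deletions(ref, 3, 2)
    for s in starts:
        print(s, ref[s:s + 2], apply_deletion(ref, s, 2))
```

          In this toy example the deleted allele reads "CA" at some positions and "AC" at others, yet every position in the interval produces the same alternative sequence. If such an interval crosses an annotation boundary, for instance between an exon and an intron, the same event could be assigned to different functional classes depending on which position a database reports.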


                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS ONE
                Public Library of Science (San Francisco, USA)
                ISSN: 1932-6203
                Published: 2 May 2013
                Volume: 8
                Issue: 5
                Article: e62803
                Affiliations
                [1] Breeding Biology and Molecular Genetics, Humboldt-Universität zu Berlin, Berlin, Germany
                [2] Institut für Molekularbiologie und Bioinformatik, Charité Berlin, Berlin, Germany
                Université de Nantes, France
                Author notes

                Competing Interests: The authors have declared that no competing interests exist.

                Conceived and designed the experiments: JA JK. Performed the experiments: JA AS. Analyzed the data: JA AS. Contributed reagents/materials/analysis tools: JA JK GB. Wrote the paper: JA JK GB.

                [¤] Current address: Faculty of Science and Technology, Free University of Bozen, Bolzano, Italy

                Article
                Manuscript: PONE-D-12-27349
                DOI: 10.1371/journal.pone.0062803
                PMCID: 3642179
                PMID: 23658777
                4dc58118-bfa1-435b-86e8-ae45f3c8c3cc
                Copyright © 2013

                This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 10 September 2012
                Accepted: 26 March 2013
                Page count
                Pages: 9
                Funding
                This study was supported by the German Research Foundation (DFG) through the Collaborative Research Centre 852 (grant no. SFB852/1) (http://www.sfb852.de) (http://www.dfg.de). The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Biology
                Computational Biology
                Genomics
                Genome Analysis Tools
                Genome Databases
                Genetics
                Genetic Mutation
                Mutation Types
                Population Genetics
                Mutation
                Genomics
                Genome Analysis Tools
                Sequence Assembly Tools
                Genome Databases
                Mutation Databases
                Genome Sequencing
                Population Biology
                Population Genetics
                Mutation
                Theoretical Biology
                Computer Science
                Algorithms
                Information Technology
                Databases
                Mathematics
                Applied Mathematics
                Algorithms
