      Is Open Access

      Genome-wide prediction of transcription factor binding sites using an integrated model

      Genome Biology
      BioMed Central


          Abstract

          A new approach for genome-wide transcription factor binding site prediction is presented that integrates sequence and chromatin modification data.

          Abstract

          We present an integrated method called Chromia for the genome-wide identification of functional target loci of transcription factors. Designed to capture the characteristic patterns of transcription factor binding motif occurrences and the histone profiles associated with regulatory elements such as promoters and enhancers, Chromia significantly outperforms other methods in the identification of 13 transcription factor binding sites in mouse embryonic stem cells, evaluated by both binding (ChIP-seq) and functional (RNA interference knockdown) experiments.
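The idea of combining motif occurrences with chromatin signal can be illustrated with a toy example. The sketch below is not Chromia's model, which the abstract does not specify; it assumes a simple additive score (position weight matrix log-odds plus a weighted per-base histone signal), and the motif, the `weight` parameter, the function names, and all numeric values are hypothetical choices made for illustration.

```python
import math

# Hypothetical 4-bp motif (ACGT) given as per-position base probabilities.
# Values are illustrative only, not taken from the article.
MOTIF = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7},
]
BACKGROUND = 0.25  # assumed uniform background base frequency


def pwm_log_odds(site):
    """Log-odds score of one motif-length site against the background."""
    return sum(math.log2(col[base] / BACKGROUND) for col, base in zip(MOTIF, site))


def motif_scores(seq):
    """Score every motif-length window of the sequence."""
    w = len(MOTIF)
    return [pwm_log_odds(seq[i:i + w]) for i in range(len(seq) - w + 1)]


def integrated_scores(seq, histone_signal, weight=1.0):
    """Assumed additive combination: motif score plus the weighted
    chromatin signal at the window start position."""
    return [m + weight * histone_signal[i] for i, m in enumerate(motif_scores(seq))]


def best_site(seq, histone_signal, weight=1.0):
    """Return (start position, score) of the highest-scoring window."""
    scores = integrated_scores(seq, histone_signal, weight)
    i = max(range(len(scores)), key=scores.__getitem__)
    return i, scores[i]


# Two identical motif matches; only the second lies in marked chromatin.
seq = "ACGTTTTTACGT"
histone = [0.0] * 6 + [2.0] * 6  # made-up per-base histone enrichment
motif_only = best_site(seq, histone, weight=0.0)
combined = best_site(seq, histone, weight=1.0)
```

With sequence alone the two matches tie and the first is returned; adding the chromatin term breaks the tie in favor of the site inside the marked region, which is the kind of disambiguation an integrated approach aims for.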

                Author and article information

                Journal
                Title: Genome Biology (Genome Biol)
                Publisher: BioMed Central
                ISSN: 1465-6906, 1465-6914
                Publication date: 22 January 2010
                Volume: 11
                Issue: 1
                Article number: R7
                Affiliations
                [1] University of California, San Diego, Department of Chemistry and Biochemistry, 9500 Gilman Drive, La Jolla, CA 92093, USA
                [2] Ludwig Institute for Cancer Research and Department of Cellular and Molecular Medicine, UCSD School of Medicine, 9500 Gilman Drive, La Jolla, CA 92093, USA
                Article
                Publisher ID: gb-2010-11-1-r7
                DOI: 10.1186/gb-2010-11-1-r7
                PMC ID: 2847719
                PubMed ID: 20096096
                Copyright © 2010 Won et al.; licensee BioMed Central Ltd.

                This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                18 June 2009
                30 October 2009
                22 January 2010
                Categories
                Method

                Genetics