Abstract

We performed a systematic analysis of gene upstream regions in the yeast genome for occurrences of regular expression-type patterns, with the goal of identifying potential regulatory elements. To achieve this goal, we developed a new sequence pattern discovery algorithm that searches exhaustively for a priori unknown regular expression-type patterns that are over-represented in a given set of sequences. We applied the algorithm in two cases: (1) discovery of patterns in the complete set of >6000 sequences taken upstream of the putative yeast genes, and (2) discovery of patterns in the regions upstream of genes with similar expression profiles. In the first case, we looked for patterns that occur more frequently in the gene upstream regions than in the genome overall. In the second case, we first clustered the upstream regions of all the genes by similarity of their expression profiles, on the basis of publicly available gene expression data, and then looked for sequence patterns that are over-represented in each cluster. In both cases we considered each pattern that occurred in at least some minimum number of sequences and rated it on the basis of its over-representation. Among the highest-rated patterns, most have matches to substrings in known yeast transcription factor-binding sites. Moreover, several of them are known to be relevant to the expression of the genes from the respective clusters. Experiments on simulated data show that the majority of the discovered patterns are not expected to occur by chance.
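
The rating step described in the abstract (keep patterns that reach a minimum number of sequences, then rank them by over-representation) can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the exhaustive pattern search is not reproduced, and the function names, the `min_support` parameter, and the choice of a frequency ratio plus a binomial tail probability as the rating are all assumptions made for illustration.

```python
import re
from math import comb


def count_matching(pattern, sequences):
    """Number of sequences that contain at least one match of the pattern."""
    rx = re.compile(pattern)
    return sum(1 for seq in sequences if rx.search(seq))


def binomial_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def rate_pattern(pattern, cluster, background, min_support=2):
    """Rate one candidate pattern by over-representation (hypothetical scheme).

    cluster     -- upstream sequences of the genes of interest
    background  -- reference set, e.g. all upstream sequences
    min_support -- minimum number of cluster sequences the pattern must hit
    Returns None when the pattern is too rare in the cluster.
    """
    hits = count_matching(pattern, cluster)
    if hits < min_support:
        return None
    p_bg = count_matching(pattern, background) / len(background)
    cluster_fraction = hits / len(cluster)
    ratio = cluster_fraction / p_bg if p_bg > 0 else float("inf")
    return {
        "pattern": pattern,
        "hits": hits,
        "ratio": ratio,
        # Chance of seeing at least `hits` matches if the cluster behaved
        # like the background (assumed binomial model).
        "p_value": binomial_tail(hits, len(cluster), p_bg),
    }
```

Candidate patterns would be rated one by one and sorted by `p_value` or `ratio`. Python's `re` syntax stands in here for the regular expression-type patterns; the pattern class actually searched by the algorithm may be narrower.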

Most cited references 26

TRANSFAC: a database on transcription factors and their DNA binding sites.

E. Wingender, P. Dietze, H. Karas … (1996)

TRANSFAC is a database about eukaryotic transcription regulating DNA sequence elements and the transcription factors binding to and acting through them. This report summarizes the present status of this database and accompanying retrieval tools.

MatInd and MatInspector: new fast and versatile tools for detection of consensus matches in nucleotide sequence data.

Kerstin Quandt, Kornelie Frech, Holger Karas … (1995)

The identification of potential regulatory motifs in new sequence data is increasingly important for experimental design. Those motifs are commonly located by matches to IUPAC strings derived from consensus sequences. Although this method is simple and widely used, a major drawback of IUPAC strings is that they necessarily remove much of the information originally present in the set of sequences. Nucleotide distribution matrices retain most of the information and are thus better suited to evaluate new potential sites. However, sufficiently large libraries of pre-compiled matrices are a prerequisite for practical application of any matrix-based approach and are just beginning to emerge. Here we present a set of tools for molecular biologists that allows generation of new matrices and detection of potential sequence matches by automatic searches with a library of pre-compiled matrices. We also supply a large library (> 200) of transcription factor binding site matrices that has been compiled on the basis of published matrices as well as entries from the TRANSFAC database, with emphasis on sequences with experimentally verified binding capacity. Our search method includes position weighting of the matrices based on the information content of individual positions and calculates a relative matrix similarity. We show several examples suggesting that this matrix similarity is useful in estimating the functional potential of matrix matches and thus provides a valuable basis for designing appropriate experiments.
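
The idea of weighting matrix positions by their information content can be sketched as follows. This is a simplified illustration, not the published MatInspector formula: it assumes the standard Shannon information content in bits (0 for a uniform position, 2 for a fully conserved one), and the function names and the dictionary-based matrix layout are hypothetical.

```python
from math import log2


def information_content(counts):
    """Shannon information content (bits) of one matrix position.

    counts maps each nucleotide to its observed count at that position.
    """
    total = sum(counts.values())
    ic = 2.0  # log2(4): maximum for a four-letter alphabet
    for c in counts.values():
        if c > 0:
            p = c / total
            ic += p * log2(p)
    return ic


def weighted_similarity(matrix, site):
    """Score a candidate site against a count matrix, in [0, 1].

    Each position contributes its count for the observed base, weighted by
    that position's information content, and the sum is normalised by the
    best achievable weighted score (assumed normalisation).
    """
    if len(matrix) != len(site):
        raise ValueError("site and matrix must have the same length")
    num = den = 0.0
    for counts, base in zip(matrix, site):
        w = information_content(counts)
        num += w * counts.get(base, 0)
        den += w * max(counts.values())
    return num / den if den else 0.0
```

With this weighting, a mismatch at a highly conserved position lowers the score far more than a mismatch at a variable one, which is the behaviour the abstract attributes to position weighting.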

Exploring the Metabolic and Genetic Control of Gene Expression on a Genomic Scale

J. L. DeRisi (1997)

Author and article information

PubMed ID: 9847082

PMC ID: 310790

DOI: 10.1101/gr.8.11.1202

ScienceOpen disciplines: Chemistry

Keywords: Algorithms; Gene Expression; Genes, Fungal; genetics; Genome, Fungal; Regulatory Sequences, Nucleic Acid; Saccharomyces cerevisiae

Predicting gene regulatory elements in silico on a genomic scale.