Rearrangements of about 2.5 kilobases of regulatory DNA located 5′ of the transcription start site of the Drosophila even-skipped locus generate large-scale changes in the expression of even-skipped stripes 2, 3, and 7. The most radical effects are generated by juxtaposing the minimal stripe enhancers MSE2 and MSE3 for stripes 2 and 3 with and without small “spacer” segments less than 360 bp in length. We placed these fusion constructs in a targeted transformation site and obtained quantitative expression data for these transformants together with their controlling transcription factors at cellular resolution. These data demonstrated that the rearrangements can alter expression levels in stripe 2 and the 2–3 interstripe by a factor of more than 10. We reasoned that this behavior would place tight constraints on possible rules of genomic cis-regulatory logic. To find these constraints, we confronted a computational model with our new expression data together with previously obtained data on other constructs. The model contained representations of thermodynamic protein–DNA interactions including steric interference and cooperative binding, short-range repression, direct repression, activation, and coactivation. The model was highly constrained by the training data, which it described within the limits of experimental error. The model, so constrained, was able to correctly predict expression patterns driven by enhancers for other Drosophila genes; even-skipped enhancers not included in the training set; stripe 2, 3, and 7 enhancers from various Drosophilid and Sepsid species; and long segments of even-skipped regulatory DNA that contain multiple enhancers. The model further showed that elevated expression driven by a fusion of MSE2 and MSE3 was a consequence of the recruitment of a portion of MSE3 to become a functional component of MSE2, indicating that cis-regulatory “elements” are not elementary objects.
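To make the thermodynamic ingredients named above concrete, the following is a minimal illustrative sketch and not the model used in this work. It assumes only two binding sites, and the function name `site_occupancy`, the binding weights `q_a` and `q_b` (affinity times factor concentration), and the interaction term `omega` are hypothetical. Setting `omega` to 0 represents steric interference between overlapping sites, 1 represents independent binding, and values above 1 represent cooperative binding.

```python
def site_occupancy(q_a, q_b, omega=1.0):
    """Fractional occupancy of two binding sites at equilibrium.

    q_a, q_b: statistical weights of each singly bound state
              (assumed here to be affinity * concentration).
    omega:    interaction weight for the doubly bound state
              (0 = steric exclusion, 1 = independent, >1 = cooperative).
    """
    # Partition function over the four states:
    # empty, A bound, B bound, both bound.
    z = 1.0 + q_a + q_b + omega * q_a * q_b
    both = omega * q_a * q_b
    f_a = (q_a + both) / z  # probability that site A is occupied
    f_b = (q_b + both) / z  # probability that site B is occupied
    return f_a, f_b


if __name__ == "__main__":
    for label, omega in [("steric", 0.0), ("independent", 1.0), ("cooperative", 10.0)]:
        f_a, f_b = site_occupancy(1.0, 1.0, omega)
        print(f"{label:12s} f_a={f_a:.3f} f_b={f_b:.3f}")
```

In this toy setting, the occupancy of one site depends on what is bound nearby, which is the kind of context dependence that lets sequence rearrangements change the output of an enhancer.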
Metazoan genes, including those of humans, contain large noncoding regions that are required for viability. Sequence variations in these regions are statistically associated with human disease, but the mechanisms underlying these associations are not well understood. These regions regulate transcription and are frequently larger than the gene's transcript by an order of magnitude. In this paper we attempt to elucidate the regulatory code of these noncoding segments of DNA by means of quantitative spatially resolved gene expression data and a computational model. The expression data come from the early embryo of the fruit fly Drosophila melanogaster. We chose to analyze a family of DNA constructs that drive very different patterns of expression when very small changes are made to the DNA sequence, reasoning that this sensitivity would reveal important properties of the regulatory code. The model reproduced the training data with precision greater than the expected accuracy of the training data itself. It was able to correctly predict from DNA sequence the expression of 44 segments of DNA from many genes and species.