      Thematic Minireview Series on Results from the ENCODE Project: Integrative Global Analyses of Regulatory Regions in the Human Genome

      review-article


          Abstract

          Introduction The Encyclopedia of DNA Elements (ENCODE) Project (http://www.genome.gov/10005107) is an international collaboration of research groups funded by the National Human Genome Research Institute, with the goal of delineating all functional elements encoded in the human genome (1). This project began in 2003 with a targeted analysis of a selected 1% of the human genome. The results from the pilot project were published in 2007 (2), and a second phase of funding was then provided to scale the project to the entire human genome. Genome-scale projects in ENCODE involve the identification and quantification of RNA species in whole cells and subcellular compartments, mapping of noncoding and protein-coding genes by manual review and experimental methods, delineation of chromatin and DNA accessibility, mapping of histone modifications and transcription factor-binding sites by ChIP, and measurement of DNA methylation. More recently, ENCODE has adopted additional approaches that have not yet resulted in extensive data sets, including the examination of long-range chromatin interactions, analysis of RNA-binding proteins, and validation of transcriptional enhancers and silencers. To date, >2000 data sets have been deposited for public use by the ENCODE Project at the University of California Santa Cruz (UCSC) Genome Browser (3); to encourage public use of the data sets, a “user's guide” to the ENCODE data sets has been published (4). As the second phase of the ENCODE Project nears completion, the ENCODE Consortium has prepared a large integrative manuscript that includes analyses of experiments from 147 cell types and provides a summary of their functional annotation of the human genome (5). Additionally, other more narrowly focused studies on subsets of ENCODE data have been or will soon be published; for a list of ENCODE publications, see the ENCODE tab at the UCSC Genome Bioinformatics site. 
Many new insights concerning the organization and function of genomic elements have come from the ENCODE Project, including the findings that most transcription factors have many thousands of binding sites in the human genome and that these binding sites are distributed non-randomly, with only approximately one-third being located near a transcription start site (5). Many of these distally located regions of transcription factor-binding sites are thought to be transcriptional enhancers. Because enhancers are far from genes, can work in either orientation, and can sometimes skip over the nearest gene, they have historically been difficult to characterize. However, the study of enhancers has gained enormous momentum from high throughput methods such as ChIP-seq and comprehensive analyses from genomic projects such as ENCODE and the Roadmap Epigenomics Project. Based on current estimates of up to 50,000 enhancers in any given cell type and the fact that enhancers tend to be cell type-specific, it has been estimated that there are perhaps 10^5–10^6 enhancers in the human genome. Studies indicate that the majority of enhancers are composed of transcription factor-binding sites residing within nucleosome-free regions flanked by specific patterns of histone modifications (Fig. 1). The minireview entitled “Chromatin Fingerprint of Gene Enhancer Elements” by Gabriel E. Zentner and Peter C. Scacheri reviews the types of variant and modified histones and histone-modifying complexes found at enhancers and describes subclasses of active and poised enhancers.
FIGURE 1. Genome-wide characterizations of regulatory regions. Recent genome-wide ChIP-seq studies have revealed 100,000–200,000 regions of open chromatin per cell type, tens of thousands of which are marked by specifically modified histones (e.g. H3K4me1 and H3K27ac) and bound by many different site-specific transcription factors (TFs; see “Chromatin Fingerprint of Gene Enhancer Elements” by Gabriel E. Zentner and Peter C. Scacheri). Creation of these open chromatin regions requires the actions of histone-modifying complexes (HMC), chromatin-remodeling complexes (CRC), and boundary proteins such as CTCF (see “SWI/SNF Chromatin-remodeling Factors: Multiscale Analyses and Diverse Functions” by Ghia Euskirchen, Raymond K. Auerbach, and Michael Snyder and “Genome-wide Studies of CCCTC-binding Factor (CTCF) and Cohesin Provide Insight into Chromatin Structure and Regulation” by Bum-Kyu Lee and Vishwanath R. Iyer). Distal regulatory regions are thought to function by interaction of bound site-specific factors with other transcription factors bound to core promoters via looping of the intervening DNA. For a review of recent experimental and computational methods used to identify intra- and interchromosomal interactions, see “Uncovering Transcription Factor Modules Using One-dimensional and Three-dimensional Analyses” by Xun Lan, Peggy J. Farnham, and Victor X. Jin. Finally, it is becoming increasingly clear that distal regions are critically important in specifying cell fate via regulation of specific cohorts of genes (see “Transcription Factor-mediated Epigenetic Reprogramming” by Camille Sindhu, Payman Samavarchi-Tehrani, and Alexander Meissner) and that SNPs located within distal regulatory regions contribute to the development of many human diseases (see “Genome-wide Epigenetic Data Facilitate Understanding of Disease Susceptibility Association Studies” by Ross C. Hardison). RNAPII, RNA polymerase II.
An emerging concept is that transcription factors bound to distal enhancer elements regulate genes by looping out the intervening DNA and interacting with other factors bound at promoter regions that can be tens to hundreds of kilobases away. 
It is becoming clear that not only are chromatin-remodeling complexes required to achieve and maintain the nucleosome-free regions of enhancers that are bound by the site-specific factors but that they are also involved in the formation of chromosomal loops. One such chromatin-remodeling complex is SWI/SNF, a DNA-dependent ATPase. Human SWI/SNF complexes contain 10–12 subunits, many of which have alternative forms encoded by different members of gene families, resulting in many different possible SWI/SNF complexes. Components of SWI/SNF have specific protein domains that can recognize the acetylated or methylated histones that are found at enhancer regions, thus providing anchoring to nucleosomes. SWI/SNF can also interact with a variety of site-specific DNA-binding transcription factors. The ability of the SWI/SNF complex to interact with both DNA-bound factors and nucleosomes may contribute to its ability to form or stabilize chromosomal loops. As described in the minireview by Ghia Euskirchen, Raymond K. Auerbach, and Michael Snyder entitled “SWI/SNF Chromatin-remodeling Factors: Multiscale Analyses and Diverse Functions,” changes in abundance, structure, or activity of different components can alter the function of SWI/SNF in different types of normal or diseased cells. In a recent genome-wide study of SWI/SNF components, it was found that many of the binding sites are also bound by the CCCTC-binding factor (CTCF) (6). CTCF is a sequence-specific transcription factor that is thought to serve as a chromatin organizer by acting as a barrier to spreading of epigenomic marks and by associating with the nuclear matrix to form distinct chromatin domains. It can also co-localize with both enhancers and repressors and can prevent communication between distal regions and promoters. Accumulating evidence suggests that CTCF mediates many of its functions by regulating DNA looping. CTCF genomic binding patterns have recently been defined in many different cell types. 
Most CTCF-binding sites (perhaps those involved in setting up generalized chromosomal domains) are invariant across many cell types. However, some CTCF sites do show cell-type specificity (perhaps those most involved in gene regulation). At a subset of sites, CTCF co-localizes with cohesin, a protein involved in keeping sister chromatids together until the anaphase stage of mitosis. The intriguing idea that cohesin may also be involved in the CTCF-mediated looping of enhancer regions by keeping the two ends of distally located DNA regions together is discussed in the minireview by Bum-Kyu Lee and Vishwanath R. Iyer entitled “Genome-wide Studies of CCCTC-binding Factor (CTCF) and Cohesin Provide Insight into Chromatin Structure and Regulation.” Methods to identify the sites of intra- and interchromosomal interactions mediated by transcription factors interacting with other transcription factors or chromatin-modifying complexes are rapidly evolving. Current methods used to detect chromosomal loops are described in the minireview by Xun Lan, Peggy J. Farnham, and Victor X. Jin entitled “Uncovering Transcription Factor Modules Using One-dimensional and Three-dimensional Analyses.” Such methods include protein-directed analyses such as ChIA-PET (chromatin interaction analysis with paired-end tag sequencing), which is similar to ChIP-seq except that distal regions brought into close proximity by the factor under analysis are ligated prior to the immunoprecipitation step. Other methods such as 3C (chromosome conformation capture), 4C (circularized chromosome conformation capture), and 5C (carbon-copy chromosome conformation capture) can also detect pairs of genomic loci that are far apart on the genome but close in three-dimensional space. One of the newer methods, Hi-C, provides an unbiased identification of chromosomal interactions across the genome but does not provide information as to which factors mediate the looping. 
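As a rough illustration of how genome-wide interaction data of the Hi-C type are often summarized, the sketch below bins pairs of ligated loci into a symmetric contact matrix and lists bin pairs that are far apart on the linear genome yet share contacts. The function names, the bin size, and the read pairs are all hypothetical; this is a minimal sketch of one common representation, not the procedure described in the minireview.

```python
def contact_matrix(pairs, chrom_length, bin_size):
    """Count ligated locus pairs per (bin, bin) cell of a symmetric matrix.

    pairs: iterable of (pos1, pos2) coordinates on a single chromosome.
    """
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos1, pos2 in pairs:
        i, j = pos1 // bin_size, pos2 // bin_size
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1  # keep the matrix symmetric
    return matrix


def long_range_contacts(matrix, bin_size, min_distance):
    """Return (bin_i, bin_j, count) for contacts spanning >= min_distance bases."""
    hits = []
    n = len(matrix)
    for i in range(n):
        for j in range(i + 1, n):
            if matrix[i][j] > 0 and (j - i) * bin_size >= min_distance:
                hits.append((i, j, matrix[i][j]))
    return hits


# Hypothetical read pairs on a toy 100-base "chromosome".
toy_pairs = [(5, 95), (8, 92), (10, 20)]
toy_matrix = contact_matrix(toy_pairs, chrom_length=100, bin_size=10)
print(long_range_contacts(toy_matrix, bin_size=10, min_distance=50))
```

Such a matrix records which loci are in proximity but, as noted above for Hi-C, carries no information about which factors mediate the contact.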
At the present time, both the experimental and analytical steps of these chromosomal interaction methods are difficult and as such are being performed only in a few laboratories. However, it is anticipated that improvements in the protocols and analysis programs will eventually move methods such as ChIA-PET and Hi-C into the toolkit of more research groups. Understanding how genomic sequence elements regulate normal development and differentiation and how variants in the genome contribute to human diseases are the leading challenges of 21st century medicine. The minireview entitled “Transcription Factor-mediated Epigenetic Reprogramming” by Camille Sindhu, Payman Samavarchi-Tehrani, and Alexander Meissner highlights an important use of genomic and epigenomic data in translational medicine. Studies showing that transcription factors play a pivotal role in regulating and maintaining cellular states suggest that it may be possible to reprogram any cell to another cell type, which could be of enormous importance for treatment of diseases that result in loss of cell function or viability. However, such studies require that we have a deep understanding of the relationships between sets of lineage-specific transcription factors and epigenomic regulators. Sindhu et al. provide a list of interactions between transcription factors and chromatin remodelers that have been identified by genomic profiling and compare the results of different types of molecular profiling of embryonic stem cells and induced pluripotent stem cells, including analysis of coding and noncoding RNAs, histone modifications, and DNA methylation. A genome-wide association study (GWAS) attempts to define SNPs that are significantly more prevalent in a disease-affected group than in a non-affected group. The National Human Genome Research Institute maintains a catalogue of GWAS results (www.genome.gov/gwastudies/). 
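The GWAS idea described above, testing whether an allele is more prevalent in affected than in unaffected individuals, can be sketched for a single SNP with a one-sided Fisher exact test on a 2x2 table of allele counts. The function name and the counts are hypothetical, and real studies use more elaborate models with correction for multiple testing; this is only a minimal sketch of the underlying comparison.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p-value for a 2x2 allele-count table.

    a: risk-allele count in cases,    b: other-allele count in cases
    c: risk-allele count in controls, d: other-allele count in controls
    Returns P(X >= a) under the hypergeometric null of no association.
    """
    case_total = a + b
    risk_total = a + c
    n = a + b + c + d
    denom = comb(n, case_total)
    upper = min(case_total, risk_total)
    tail = sum(
        comb(risk_total, k) * comb(n - risk_total, case_total - k)
        for k in range(a, upper + 1)
    )
    return tail / denom


# Hypothetical allele counts for one SNP (not taken from any study).
p_value = fisher_exact_greater(a=60, b=140, c=35, d=165)
print(f"one-sided p = {p_value:.4g}")
```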
An important recent finding from the integrative analysis of the ENCODE data sets is that a majority of SNPs associated with human disease lie in or near ENCODE-defined regions that are outside of protein-coding genes (5). As reviewed in “Genome-wide Epigenetic Data Facilitate Understanding of Disease Susceptibility Association Studies” by Ross C. Hardison, the integrative analysis of ENCODE data has shown that the phenotype-associated SNPs in the GWAS catalogue are enriched in nucleosome-free regions bound by transcription factors, i.e. putative enhancer regions. Thus, high throughput genomic assays are providing significant aid to our understanding of how SNPs identified by GWAS can contribute to human disease. Of course, simply determining that a SNP falls within a region having the hallmarks of an enhancer does not identify the gene whose regulation is affected by that SNP. However, it is clear that as the field of genomics moves further into the 21st century, the combination of GWAS, ChIP-seq of transcription factors and modified histones, and application of techniques to globally map chromosomal interactions will provide important new insights into human diseases.
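To make the notion of SNP enrichment in putative regulatory regions concrete, the following sketch counts how many SNP positions fall inside a set of regions and compares that fraction with the fraction of sequence the regions cover. All names, coordinates, and the single-chromosome simplification are assumptions made for illustration; the integrative ENCODE analysis used its own, more careful statistics.

```python
from bisect import bisect_right


def count_snps_in_regions(snp_positions, regions):
    """Count SNPs inside any region.

    regions: sorted, non-overlapping half-open (start, end) intervals
    on a single chromosome.
    """
    starts = [start for start, _ in regions]
    hits = 0
    for pos in snp_positions:
        i = bisect_right(starts, pos) - 1  # last region starting at or before pos
        if i >= 0 and pos < regions[i][1]:
            hits += 1
    return hits


def fold_enrichment(snp_positions, regions, chrom_length):
    """Observed fraction of SNPs in regions divided by the covered fraction."""
    covered = sum(end - start for start, end in regions)
    expected_fraction = covered / chrom_length
    observed_fraction = count_snps_in_regions(snp_positions, regions) / len(snp_positions)
    return observed_fraction / expected_fraction


# Hypothetical open-chromatin regions and SNP positions.
toy_regions = [(100, 200), (500, 600)]
toy_snps = [150, 199, 200, 550, 900]
print(count_snps_in_regions(toy_snps, toy_regions))
print(fold_enrichment(toy_snps, toy_regions, chrom_length=1000))
```

A fold enrichment above 1 only says that SNPs land in the regions more often than their size alone would predict; as the text stresses, it does not identify the gene whose regulation is affected.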

          Most cited references

          ENCODE whole-genome data in the UCSC Genome Browser: update 2012

          The Encyclopedia of DNA Elements (ENCODE) Consortium is entering its 5th year of production-level effort generating high-quality whole-genome functional annotations of the human genome. The past year has brought the ENCODE compendium of functional elements to critical mass, with a diverse set of 27 biochemical assays now covering 200 distinct human cell types. Within the mouse genome, which has been under study by ENCODE groups for the past 2 years, 37 cell types have been assayed. Over 2000 individual experiments have been completed and submitted to the Data Coordination Center for public use. UCSC makes this data available on the quality-reviewed public Genome Browser (http://genome.ucsc.edu) and on an early-access Preview Browser (http://genome-preview.ucsc.edu). Visual browsing, data mining and download of raw and processed data files are all supported. An ENCODE portal (http://encodeproject.org) provides specialized tools and information about the ENCODE data sets.
            Diverse Roles and Interactions of the SWI/SNF Chromatin Remodeling Complex Revealed Using Global Approaches

            Introduction Chromosomes undergo a wide variety of dynamic processes including transcription, replication, repair and packaging. Each of these activities requires the recruitment and congregation of a particular set of factors and chromosomal elements. For example visualization of nascent mRNA in HeLa cells has led to a model of transcription units being clustered into “factories” thereby facilitating optimal engagement of RNA Polymerase II (Pol II) and coordination with other crucial holoenzyme complexes [1]–[3]. In addition to RNA Pol II and transcription factors, transcriptional assemblages include proteins critical to regulating chromatin. The accessibility of nuclear proteins to DNA is often controlled by ATP-dependent chromatin remodeling complexes, which are thought to play a role in a number of different cellular transactions by reshaping the epigenetic landscape. The SWI/SNF (switch/sucrose nonfermentable) chromatin remodeling proteins were first discovered in Saccharomyces cerevisiae as components of a 2 MDa complex that repositions nucleosomes for vital tasks such as transcriptional control, DNA repair, recombination and chromosome segregation [4], [5]. Mammalian SWI/SNF is comprised of approximately ten subunits and the combinations of these subunits, some of which have multiple isoforms, enable multiple varieties of SWI/SNF complexes to exist both within a given cell and across cell types [6]. Among these subunits either of the two ATPases, Brg1 or Brm, is sufficient to remodel nucleosome arrays in vitro, however maximal nucleosome remodeling activity is achieved when the SWI/SNF subunits BAF155, BAF170 and Ini1 are present in a 2∶1 stoichiometry relative to Brg1 [7]. Whereas the ATPases have an obvious catalytic function, the roles of the other SWI/SNF subunits are largely obscure. Several reports indicate that BAF155 and BAF170 provide scaffolding functions for other SWI/SNF subunits as well as regulating their protein levels [8], [9]. 
SWI/SNF also contains β-actin and the actin-related protein BAF53, suggesting a possible bridge to nuclear organization or signal transduction, e.g. through phosphatidylinositol signaling [10], [11]. Phosphatidylinositol 4,5-bisphosphate has been shown to bind to Brg1 and promote binding to actin filaments [12]. Mutations resulting in loss of Ini1 function are associated with rare but aggressive pediatric cancers [13], [14]. The SWI/SNF subunits Brg1 [15] and ARID1A [16]–[18] are likewise thought to have tumor suppressor roles based on mutations recovered from other tumor types. Curiously, Ini1 alone has a unique and largely undefined role in HIV-1 infection that includes binding of Ini1 to HIV-1 integrase and the cytoplasmic export of Ini1 and its incorporation into HIV-1 particles [19]–[21]. The role of SWI/SNF components in cancer and tumor suppression is poorly understood despite extensive study. Detailed investigations of individual loci have implicated SWI/SNF in various transcriptional pathways including the cell cycle and p53 signaling [22], insulin signaling [23], and TGFβ signaling [24], as well as signaling through several different nuclear hormone receptors [25]. Although in vitro experiments and single-gene studies have been informative and have laid the foundation for understanding chromatin remodeling, a global analysis of targets of SWI/SNF is expected to yield a more extensive view into the biological roles of SWI/SNF components and their involvement in human disease. In this study we present two complementary global analyses of SWI/SNF subunits to provide a more systematic view of SWI/SNF functions. First we performed ChIP-Seq with the ubiquitous SWI/SNF components Ini1, BAF155, BAF170 as well as the Brg1 ATPase. Second, in a parallel set of studies we performed mass spectrometry identification of proteins that co-immunoprecipitate with SWI/SNF components. 
Using our ChIP-Seq results the resulting chromosomal locations were integrated with published annotations to yield a more complete understanding of SWI/SNF on a genome-wide scale. We find SWI/SNF components frequently occupy transcription start sites (TSSs), enhancers, CTCF regions and many regions occupied by Pol II. Further analyses of the SWI/SNF regions we identified by ChIP-Seq reveals that SWI/SNF factors target genes and signaling pathways involved in cell proliferation and cancer. Our investigation of SWI/SNF protein interactions detected not only the expected co-occurrences of individual SWI/SNF factors with each other but also with cellular components such as nuclear matrix proteins, key transcription factors and centromere proteins implying a ubiquitous role in gene regulation and nuclear function. We find an overrepresentation of both SWI/SNF-associated chromosomal regions and proteins in cell cycle and chromosome organization. Collectively our results suggest that SWI/SNF is at the nexus of multiple signal transduction pathways, essential chromosomal functions and nuclear organization. Results Genome-wide mapping of SWI/SNF subunits reveals many different co-associations We identified the targets of four SWI/SNF components, Ini1 (SMARCB1), Brg1 (SMARCA4), BAF155 (SMARCC1) and BAF170 (SMARCC2), using ChIP-Seq. Chromatin complexes were isolated from HeLa S3 nuclei following independent immunoprecipitations with antibodies for each factor. Each of these antibodies was characterized by both immunoblot and mass spectrometry analyses (see Materials and Methods). Reads that mapped uniquely to the genome were retained (29–33 million reads per data set; Table 1) and significant binding regions were identified using the PeakSeq program with q-value 0.05; hypergeometric test). Enhancers are often characterized by long-range interactions. 
We examined the locations of SWI/SNF binding regions in the 150 kb CIITA region where numerous chromosomal looping interactions have been mapped at high resolution in HeLa cells using 3C (Chromosome Conformation Capture). Brg1 has been previously mapped at several sites in this locus in these cells [30]. Superimposition of these 3C data on our SWI/SNF ChIP-Seq data (Figure 4) reveals that all six of the 3C interacting regions in the CIITA locus (−50 kb, −16 kb, −8 kb, pIV, +40 kb and +59 kb) are bound by SWI/SNF components. Moreover certain individual SWI/SNF component binding regions that appeared initially as orphans may now be seen as part of a complete complex when joined with a distal element. For example Ini1 at pIV when joined with BAF155 and BAF170 regions at the −16 kb element forms a SWI/SNF core. Thus in the CIITA locus SWI/SNF regions are often associated with 3C regions and many of the regions bound by individual factors may in fact be part of entire SWI/SNF complexes inside the nucleus. 10.1371/journal.pgen.1002008.g004 Figure 4 SWI/SNF signals relative to 3C (Chromosome Conformation Capture) sites in the CIITA locus. A ∼150 kb region surrounding the CIITA locus is shown with SWI/SNF signals. Chromosomal loops detected in Ni et al. [30] are displayed as brackets connecting regions that were shown to contact each other using 3C. In the absence of γ-interferon eight constitutive contacts have been observed by 3C in HeLa cells between the sites at: (−50:−8), (−50:+59), (−8:+59), (pIV:+40), (−50:pIV), (−16:pIV), (−8:+40) and (−8:pIV). CIITA contains STAT1 binding regions; for comparison, STAT1 data are also shown from ChIP-Seq signals and target regions obtained from γ-interferon-stimulated HeLa cells as we previously reported [26]. The Ini1 site at pIV, when joined with BAF155 and BAF170 regions at the −16 kb element, forms a complete SWI/SNF core. 
The vertical axis for each signal track is the count of the number of overlapping DNA fragments at each nucleotide position and is scaled from 0 to 40 for each track. Overall our ChIP-Seq results are summarized in Table 1, Table 2, Table 3, and Figure 3 and indicate that SWI/SNF likely contributes to gene regulation through many different avenues in light of its binding to promoters, enhancers and CTCF sites. Furthermore SWI/SNF may facilitate looping interactions among these various elements as it has been shown in vitro that SWI/SNF can interact simultaneously with multiple DNA sites and generate loops between them [31]. Interestingly we found a slightly higher presence of the SWI/SNF core at TSSs and with Pol II than the SWI/SNF union regions with these elements (Table 3). Thus a complete core of Ini1, BAF155 and BAF170 may be required for effective promoter function whereas only a subset of these factors may be required for enhancer function. Alternatively a full SWI/SNF core may be more difficult to recover from a single enhancer element as compared to a more compact promoter region due to the enhancer's presumed interaction with many different distal elements. RNA polymerases are extensively colocalized with SWI/SNF As detailed above SWI/SNF regions are enriched for Pol II. To explore the prevalence of SWI/SNF with transcriptional machinery we asked whether the converse would also be true, namely if regions bound by RNA polymerases are enriched for SWI/SNF. Indeed Pol II regions are enriched for SWI/SNF binding regions (p 0.05 hypergeometric test; Table 7). Enhancers are underrepresented in the SWI/SNF-lamin B regions relative to all SWI/SNF locations in the ENCODE regions (p 90% of our high-confidence union targets are associated with genic or regulatory regions and that 65% of Pol III and 84% of Pol II regions colocalize with at least one SWI/SNF factor (Table 4, Tables S5 and S6). 
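The reasoning that partial SWI/SNF occupancy at two sites may add up to a complete core once the sites are joined by a 3C contact can be written out as a small set operation. The contact list below is the one given in the Figure 4 caption, and the core is Ini1, BAF155 and BAF170 as stated in the text. The occupancy table is deliberately incomplete: it fills in only the two sites whose bound factors the text names, and the function name is hypothetical.

```python
# SWI/SNF core as described in the text.
CORE = frozenset({"Ini1", "BAF155", "BAF170"})

# Constitutive 3C contacts in the CIITA locus, from the Figure 4 caption.
CONTACTS = [
    ("-50", "-8"), ("-50", "+59"), ("-8", "+59"), ("pIV", "+40"),
    ("-50", "pIV"), ("-16", "pIV"), ("-8", "+40"), ("-8", "pIV"),
]

# Assumption: only pIV and -16 kb are filled in, because these are the only
# sites for which the text names the bound subunits. Other sites are left empty.
OCCUPANCY = {
    "pIV": {"Ini1"},
    "-16": {"BAF155", "BAF170"},
}


def contacts_forming_core(contacts, occupancy, core=CORE):
    """Return contacts whose two ends jointly, but not individually, hold the core."""
    result = []
    for site_a, site_b in contacts:
        factors_a = occupancy.get(site_a, set())
        factors_b = occupancy.get(site_b, set())
        joined = factors_a | factors_b
        if core <= joined and not (core <= factors_a or core <= factors_b):
            result.append((site_a, site_b))
    return result


print(contacts_forming_core(CONTACTS, OCCUPANCY))  # [('-16', 'pIV')]
```

With a full occupancy table from the ChIP-Seq data, the same union could be applied to every contact to ask how many apparent orphan sites are completed by a distal partner.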
Interestingly we observed that SWI/SNF components often occur independently of each other and in various configurations across the genome, and similarly our mass spectrometry data point to heterogeneity of SWI/SNF complexes. We speculate that several mechanisms may underlie these various configurations and their associated genomic features, including 1) synergism or antagonism of the individual SWI/SNF factors in influencing expression (e.g. Figure 5); 2) failure to detect individual subunits due to epitope masking as a consequence of variation with local environments; 3) the capture of incomplete complexes that may in fact be completed upon superposition of genome-wide 3C data once such data become available (e.g. Figure 4); 4) the existence of SWI/SNF sub-complexes that deviate from the conventional composition of SWI/SNF assemblies (e.g. [56]) or 5) the capture of intermediates in a multistep assembly or remodeling process. This last view is consistent with a model of stochastic assembly that may occur through intermediate interactions and that has been described for several other large, multifactor complexes such as RNA polymerases and associated transcription factors [57], spliceosomes [58], and DNA repair complexes [59]. As shown in Figure 6 SWI/SNF occurs throughout many interconnected pathways. The assembly of functional SWI/SNF complexes at many locations in the genome may require the activation of one or more of these related pathways. Consequently some of the SWI/SNF associated regions we observed may reflect constitutive binding of partially assembled complexes that may be poised to receive additional signal inputs for subsequent regulatory activity. Indeed it has been shown that SWI/SNF components are present at regulatory regions even in the absence of stimulatory conditions or tissue-specific cofactors. 
For example Brg1 is present constitutively at the interferon-inducible genes IFITM3 [60] and CIITA [30] in unstimulated HeLa cells, which is consistent with our own finding of Brg1 and Ini1 at IFITM3 and various combinations of BAF155, BAF170, Ini1 and Brg1 at different elements in CIITA. In solution SWI/SNF factors are associated constitutively with RelB (HEK293 cells, [61]), RelA, NFkB1 and NFkB2 (HeLa cells, this study), the glucocorticoid receptor (T4D7 cells, [62]) and estrogen receptor alpha (HeLa cells, this study and [42]; SW13 cell extracts, [63]). The prevalence of SWI/SNF and the high degree of connectivity of its overrepresented pathways imply that SWI/SNF may assist in many related processes and may even facilitate crosstalk across many constituents of the transcriptional machinery. Notably SWI/SNF binds in the genes of its own subunits (Table S19) suggesting that SWI/SNF may contribute to auto- and cross-regulation of its subunit levels. Loss-of-function of a particular subunit, as may occur in certain cancers, could initiate oscillations and alter the relative abundance of the levels of the other SWI/SNF subunits through a variety of feedback and feed-forward loops. Aberrant SWI/SNF expression has been proposed to result in new combinatorial assemblies of SWI/SNF, some of which may be deleterious [64]. The gene attributes revealed by our ChIP-Seq data substantiate that SWI/SNF is proximal to targets that comprise sets of fundamental biological processes. Many of the functional categories we found to be significantly overrepresented have disease implications, especially as related to cancer (Figure S2). For example failures in DNA repair and unchecked cell cycle activity are common characteristics of pre-cancerous cells, and our SWI/SNF analyses identified the p53 and MAPK signaling pathways, which are well known for maintaining checkpoint functions. 
Growth dysregulation, particularly in the context of hormone signaling, is another common cancer phenotype. Extracellular growth signals are transduced from the cell membrane to the nucleus by the ErbB, insulin and phosphatidylinositol signaling pathways, all of which we recovered as overrepresented (Table 5). The existence of phosphoinositide signaling in the nucleus and the ability of Brg1 to act as an effector for phosphatidylinositol 4,5-bisphosphate (PIP2) raise the prospect of several levels of control of this signaling pathway with respect to SWI/SNF [65], a hypothesis that can be examined in future studies. Several of the overrepresented pathways we identified through our ChIP-Seq analyses share proteins detected in SWI/SNF co-purification experiments, thereby providing a resource to explore potential, highly-interactive network structures. For example we found that genes with products critical for ‘nucleotide excision repair’ were enriched using our SWI/SNF union list (Figure 6). Within this pathway the excision repair protein ERCC5 co-purified with both BAF155 and BAF170 in our IP (immunoprecipitation)-mass spectrometry experiments. The excision repair protein, XPC, associates with SWI/SNF in response to UV irradiation in HeLa cells, and BRCA1 and ATR also cooperate with SWI/SNF in DNA repair (Figure 7; Table S10; [66]). Thus we speculate SWI/SNF may participate in DNA repair through both transcriptional regulation and recruitment to regions undergoing repair. Our study uses two strategies to attempt to comprehensively collect a SWI/SNF interaction network. We limited our network to a single model system, HeLa cells, because many attributes of SWI/SNF have been documented in these cells and it has been noted that SWI/SNF associations vary by cell type [67]. We extensively collated SWI/SNF protein interactions described in the literature. 
This undertaking was necessary because many of the proteins described in the literature as co-associated with SWI/SNF factors are not represented in interaction databases such as BioGRID, Molecular Interactions Database (MINT), IntAct, Human Protein Reference Database (HPRD), Nuclear Protein Database (NPD) and Interologous Interaction Database (I2D). Therefore we attempted to comprehensively collect such information to overcome these limitations. In total 158 SWI/SNF interacting proteins have been described in HeLa cells (Figure 8 and Table S10), which is similar to the number of SWI/SNF interacting proteins that have been described in other cell types [67]. Published molecular associations that were not discerned here might be due to interactions that are: 1) transient or of low affinity, 2) dependent on a specific set of biochemical conditions or 3) undetectable due to masking by the presence of more abundant protein(s) of similar size. In working with protein interaction data, similar degrees of overlap have been noted when comparisons are made across data sets [68], [69] and even in a well-studied model such as yeast, mass spectrometry analyses have found a plasticity of complexes and many previously undetected interactions [70]–[72]. From the ChIP-Seq and ChIP-chip results we expected that CTCF and lamin B may be among the proteins that co-associate with SWI/SNF, however neither of these factors was recovered in any of the non-directed experiments (Table S10), including a CTCF immunoprecipitation-mass spectrometry experiment performed in HeLa cells. In addition to the above considerations one possibility is that CTCF or lamin B may associate more strongly with one of the SWI/SNF factors not studied, e.g. BAF53A or one of the BAF60 subunits. SWI/SNF is most often described in a chromatin remodeling context however data derived from a variety of sources suggests that SWI/SNF has other facets. 
It is possible that not all of SWI/SNF's functions involve DNA localization and therefore other types of global experiments, such as the IP-mass spectrometry, are valuable as first steps towards recognizing previously unknown roles. Unlike cytoplasmic compartments, nuclear compartments are not separated by a physical barrier but rather are functional assemblies that are typically organized around sets of molecules engaged in common functions. Data from both ChIP-Seq and IP-mass spectrometry illuminate the sectors in which SWI/SNF operates and the integration of these two methods is better than each alone for furnishing a broad comprehension of SWI/SNF action. For example ChIP-Seq enables the global identification of SWI/SNF chromosomal elements except for those regions with highly repetitive sequence such as human centromeres (Figure 2A). In this respect IP-mass spectrometry is complementary to ChIP-Seq because it strongly suggests that SWI/SNF occurs at kinetochores as evidenced by its co-purification with CENPE, NUF2, BUB1B and CLASP2 (Figure 7 and Figure 9). In addition to kinetochore proteins the SWI/SNF co-purification experiments also uncovered proteins from other substructures including centrosomes, microtubules, the nuclear periphery and PML nuclear bodies, the latter of which is characterized by cryptic foci of PML (promyelocytic leukemia protein) and has been implicated in a variety of diseases [73]. The ChIP-Seq and IP-mass spectrometry data are synergistic as well. Notably both methods found an overrepresentation of regions or proteins enriched for ‘cell cycle’ and ‘chromosome organization’. One possible inference from these studies is that SWI/SNF is well positioned to integrate signals across multiple signaling pathways both by its presence in a variety of cellular structures and its role in gene regulation through chromatin remodeling. 
10.1371/journal.pgen.1002008.g009 Figure 9 Illustration showing overrepresented GO “cellular component” categories for SWI/SNF co-purifying proteins. Overrepresented GO “cellular component” categories are displayed for proteins we detected by IP-mass spectrometry. Centrosomal proteins are shaded brown, chromosomal proteins are blue, kinetochore proteins are orange and cytoskeletal proteins are green. Genes encoding starred proteins are targets of SWI/SNF as identified by ChIP-Seq. Based on these annotations SWI/SNF is associated with multiple cellular components. A fraction of SWI/SNF complexes co-associate with elements of the nuclear periphery where they are well situated to contribute to the nuclear organization and position-dependent gene expression (Figure 7; [74]). We found that in crosslinked cells SWI/SNF localizes more widely with lamin B than lamin A whereas in non-crosslinked cells SWI/SNF co-purifies with lamin A. As mentioned above lamin B may have escaped detection in SWI/SNF protein interaction studies. A related possibility is that SWI/SNF may exist in different nuclear pools that have varying solubilities and associations, such that recovery of particular SWI/SNF complexes depends upon the proteins with which SWI/SNF is associated. For example lamins A and B are known to have different nucleoplasmic mobilities and localization patterns [50], [52]. Immunolocalization experiments in HeLa nuclei have revealed that the A/C- and B-type lamins form distinct meshworks with occasional points of intersection [50], which is consistent with the interspersed patterns of lamin A/C and B that we detected (Figure 1). Hence it is reasonable to expect that SWI/SNF associated with lamin A would behave differently than when associated with lamin B. We surmise that in a chromatin context the dominant association of SWI/SNF with the nuclear lamins occurs in regions where lamin B is present. 
The purification of SWI/SNF with lamin A may indicate other biological roles, such as cell cycle progression or nuclear assembly [75], [76]. Gaining a more detailed understanding of SWI/SNF's activities in or near various heterochromatin environments will be central to comprehending nuclear events over the cell cycle as well as during development. Among the numerous molecular and epigenetic factors that have been found to affect heterochromatin formation or maintenance, the heterochromatin protein 1 alpha (HP1α, also known as CBX5; Figure 7) and Polycomb complexes (PcG) are of particular relevance to SWI/SNF [77]–[79]. Polycomb complexes promote gene silencing by catalyzing the trimethylation of H3K27 in its target regions, and SWI/SNF antagonizes this epigenetic silencing [80]. It is tempting to speculate that SWI/SNF found near the edges of H3K27me3 domains (Figure 1A and 1C) may be contributing to the establishment or maintenance of boundary elements. SWI/SNF may also engage in heterochromatin dynamics through its interaction with HP1α, which is often located in the centromeric regions (reviewed in [81]). Curiously HP1α interacts with the lamin B receptor [82] thus providing a potential bridge between heterochromatin and the inner nuclear membrane. Both H3K27me3 and lamin B are associated with spatially regulated genes whose conversion between active and inactive states depends on access to their regulatory regions, as may be conferred by SWI/SNF. The work presented here provides new insights into the scope of SWI/SNF's influence in gene regulation and nuclear organization. The integration of numerous studies is beginning to reveal the complexities contributing to the regulation of any given locus. Contemporary models of transcriptional control propose that a series of factors transiently associate with a regulatory region before a decisive event tilts these intermediate reactions towards a productive outcome [57], [83]. 
SWI/SNF may contribute to such intermediate reactions or trigger switches between inactive and active states. The capacity for SWI/SNF to preserve many aspects of homeostasis also makes it vulnerable to being ensnared for aggressive cell proliferation. Our work demonstrates that SWI/SNF in particular and perhaps chromatin remodeling proteins in general will contribute unique insights to our understanding of gene regulation and disease mechanisms through the integration of target regions, spatial positioning and functional annotations. For example, the co-occurrence of SWI/SNF with centrosomes, microtubules, kinetochores and the nuclear periphery may suggest that a pool of SWI/SNF is sequestered by these structures during mitosis to assist in the post-mitotic reformation of chromosomal territories. Our collective findings help inform a comprehensive view of SWI/SNF function as well as form a valuable compendium for future studies of nuclear functions as related to chromatin remodeling. Materials and Methods Chromatin immunoprecipitations Suspension HeLa S3 cells were cultured by the National Cell Culture Center (Biovest International Inc., Minneapolis, MN) in modified minimal essential medium (MEM), supplemented with 10% FBS at 37°C in 5% CO2, to a density of 6×10⁵ cells/mL. Cells were fixed with 1% formaldehyde at room temperature for 10 min. Fixation was terminated with 125 mM glycine (2 M stock made in 1x PBS). Formaldehyde-fixed cells were washed in cold Dulbecco's PBS (Invitrogen) and swelled on ice in 10 mL of hypotonic lysis buffer [20 mM Hepes (pH 7.9), 10 mM KCl, 1 mM EDTA (pH 8.0), 10% glycerol, 1 mM DTT, 0.5 mM PMSF, and Roche Complete protease inhibitors, Cat#1697498]. To isolate nuclei, whole cell lysates were homogenized with 30 strokes in a 7 mL Dounce homogenizer (Kontes, pestle B). 
Nuclear pellets were collected by centrifugation and lysed in 10 mL of RIPA buffer per 3×10⁸ cells [RIPA buffer: 10 mM Tris-Cl (pH 8.0), 140 mM NaCl, 1% Triton X-100, 0.1% SDS, 1% deoxycholic acid, 0.5 mM PMSF, 1 mM DTT, and protease inhibitors]. Chromatin was sheared with an analog Branson 250 Sonifier (power setting 2, 100% duty cycle for 7×30-s intervals) to an average size of less than 500 bp, as verified on a 2% agarose gel. Lysates were clarified by centrifugation at 20,000×g for 15 min at 4°C. Clarified nuclear lysates from 1×10⁸ cells were agitated overnight at 4°C with 20 µg of one of the following antibodies: 1) anti-Ini1 (C-20), Santa Cruz Biotechnology, sc-16189; 2) anti-BAF155 (H-76), Santa Cruz Biotechnology, sc-10756; 3) anti-BAF170 (H-116), Santa Cruz Biotechnology, sc-10757; 4) anti-Brg1 (G-7), Santa Cruz Biotechnology, sc-17796; 5) anti-lamin A/C (H-110), Santa Cruz Biotechnology, sc-20681; 6) anti-lamin B antibody, EMD Biosciences, NA12; or 7) normal IgG, Santa Cruz Biotechnology, sc-2025. Antibody incubations were followed by addition of either protein A (Millipore #16-156) or protein G agarose beads (Millipore #16-266). Beads were permitted to bind to protein complexes for 60 min at 4°C. Immunoprecipitates were washed three times in 1x RIPA, once in 1x PBS, and then eluted in 1xTE/1%SDS. Crosslinks were reversed overnight at 65°C. ChIP DNA was purified by incubation with 200 µg/mL RNase A (Qiagen #19101) for 1 h at 37°C and with 200 µg/mL proteinase K (Ambion AM2548) for 2 h at 45°C, followed by phenol:chloroform:isoamyl alcohol extraction and precipitation with 0.1 volumes of 3 M sodium acetate, 2 volumes of 100% ethanol and 1.5 µL of pellet paint (Novagen #69049-3). ChIP DNA prepared from 1×10⁸ cells was resuspended in 50 µL of Qiagen Elution Buffer (EB). Three biological replicates were prepared per antibody. Construction and sequencing of Illumina libraries ChIP-Seq libraries were prepared and sequenced as previously described [26], [84]. 
Biological replicates for each factor were converted into separate and distinct libraries. To summarize, ChIP DNA samples were loaded onto Qiagen MinElute PCR columns, eluted with 15 µL of Qiagen buffer EB, size-selected in the 100–350 bp range on 2% agarose E-gels (Invitrogen) and gel-purified using a Qiagen gel extraction kit. DNA was end-repaired and phosphorylated with the End-It kit from Epicentre (Cat# ER0720). The blunt, phosphorylated ends were treated with Klenow fragment (3′ to 5′ exo minus; NEB, Cat# M0212s) and dATP to yield a protruding 3′-‘A’ base for ligation of Illumina adapters (100 RXN Genomic DNA Sample Prep Oligo Only Kit, Part# FC-102-1003), which have a single ‘T’ base overhang at the 3′ end. After adapter ligation (LigaFast, Promega Cat# M8221) DNA was PCR-amplified with Illumina genomic DNA primers 1.1 and 2.1 using a program of (i) 30 s at 98°C, (ii) 15 cycles of 10 s at 98°C, 30 s at 65°C, 30 s at 72°C, and (iii) a 5 min extension at 72°C. The final libraries were band-isolated from an agarose gel to remove residual primers and adapters. Library concentrations and A260/A280 ratios were determined by UV-Vis spectrometry on a NanoDrop ND-1000 spectrophotometer (Thermo Fisher Scientific). Purified and denatured library DNA was captured on an Illumina flowcell for cluster generation and sequenced on an Illumina Genome Analyzer II following the manufacturer's protocols [85]. Identification of proteins by mass spectrometry Immunoprecipitations were performed using the same conditions as for ChIP experiments except that the HeLa S3 cells were not crosslinked. In addition to the ChIP antibodies described above we also used anti-Brm, Abcam Cat# ab15597 and anti-BAF250a (PSG3), Santa Cruz Biotechnology, sc-32761. Complexes were resolved on BioRad 4–20% precast Tris-HCl gels (Cat#161-1159) such that a single gel was used for each specific antibody and normal IgG immunoprecipitation pair. 
Gels were silver stained using Pierce SilverSNAP stain for mass spectrometry (Cat#24600) and each lane was excised into 10–12 molecular weight regions. Gel slices were destained, dried in a Savant speed-vac and digested overnight at 42°C with Sigma's Trypsin Profile IGD kit for in-gel digests (Cat# PP0100). Following the overnight incubation the liquid was removed from each gel piece and the volume was reduced by drying to approximately 10 µL. The individual gel slices were analyzed separately. Mass spectrometry The samples were subjected to nanoflow chromatography using a nanoAcquity UPLC system (Waters Inc.) prior to introduction into the mass spectrometer for further analysis. Mass spectrometry was performed on a hybrid ion trap LTQ Orbitrap mass spectrometer (Thermo Fisher Scientific) in positive electrospray ionization (ESI) mode. Spectra were acquired in a data-dependent fashion consisting of a full mass spectrum scan (300–2000 m/z) followed by MS/MS scans of the 3 most abundant parent ions. For the full scan in the orbitrap the automatic gain control (AGC) was set to 1×10⁶ and the resolving power at 400 m/z was 30,000. The MS/MS scans were done using the ion trap part of the mass spectrometer at a normalized collision energy of 24 V. Dynamic exclusion time was set to 100 s to avoid loss of MS/MS spectral information due to repeated sampling of the most abundant peaks. Sequence data from MS/MS spectra were processed using the SEQUEST database search algorithm (Thermo Fisher Scientific). The resulting protein identifications were brought into the Scaffold visualization software (Proteome Software), where the information was further refined to improve confirmation of protein identifications. Scaffold search criteria were set at 98% probability and required at least 2 unique peptides per identification. 
Determination of enriched regions in SWI/SNF ChIP-Seq data All ChIP-Seq data sets (Ini1, Brg1, BAF155, BAF170, and Pol II) were scored against a normal IgG control using PeakSeq [26] with default parameters (q-value<0.05) to determine an initial set of enriched regions. These lists were then filtered by removing those regions that did not meet all of the following requirements: 1) the q-value from PeakSeq was further restricted to a q-value of<0.01; 2) a minimum of 20 sequencing reads per peak from the specific antibody ChIP; 3) an enrichment of 1.5-fold of the specific antibody relative to the normal IgG control; and 4) an excess of at least 10 of the specific antibody reads relative to the normal IgG control reads. Enriched regions satisfying these criteria comprised our initial list of enrichment sites for each factor (Table 1 and Tables S11, S12, S13, S14, S15, and S16). Among these data sources, Pol II and the normal IgG control have been published as part of prior studies and are available in GEO (accession numbers GSE14022 and GSE12781, respectively) [26], [84]. Data for Ini1, Brg1, BAF155 and BAF170 can be accessed through GEO series GSE24397. Generation of a SWI/SNF union list from ChIP-Seq results After obtaining our initial list of enriched regions for each factor subjected to chromatin immunoprecipitation, we generated a union list of SWI/SNF component targets. Using the method described in Euskirchen et al. [86], we formed the union of Ini1, BAF155, BAF170, and Brg1 enriched regions as identified by ChIP-Seq and merged any unioned regions that were separated by ≤100 bp. Each union region was then classified by whether it intersected with one or more of BAF155, BAF170, Ini1, and Brg1. The resulting list consists of 69,658 SWI/SNF union regions (Table S2). 
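The four post-PeakSeq filters and the ≤100 bp union merge described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline or the method of [86]: the `Peak` field names are hypothetical (they are not PeakSeq output columns), intervals are assumed to be 0-based and half-open, the control read count is assumed to be already scaled to the ChIP library, and "1.5-fold" and "at least 10" are both read as inclusive thresholds.

```python
from typing import NamedTuple


class Peak(NamedTuple):
    # Hypothetical record layout, chosen for illustration only.
    chrom: str
    start: int            # assumed 0-based, half-open [start, end)
    end: int
    q_value: float
    chip_reads: float
    control_reads: float  # assumed already scaled to the ChIP library


def passes_filters(p: Peak) -> bool:
    """Apply the four criteria listed in the text to one candidate region."""
    return (
        p.q_value < 0.01                             # 1) stricter q-value
        and p.chip_reads >= 20                       # 2) minimum read count
        and p.chip_reads >= 1.5 * p.control_reads    # 3) 1.5-fold enrichment
        and p.chip_reads - p.control_reads >= 10     # 4) excess of >= 10 reads
    )


def merge_union(regions, max_gap=100):
    """Union per-factor regions, merging those separated by <= max_gap bp.

    regions: iterable of (chrom, start, end, factor) tuples.
    Returns (chrom, start, end, frozenset_of_factors) tuples, so each union
    region records which components intersect it.
    """
    merged = []
    for chrom, start, end, factor in sorted(regions):
        last = merged[-1] if merged else None
        if last and last[0] == chrom and start - last[2] <= max_gap:
            last[2] = max(last[2], end)   # extend the current union region
            last[3].add(factor)
        else:
            merged.append([chrom, start, end, {factor}])
    return [(c, s, e, frozenset(f)) for c, s, e, f in merged]
```

Sorting once and sweeping left to right keeps the merge linear after the sort; overlapping regions have a negative gap and are therefore merged as well.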
Determination of the “high-confidence” and “core” SWI/SNF regions from the ChIP-Seq union regions We compared our ChIP-Seq target lists for the 69,658 SWI/SNF union regions against genomic features at which chromatin remodeling is expected to play a prominent role: RNA polymerase II sites [26], 5′ ends of Ensembl protein-coding genes, CTCF sites [28], and regions predicted to be enhancers in HeLa cells [29]. We also compared individual SWI/SNF component lists against each other. Only those SWI/SNF regions which intersect another SWI/SNF component or which intersect at least one of the above genomic features were retained for the ‘high-confidence’ union list. For gene promoter regions, we define overlap as a target region with at least 1 shared bp within ±2.5 kb of the annotated transcription start site (TSS). SWI/SNF region intersections were calculated both for all genes in the Ensembl 52 database build using annotations from NCBI36 (human genome build hg18) as well as for a subset of genes that Ensembl identifies as protein-coding. The resulting target list consists of 49,555 ‘high-confidence’ SWI/SNF union regions (Table S3). Union regions containing all three of the BAF155, BAF170, and Ini1 subunits are designated as the 9,760 ‘core’ SWI/SNF regions (Table 3). Generating co-occurrence tables To determine the co-occurrences of features of interest we used a similar intersection strategy as was used for determining the high-confidence SWI/SNF regions. For all pairwise comparisons, one of the two data sets was extended by 100 bp on each side of the region and then intersected against the other, non-extended dataset. We required an overlap of at least 1 bp to deem two regions as associated. Using a Perl script, the intersection results for all comparisons were combined to form the co-occurrence table. The same procedure was followed to generate SWI/SNF-centric (Tables S2 and S3), Pol II-centric (Table S5) and Pol III-centric (Table S6) co-occurrence tables. 
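The intersection rule used for the co-occurrence tables (extend one data set by 100 bp on each side, then require at least 1 bp of overlap) can be sketched as below. The original work used a Perl script that is not reproduced here; the function names, the half-open coordinate convention and the dictionary layout are assumptions made for this illustration.

```python
def overlaps(a, b, extend=100):
    """True if region a, extended by `extend` bp per side, shares >= 1 bp with b.

    a and b are (chrom, start, end) tuples, assumed 0-based and half-open.
    """
    a_start = max(a[1] - extend, 0)
    a_end = a[2] + extend
    return a[0] == b[0] and a_start < b[2] and b[1] < a_end


def tss_window(chrom, tss, flank=2500):
    """Promoter window of +/- `flank` bp around an annotated TSS position."""
    return (chrom, max(tss - flank, 0), tss + flank + 1)


def cooccurrence_table(anchors, features, extend=100):
    """Build one row per anchor region with 1/0 flags for each feature set.

    features: dict mapping a feature name to a list of (chrom, start, end).
    This is a simple quadratic scan; a sorted sweep or interval tree would
    be preferable at genome scale.
    """
    rows = []
    for region in anchors:
        flags = {
            name: int(any(overlaps(region, other, extend) for other in regs))
            for name, regs in features.items()
        }
        rows.append((region, flags))
    return rows
```

For promoter comparisons, `tss_window` would be intersected with `extend=0`, since the ±2.5 kb flank already defines the promoter region.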
Determination of expressed regions Using the HeLa RNA-Seq data of Morin et al. [37], we subdivided each list by the expression status of the corresponding gene targets. Expressed genes were defined as any Ensembl gene with an associated Ensembl transcript having an adjusted depth of ≥1, representing an average coverage of 1x across all bases in the transcript. A total of 9,711 expressed protein-coding genes satisfied these criteria. Comparison of expression levels associated with different SWI/SNF sub-complexes We created a series of lists based upon the combinations of SWI/SNF components that could co-occur using the 49,555 high-confidence SWI/SNF regions derived from Table S3. Using the RNA-Seq data of Morin et al. [37], we intersected each list against the 5′ ends of transcripts queried by that study and recorded the corresponding adjusted depth for any transcript with a 5′ end within ±2.5 kb of a SWI/SNF region. Morin et al. treats adjusted depth as a measurement of transcription level for the corresponding transcript. For each list, these measurements were used to build a series of violin plots showing the probability distribution of transcription levels associated with different compositions of SWI/SNF subunits. Note that each SWI/SNF region from Table S2 can only be assigned to one list (e.g. a region containing BAF155, BAF170, and Ini1 is not also assigned to the list of regions containing BAF155 and BAF170). Pathway analyses of SWI/SNF factors Overrepresented GO categories [87] and KEGG pathways [88] were determined using DAVID tools [16]. Figures S2 and S3 were drawn using KGML-ED [89]. ChIP-chip experimental procedures and array scoring The ENCODE tiling arrays (NimbleGen Systems Inc., Madison, WI) interrogate the regions from the pilot phase of the ENCODE project [90] and tile the non-repetitive forward strand DNA sequence with 50-mer oligonucleotides spaced every 38 bp (overlapping by 12 bp) for a total of approximately 390,000 features. 
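As a quick check on the tiling geometry just described, the following sketch works through the arithmetic. It assumes probes are laid end to end at a fixed 38 bp step within each tiled stretch; the function names are hypothetical, and the total is only an approximation because the actual array skips repetitive sequence.

```python
PROBE_LENGTH = 50  # 50-mer oligonucleotides
STEP = 38          # spacing between consecutive probe starts, in bp


def probe_overlap(probe_length=PROBE_LENGTH, step=STEP):
    """Number of bases shared by two consecutive probes."""
    return probe_length - step


def probe_starts(region_start, region_end, probe_length=PROBE_LENGTH, step=STEP):
    """Start positions of probes tiled across a half-open region."""
    return list(range(region_start, region_end - probe_length + 1, step))


def approx_tiled_bases(n_features, step=STEP):
    """Approximate sequence covered: each probe contributes `step` new bases."""
    return n_features * step


print(probe_overlap())              # 12 bp overlap, as stated in the text
print(approx_tiled_bases(390_000))  # about 14.8 million bases
```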
For array hybridizations ChIP DNA samples from 1×10⁸ cells were labeled according to the manufacturer's protocol by Klenow random priming with Cy5 nonamers (lamin A/C or lamin B ChIP DNA) or Cy3 nonamers (normal IgG ChIP DNA). Biological replicates, defined as ChIP DNA isolations prepared from distinct cell cultures, were each hybridized to separate microarrays. Each lamin data set consists of three biological replicates. ChIP DNA labeling and array hybridizations were conducted by the NimbleGen service facility (Reykjavik, Iceland). Briefly, arrays were hybridized in Maui hybridization stations for 16–18 h at 42°C, and then washed in 42°C 0.2% SDS/0.2x SSC, room temperature 0.2x SSC, and 0.05x SSC. Arrays were scanned on an Axon 4000B scanner. For each pair of arrays the files (in GFF file format) corresponding to the two channels for ChIP DNA (635 nm) and reference DNA (532 nm) were uploaded to the TileScope pipeline for normalization and scoring [91]. Data were scored with the following TileScope program parameters: quantile normalization of replicates, iterative peak identification, window size = 500, oligo length = 50, pseudomedian threshold = 1.0, p-value threshold = 4.0, peak interval = 1000, and feature length = 1000. Regions called by TileScope were then filtered and corrected for multiple hypothesis testing by false discovery rate (FDR). To generate our set of background regions for FDR analysis, we randomly shuffled the probe values within each replicate, ensuring that the same probes were swapped for each replicate. This shuffled data set was then used as input to TileScope and the scores were compared against the lamin A/C and the lamin B data sets. The final lists of enriched regions for lamin A/C and lamin B have a final FDR of 0.1. Target coordinates were converted to hg18 using the UCSC ‘liftOver’ utility (http://genome.ucsc.edu/cgi-bin/hgLiftOver). Lamin A/C and lamin B data are available through GEO series GSE24382 and Tables S17 and S18. 
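The shuffled-background idea behind the FDR correction can be sketched as follows. This is an illustration under stated assumptions rather than the procedure actually run: TileScope scoring itself is not reproduced, the function names are hypothetical, and the ratio used in `empirical_fdr` is one common estimator (background calls divided by real calls at a threshold), since the text does not specify exactly how the scores were compared.

```python
import random


def shuffle_replicates(replicates, seed=0):
    """Shuffle probe values using one permutation shared by all replicates.

    replicates: list of equal-length lists of probe values. Sharing the
    permutation means the same probes are swapped in every replicate, so
    values from the same probe stay aligned across replicates.
    """
    order = list(range(len(replicates[0])))
    random.Random(seed).shuffle(order)
    return [[rep[i] for i in order] for rep in replicates]


def empirical_fdr(real_scores, background_scores, threshold):
    """Estimate FDR as background calls / real calls at a score threshold."""
    real = sum(s >= threshold for s in real_scores)
    background = sum(s >= threshold for s in background_scores)
    if real == 0:
        return 0.0
    return min(1.0, background / real)
```

In such a scheme the score threshold would be raised until the estimate falls to the target level, 0.1 in the text.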
Comparison of features across the ENCODE regions To facilitate comparisons between sequencing and array data we retained only those regions that could be queried by both platforms. To this end, we first identified sequences represented on the ENCODE tiling array that possess less than 25% mappability in ChIP-Seq experiments using 30 bp reads. Any enriched regions in the lamin A/C and the lamin B data sets that were entirely contained within these regions of low mappability were removed from our lists, as corresponding signal levels are unlikely to be detected accurately via ChIP-Seq. Mappability was determined using a 30 bp read length and reported in 100 bp windows according to [26]. The end result is a list of lamin A/C and lamin B enriched regions identified by ChIP-chip in areas of the genome that can be queried by ChIP-Seq. Accordingly, regions that are not represented on the ENCODE tiling arrays were also removed from our SWI/SNF ChIP-Seq experiments for this comparison. Because our ChIP-Seq data covers the entire genome, we began by restricting our enriched SWI/SNF regions only to those that occur in the ENCODE pilot regions. We further refined our ChIP-Seq data set by discarding any SWI/SNF regions that occur in a region of the tiling array for which a signal level of 0 was observed via ChIP-chip. Once our SWI/SNF, lamin A/C, and lamin B lists were limited to those regions that could be queried by both platforms, we intersected the remaining lamin regions and the SWI/SNF regions using the same method that generated the all features table for enhancers, Pol II, and other elements, as described above. Similar procedures were followed for intersections with DNA replication origins identified in the ENCODE regions using tiling arrays [55]. 
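The mappability filter described above (discard array-derived regions that lie entirely inside sequence with less than 25% mappability) can be sketched as below. The window tuples, function names and half-open coordinates are assumptions for illustration; the sketch also assumes that the 100 bp windows are supplied in sorted order so that adjacent low-mappability windows can be joined into blocks.

```python
def low_mappability_blocks(windows, cutoff=0.25):
    """Join consecutive windows whose mappability is below `cutoff`.

    windows: sorted (chrom, start, end, mappability) tuples, e.g. 100 bp bins.
    Returns a list of (chrom, start, end) blocks.
    """
    blocks = []
    for chrom, start, end, score in windows:
        if score >= cutoff:
            continue
        if blocks and blocks[-1][0] == chrom and blocks[-1][2] == start:
            blocks[-1][2] = end          # extend the adjacent block
        else:
            blocks.append([chrom, start, end])
    return [tuple(b) for b in blocks]


def entirely_within(region, blocks):
    """True if the region lies completely inside any one block."""
    chrom, start, end = region
    return any(c == chrom and s <= start and end <= e for c, s, e in blocks)


def keep_queryable(regions, blocks):
    """Drop regions that ChIP-Seq could not be expected to detect."""
    return [r for r in regions if not entirely_within(r, blocks)]
```

Regions that only partly overlap a low-mappability block are kept, matching the "entirely contained" wording in the text.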
Evaluating enrichment of SWI/SNF components with respect to other genomic features To determine whether SWI/SNF components, core regions, and union regions are enriched for factors such as enhancers, small RNAs, lamin A/C and B, CTCF sites, Pol II regions, Pol III sites, 5′ ends and DNA replication origins, we used the genome structure correction test (GSC). This test determines the significance of observations where there “exists a complex dependency structure between observations” and was specifically designed for large-scale genomic studies [27]. Given two lists of genomic regions to compare and a list of coordinates defining the overall sample space (i.e. the length of each chromosome), a p-value for the significance of the overlap of the two lists is calculated and we report this value where noted. Data deposition All data produced for this study can be accessed through GEO and accession numbers for individual series are provided in the relevant sections. Alternatively, data from the lamin ChIP-chip experiments and the Ini1, Brg1, BAF155, and BAF170 ChIP-Seq experiments can be accessed through GEO using the SuperSeries accession number GSE24398. Supporting Information Figure S1 SWI/SNF signals and target regions in the context of interferon receptor genes on chromosome 21. The coordinates shown are in hg18 and all regions were identified in HeLa cells as detailed in Table S1 and Materials and Methods. The vertical axis for each signal track is the count of the number of overlapping DNA fragments at each nucleotide position and is scaled from 0 to 40 for each track. Panel A displays a ∼370 kb region on chromosome 21 containing genes encoding cytokine receptors. Panel B displays a ∼20 kb region at the edge of a H3K27me3 domain. Panels C and D each display ∼6 kb regions around the 5′ ends of expressed genes. (EPS) Figure S2 SWI/SNF ChIP-Seq targets and interacting proteins superimposed on KEGG ‘Pathways in Cancer’. 
The KEGG ‘Pathways in Cancer’ network was among those pathways overrepresented using our 49,555 SWI/SNF high-confidence union regions (Benjamini adjusted p-value<4.7×10⁻⁸). SWI/SNF ChIP-Seq targets are highlighted in yellow and SWI/SNF co-purifying proteins detected in our IP-mass spectrometry experiments are highlighted in blue. SWI/SNF co-purifying proteins reported in other studies (Table S10) are highlighted in red. Proteins or genes not detected in any known SWI/SNF studies are gray. Starred annotations were detected in both ChIP-Seq and protein interaction studies. (EPS) Figure S3 SWI/SNF ChIP-Seq targets and interacting proteins superimposed on KEGG ‘Cell Cycle’. The KEGG ‘Cell Cycle’ network was among those pathways overrepresented using the 49,555 SWI/SNF high-confidence union regions (Benjamini adjusted p-value<3.7×10⁻⁸). SWI/SNF ChIP-Seq targets are highlighted in yellow and SWI/SNF co-purifying proteins detected in our IP-mass spectrometry experiments are highlighted in blue. SWI/SNF co-purifying proteins reported in other studies (Table S10) are highlighted in red. Starred annotations were detected in both ChIP-Seq and protein interaction studies. (EPS) Table S1 Data sources. (DOC) Table S2 Features associated with the full union list of 69,658 SWI/SNF ChIP-Seq targets. (TXT) Table S3 Features associated with the high-confidence union list of 49,555 SWI/SNF ChIP-Seq targets. (XLS) Table S4 Combinations of features associated with the 49,555 SWI/SNF ChIP-Seq targets from the union list. For each feature, 1 = present and 0 = absent. The order of features is: 1) Ini1 2) Brg1 3) BAF155 4) BAF170 5) RNA Pol II 6) CTCF 7) Enhancers 8) Five-prime ends 9) Five-prime ends, expressed 10) Five-prime ends, non-expressed. (TXT) 
Table S5 Features associated with the 23,320 RNA Polymerase II ChIP-Seq targets. (XLS) Table S6 Features associated with the 478 RNA Polymerase III ChIP-Seq targets. (XLS) Table S7 Number of transcripts associated with SWI/SNF sub-complexes. (XLS) Table S8 Complete lists of Ensembl gene IDs and overrepresented pathways associated with the 49,555 SWI/SNF ChIP-Seq union regions. (XLS) Table S9 Complete lists of Ensembl gene IDs, peptides and overrepresented pathways resulting from mass spectrometry of immunoprecipitated proteins. (XLS) Table S10 Data sources and brief description of interactions displayed in Figure 7. (XLS) Table S11 Chromosomal coordinates of the 49,458 Ini1 regions. (XLS) Table S12 Chromosomal coordinates of the 46,412 BAF155 regions. (XLS) Table S13 Chromosomal coordinates of the 30,136 BAF170 regions. (XLS) Table S14 Chromosomal coordinates of the 12,725 Brg1 regions. (XLS) Table S15 Chromosomal coordinates of the 49,555 SWI/SNF union regions. (XLS) Table S16 Chromosomal coordinates of the 23,320 RNA Polymerase II regions. (XLS) Table S17 Chromosomal coordinates of the 1,770 lamin A/C regions. (XLS) Table S18 Chromosomal coordinates of the 1,270 lamin B regions. (XLS) Table S19 Genes encoding SWI/SNF subunits and the chromosomal coordinates of any of the 49,555 SWI/SNF ChIP-Seq union regions that occur in these genes. (XLS)

              Author and article information

              Journal
              J Biol Chem
              The Journal of Biological Chemistry
              American Society for Biochemistry and Molecular Biology (9650 Rockville Pike, Bethesda, MD 20814, U.S.A. )
              0021-9258
              1083-351X
              7 September 2012
              5 September 2012
              Volume: 287
              Issue: 37
              Pages: 30885-30887
              Affiliations
              [1]From the Department of Biochemistry and Molecular Biology, University of Southern California, Los Angeles, California 90089
              Author notes
              [1] To whom correspondence should be addressed. E-mail: pfarnham@usc.edu.
              Article
              R112.365940
              10.1074/jbc.R112.365940
              3438920
              22451669
              a62381c7-5a7c-4980-9c58-64cd54a53f2d
              © 2012 by The American Society for Biochemistry and Molecular Biology, Inc.

              Author's Choice—Final version full access.

              Creative Commons Attribution Non-Commercial License applies to Author Choice Articles

              History
              Categories
              Minireviews

              Biochemistry
              genomics,gene transcription,chromatin immunoprecipitation (chip),chromatin regulation,transcription factors,chromatin histone modification
