      Comprehensive analysis of spliceosome genes and their mutants across 27 cancer types in 9070 patients: clinically relevant outcomes in the context of 3P medicine


          Most cited references (49)


          Coordinated Alterations in RNA Splicing and Epigenetic Regulation Drive Leukemogenesis

          Transcription and pre-mRNA splicing are key steps in the control of gene expression and mutations in genes regulating each of these processes are common in leukemia 1,2 . Despite the frequent overlap of mutations affecting epigenetic regulation and splicing in leukemia, how these processes influence one another to promote leukemogenesis is not understood and functional evidence that mutations in RNA splicing factors initiate leukemia does not exist. Here through analyses of transcriptomes from 982 acute myeloid leukemia (AML) patients, we identified frequent overlap of mutations in IDH2 and SRSF2 which together promote leukemogenesis through coordinated effects on the epigenome and RNA splicing. While mutations in either IDH2 or SRSF2 imparted distinct splicing changes, co-expression of mutant IDH2 altered the splicing effects of mutant SRSF2 and resulted in more profound splicing changes than either mutation alone. Consistent with this, co-expression of mutant IDH2 and SRSF2 resulted in lethal myelodysplasia with proliferative features in vivo and enhanced self-renewal in a manner not observed with either mutation alone. IDH2/SRSF2 double-mutant cells exhibited aberrant splicing and reduced expression of INTS3, a member of the Integrator complex 3 , concordant with increased stalling of RNA polymerase II (RNAPII). Aberrant INTS3 splicing contributed to leukemogenesis in concert with mutant IDH2 and was dependent on mutant SRSF2 binding to cis elements in INTS3 mRNA and increased DNA methylation of INTS3. These data identify a pathogenic cross talk between altered epigenetic state and splicing in a subset of leukemias, provide functional evidence that mutations in splicing factors drive myeloid malignancy development, and uncover spliceosomal changes as a novel mediator of IDH2-mutant leukemogenesis. Mutations in RNA splicing factors are common in cancer and impart specific changes to splicing that are identifiable by mRNA sequencing (RNA-seq) 4–6 . 
Somatic mutations involving the Proline 95 residue of the spliceosome component SRSF2 are among the most recurrent in myeloid malignancies and alter SRSF2’s binding to RNA in a sequence-specific manner 6,7 . We analyzed RNA-seq data from 179 AML patients from The Cancer Genome Atlas (TCGA) 1 to evaluate for spliceosomal alterations. Aberrant splicing events characteristic of SRSF2 mutations, including EZH2 6,7 poison exon inclusion, were observed in 19 patients (P = 1.6e-12; Fisher’s exact test; Fig. 1a, Extended Data Fig. 1a, b, and Supplementary Table 1). Although only one SRSF2 mutant patient was reported in the TCGA AML publication 1 , mutational analysis of RNA-seq data identified SRSF2 hotspot mutations in each of these 19 patients (19/178 = 11%). Therefore, these data retrospectively identify SRSF2 as amongst the most commonly mutated genes in the TCGA AML cohort. Interestingly, 47% of SRSF2 mutant patients had a co-existing IDH2 mutation and conversely, 56% of IDH2 mutant patients had a co-existing SRSF2 mutation (P = 1.7e-06; Fisher’s exact test; Fig. 1b, Extended Data Fig. 1c, d, and Supplementary Table 2). Similar results were seen in RNA-seq data from 498 and 263 AML patients from the Beat-AML 8 and Leucegene 9 studies, respectively (Fig. 1c, d, Extended Data Fig. 1e–j, and Supplementary Table 2). Across these datasets variant allele frequencies of IDH2 and SRSF2 mutations were high and significantly correlated (Extended Data Fig. 1k), suggesting their common placement as early events in AML. Beyond these datasets, combined IDH2 and SRSF2 mutations were identified in 5.2 – 6.2% of 1,643 unselected consecutive AML patients in clinical practice (Supplementary Table 3). Although not statistically significant, IDH2/SRSF2 double-mutant AML cases had the shortest overall survival across the four studied genotypes (Extended Data Fig. 2a). 
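The co-occurrence of IDH2 and SRSF2 mutations reported above is assessed with Fisher's exact test on a 2×2 table. The following is a minimal sketch of that test using only the Python standard library. The cell counts are an assumption: they are back-calculated to be roughly consistent with the quoted percentages (47% and 56%) in a cohort of 179 patients and are not values reported in the text, so the resulting p-value need not match the published one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    # The small relative tolerance guards against floating-point ties.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))


# Hypothetical counts (assumption, not taken from the source):
both_mutant = 9    # IDH2 mutant and SRSF2 mutant
srsf2_only = 10    # SRSF2 mutant, IDH2 wild-type
idh2_only = 7      # IDH2 mutant, SRSF2 wild-type
neither = 153      # wild-type for both genes

p = fisher_exact_two_sided(both_mutant, srsf2_only, idh2_only, neither)
print(f"two-sided Fisher's exact p = {p:.3g}")
```

The two-sided p-value here sums all tables that are at most as probable as the observed table, which is one common convention; other definitions of the two-sided test exist and can give slightly different values.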
While IDH2/SRSF2 double-mutant patients were mostly intermediate cytogenetic risk, their prognosis was comparable to those with adverse cytogenetic risk (Extended Data Fig. 2b). IDH2/SRSF2 double-mutant AML patients were also significantly older than IDH2 single-mutant or IDH2/SRSF2 WT patients (Extended Data Fig. 2b; clinical and genetic features are summarized in Extended Data Fig. 2 and Supplementary Table 3). Mutations in IDH2 confer neomorphic enzymatic activity which results in the generation of 2-hydroxyglutarate (2HG) 10 . 2HG production, in turn, induces DNA hypermethylation via the competitive inhibition of αKG-dependent enzymes TET1–3. Unsupervised hierarchical clustering of DNA methylation data from the TCGA AML cohort revealed that IDH2/SRSF2 double-mutant AML cases form a distinct cluster with higher DNA methylation than IDH2 single-mutant AML (Extended Data Fig. 1l–o). Collectively, these data identify IDH2/SRSF2 double-mutant leukemia as a recurrent genetically defined AML subset with a distinct epigenomic profile. We next sought to understand the basis for co-enrichment of IDH2 and SRSF2 mutations. Although mutations in splicing factors are frequent in leukemias, to date there is no functional evidence that they can transform cells in vivo. Overexpression of IDH2 R140Q or IDH2 R172K mutants in bone marrow (BM) cells from Vav-cre Srsf2 P95H/+ or Vav-cre Srsf2 +/+ mice revealed a clear collaborative effect between mutant IDH2 and Srsf2 (Extended Data Fig. 3a). Four weeks post-transplantation, the peripheral blood (PB) of recipient mice transplanted with IDH2/Srsf2 double-mutant cells had a substantially greater percentage of GFP+ cells than in an Srsf2 WT background (Fig. 2a and Extended Data Fig. 3b, c). Moreover, these mice exhibited significant myeloid skewing, macrocytic anemia, and thrombocytopenia of greater magnitude than seen with mutant IDH2 (Extended Data Fig. 3d–h). 
IDH2/Srsf2 double-mutants showed no difference in plasma 2HG levels compared with IDH2 single-mutants (Extended Data Fig. 3i, j). Serial replating of BM cells from leukemic mice revealed markedly enhanced clonogenicity of IDH2/Srsf2 double-mutant cells compared with other genotypes, exhibiting a blastic morphology and immature immunophenotype (Extended Data Fig. 3k–m). Consistent with these in vitro results, mice transplanted with IDH2/Srsf2 double-mutant cells developed a lethal myelodysplastic syndrome (MDS) characterized by pancytopenia, macrocytosis, myeloid dysplasia, expansion of immature BM progenitors, and splenomegaly (Fig. 2b and Extended Data Fig. 3n–w). At the same time, IDH2/Srsf2 double-mutant cells were serially transplantable in sublethally irradiated recipients (Fig. 2c and Extended Data Fig. 3x), a feature not present in single-mutant controls. IDH2 single-mutant controls, in contrast, developed leukocytosis, myeloid skewing without clear dysplasia, and less pronounced splenomegaly, while Srsf2 single-mutant cells had impaired repopulation capacity. These results provide the first evidence that spliceosomal gene mutations can promote leukemogenesis in vivo. We next sought to verify the effects of mutant Idh2 and Srsf2 using models in which both mutants were expressed from endogenous loci. Mx1-cre Srsf2 P95H/+ mice were crossed to Idh2 R140Q/+ mice to generate control, Idh2 R140Q single-mutant, Srsf2 P95H single-mutant, and Idh2/Srsf2 double knock-in (DKI) mice (Extended Data Fig. 4a). As expected, 2HG levels in PB mononuclear cells were increased and 5-hydroxymethylcytosine levels in cKit+ BM cells were decreased from Idh2 single-mutant and DKI primary mice compared to controls (Extended Data Fig. 4b, c). We next performed non-competitive transplantation, wherein each mutation was induced, alone or together, following stable engraftment in recipients. 
DKI mice showed stable engraftment over time, similar to Idh2 single-mutant or control mice (Extended Data Fig. 4d). However, DKI mice developed a lethal MDS with proliferative features and significantly shorter survival compared to controls (Fig. 2d). In competitive transplantation, expression of mutant Idh2 R140Q rescued the impaired self-renewal capacity of Srsf2 single-mutant cells (Fig. 2e). These observations were supported by increased hematopoietic stem/progenitor cells in DKI mice compared to Srsf2 single-mutant or control mice in primary and serial transplantation (Extended Data Fig. 4e–i). These results confirm cooperativity between mutant IDH2 and SRSF2 in promoting leukemogenesis in vivo. Given prior data identifying 2HG-mediated inhibition of TET2 as a mechanism of IDH2 mutant leukemogenesis 11 , we also evaluated if loss of TET2 might promote transformation of SRSF2 mutant cells. However, deletion of Tet2 in an Srsf2 mutant background was insufficient to rescue the impaired self-renewal capacity of Srsf2 single-mutant cells (Extended Data Fig. 4j–n). Similarly, restoration of TET2 function did not affect the self-renewal capacity of Idh2/Srsf2 double-mutant cells in vivo (Extended Data Fig. 4o–r). These data indicated that the collaborative effects of mutant Idh2 and Srsf2 are not solely dependent on TET2. Consistent with this, combined Tet2/Tet3 silencing partially rescued the impaired replating capacity of Srsf2 mutant cells in vitro (Extended Data Fig. 4r, s) and the impaired self-renewal of Srsf2 mutant cells in vivo (Extended Data Fig. 4t–v). Since FTO and ALKBH5, which play a role in RNA processing as N6-methyladenosine (m6A) RNA demethylases 12,13 , are also αKG-dependent, we investigated the effects of their loss on cooperativity with mutant Srsf2. However, collaborative effects were not observed between loss of Fto or Alkbh5 and Srsf2 P95H (Extended Data Fig. 4w, x). 
To understand the basis for cooperation between IDH2 and SRSF2 mutations, we next analyzed RNA-seq from the TCGA (n = 179 patients), Beat-AML (n = 498 patients), and Leucegene (n = 263 patients) cohorts in addition to two previously unpublished RNA-seq datasets targeting defined IDH2/SRSF2 genotype combinations (n = 42 patients) and the knock-in mouse models. This revealed that IDH2/SRSF2 double-mutant cells consistently harbor more aberrant splicing events than SRSF2 single-mutant cells. Moreover, IDH2 mutations alone were associated with a small but reproducible change in RNA splicing (Fig. 3a, b, Extended Data Fig. 5a–g, and Supplementary Table 4–20). In contrast, TET2/SRSF2 co-mutant AML had fewer changes in splicing than IDH2/SRSF2 co-mutant AML (Extended Data Fig. 5h–m and Supplementary Table 21, 22). The majority of splicing changes associated with SRSF2 mutations involved altered cassette exon splicing consistent with SRSF2 mutations promoting inclusion of C-rich RNA sequences 6,7 . The sequence specificity of mutant SRSF2 on splicing was not influenced by concomitant IDH2 mutations (Extended Data Fig. 5n–q) and a number of these events were validated by RT-PCR of primary AML samples from an independent cohort (Fig. 3c). Among the mis-splicing events in IDH2/SRSF2 double-mutant AML was a complex event in INTS3 involving intron retention (IR) across two contiguous introns and skipping of the intervening exon (Fig. 3b, c, Extended Data Fig. 5e–f, 5r–y, 6a–c). Aberrant INTS3 splicing was demonstrated in isogenic and non-isogenic leukemia cells with or without IDH2 and/or SRSF2 mutations (Fig. 3d and Extended Data Fig. 6d–f), and INTS3 transcripts with both IR and exon skipping resulted in nonsense-mediated decay (Extended Data Fig. 6g–j). Consistent with these observations, INTS3 protein expression was reduced in SRSF2 mutant cells (Fig. 3d, Extended Data Fig. 6e, f, k–n, and Supplementary Table 23). 
Moreover, silencing of INTS3 was associated with reduced protein levels of additional Integrator subunits in SRSF2 mutant AML compared to SRSF2 WT AML. Consistent with these observations, steady-state protein expression levels of Integrator subunits were correlated with one another (Extended Data Fig. 6o). Overall, these data indicate that aberrant splicing and consequent loss of INTS3 was a consistent feature of IDH2/SRSF2 double-mutant cells and associated with reduced expression of multiple Integrator subunits. We next sought to understand how IDH2 mutations, which impact the epigenome, might influence splicing catalysis. Splice site choice is influenced by cis regulatory elements engaged by RNA binding proteins as well as RNAPII elongation, which itself is regulated by DNA cytosine methylation and histone modifications 14 . We therefore generated a controlled system to dissect the contribution of RNA binding elements and DNA methylation to INTS3 IR. We constructed a minigene of INTS3 spanning exons 4 and 5 and the intervening intron 4 (Extended Data Fig. 7a–c). Transfection of this minigene into leukemia cells harboring combinations of IDH2/SRSF2 mutations revealed that INTS3 intron 4 retention is driven by mutant SRSF2 and further enhanced in the IDH2/SRSF2 double-mutant setting (Extended Data Fig. 7d). SRSF2 normally binds C- or G-rich motif sequences in RNA equally well to promote splicing 15 . Leukemia-associated mutations in SRSF2 promote its avidity for C-rich sequences while reducing the ability to recognize G-rich sequences 6,7 . Interestingly, exon 4 of INTS3 harbors the greatest number of predicted SRSF2 binding motifs over the entire INTS3 genomic region (Extended Data Fig. 7c). We evaluated the role of putative SRSF2 motifs in regulating INTS3 splicing by mutating all six CCNG motifs in exon 4 to G-rich sequences. In this G-rich version of the minigene, IR no longer occurred (INTS3-GGNG; Extended Data Fig. 7e). 
Conversely, when all G-rich SRSF2 motifs were converted to C-rich sequences (INTS3-CCNG), IR became evident (Extended Data Fig. 7f). These results confirmed the sequence-specific activity of mutant SRSF2 in INTS3 IR and identified a role for mutant IDH2 in regulating splicing. Given that IDH2 mutations promote increased DNA methylation and that DNA methylation can impact splicing 14 , we generated genome-wide maps of DNA cytosine methylation from AML patients across four genotypes (Supplementary Table 23). This revealed that differentially spliced events in IDH2 single-mutant as well as IDH2/SRSF2 double-mutant AML (compared to IDH2/SRSF2 WT and SRSF2 single-mutant AML) harbored significant hypermethylation of DNA. Thus regions of differential DNA hypermethylation significantly overlapped with regions of differential RNA splicing (Fig. 3e and Extended Data Fig. 7j). The above results suggest a strong link between increased DNA methylation mediated by mutant IDH2 and altered RNA splicing by mutant SRSF2. To evaluate this further, we next examined DNA methylation levels around endogenous INTS3 exon 4–6 by targeted bisulfite sequencing. This revealed increased DNA methylation at all CpG dinucleotides in this region in IDH2/SRSF2 double-mutant cells compared to control or single-mutant cells (Fig. 3f and Extended Data Fig. 7k). A functional role of DNA methylation at these sites was verified by evaluating splicing in versions of the INTS3 minigene in which each CG dinucleotide was converted to an AT to prevent cytosine methylation. In these CG to AT versions of the minigene, IDH2 mutations no longer promoted mutant SRSF2-mediated IR (Extended Data Fig. 7g–i). As further confirmation of the influence of mutant IDH2 on INTS3 splicing, cell-permeable 2HG increased INTS3 IR while treatment of IDH2/SRSF2 double-mutant cells with the DNA methyltransferase inhibitor 5-aza-2’-deoxycytidine (5-AZA-CdR) inhibited INTS3 IR (Extended Data Fig. 7l, m). 
Given that changes in epigenetic state may impact splicing by influencing RNAPII stalling 14,16 , we evaluated the abundance of RNAPII through ChIP-seq in isogenic SRSF2 WT and SRSF2 P95H cells as well as the primary AML patient samples. This revealed increased promoter-proximal transcriptional pausing and decreased RNAPII occupancy over gene bodies in SRSF2 mutant cells, which was further enhanced in IDH2/SRSF2 double-mutant cells (Fig. 4a, b, Extended Data Fig. 7n–q, and Supplementary Table 23). Transcriptional pausing was also evident at INTS5 and INTS14 in SRSF2 mutant cells (Extended Data Fig. 7r, s), which, in combination with aberrant splicing of several Integrator subunits (Supplementary Table 24), suggested impaired function of the entire Integrator complex in SRSF2 mutant cells. Similar to DNA cytosine methylation levels, RNAPII was more abundant over differentially spliced regions between SRSF2 single-mutant AML and SRSF2 WT AML, and further enhanced over differentially spliced regions between SRSF2 single-mutant and IDH2/SRSF2 double-mutant AML (Fig. 4c and Extended Data Fig. 7t). The above data provide further links between increased DNA cytosine methylation and RNAPII stalling with altered RNA splicing in IDH2/SRSF2 double-mutant AML. To further evaluate this model, we performed anti-RNAPII ChIP across 4,766 bp of INTS3 locus in isogenic leukemia cells (Fig. 3f). This revealed striking accumulation of RNAPII across this locus in IDH2/SRSF2 double-mutant cells. Treatment with 5-AZA-CdR significantly reduced RNAPII stalling, which was coupled with decreased aberrant INTS3 splicing (Extended Data Fig. 7k–m). These data reveal that IDH2 and SRSF2 mutations coordinately dysregulate splicing through alterations in RNAPII stalling in addition to aberrant sequence recognition of cis elements in RNA. INTS3 encodes a component of the Integrator complex that participates in small nuclear RNA (snRNA) processing 3 in addition to RNAPII pause-release 17 . 
Consistent with this, SRSF2 single-mutant cells had altered snRNA cleavage similar to those seen with direct INTS3 downregulation, which was exacerbated in IDH2/SRSF2 double-mutant cells (Extended Data Fig. 8a–h). Attenuation of INTS3 expression in SRSF2 mutant cells caused a blockade of myeloid differentiation, an effect further enhanced in an IDH2 mutant background (Extended Data Fig. 8i–n). Importantly, direct Ints3 downregulation in the Idh2 R140Q/+ background resulted in enhanced clonogenic capacity of cells with an immature morphology and immunophenotype (Fig. 4d and Extended Data Fig. 8o–r) and promoted clonal dominance of Idh2 mutant cells (Extended Data Fig. 9a–d). Moreover, mice transplanted with Idh2 R140Q/+/anti-Ints3 shRNA treated BM cells exhibited myeloid skewing, anemia, and thrombocytopenia (Extended Data Fig. 9e–g), and developed a lethal MDS with proliferative features, phenotypes resembling those seen in IDH2/Srsf2 double-mutant mice (Fig. 4e and Extended Data Fig. 9g, h). The defects in snRNA processing in SRSF2 single-mutant and IDH2/SRSF2 double-mutant cells were partially rescued by INTS3 cDNA expression (Extended Data Fig. 8s–x). In addition, restoration of INTS3 expression released SRSF2 single-mutant and IDH2/SRSF2 double-mutant HL-60 cells from differentiation block (Extended Data Fig. 8y, z). Xenografts of IDH2/SRSF2 double-mutant HL-60 cells demonstrated that forced expression of INTS3 induced myeloid differentiation and slowed leukemia progression in vivo (Extended Data Fig. 9j–s). Collectively, these data suggest that INTS3 loss due to aberrant splicing by mutant IDH2 and SRSF2 contributes to leukemogenesis. Although INTS3 loss resulted in measurable changes in snRNA processing, the degree of snRNA mis-processing did not have a significant impact on splicing as determined by RNA-seq of IDH2 R140Q mutant HL-60 cells with INTS3 silencing. 
In contrast, INTS3 depletion in these cells significantly affected transcriptional programs associated with myeloid differentiation, multiple oncogenic signaling pathways, RNAPII elongation-linked transcription, and DNA repair (Extended Data Fig. 10a–d and Supplementary Table 25). This latter association of INTS3 loss with DNA repair is potentially consistent with previous reports 18,19 . These data uncover an important role for RNA splicing alterations in IDH2 mutant tumorigenesis and identify perturbations in Integrator as a novel driver of transformation of IDH2 and SRSF2 mutant cells. However, INTS3 is not known to be recurrently affected by coding-region alterations in leukemias. We therefore evaluated INTS3 splicing across 32 additional cancer types as well as normal blood cells to evaluate if aberrant INTS3 splicing might be a common mechanism in AML. This revealed that while INTS3 mis-splicing is most evident in IDH2/SRSF2 mutant AML, INTS3 aberrant splicing is also prevalent across other molecular subtypes of AML but not present in blood cells from healthy subjects or RNA-seq data from > 7,000 samples from other cancer types (Fig. 4f and Extended Data Fig. 10e, f). To further evaluate the effects of enforced INTS3 expression in splicing WT myeloid leukemia, we utilized MLL-AF9/Nras G12D murine leukemia (RN2) cells. INTS3 overexpression reduced colony-forming capacity of RN2 cells (Extended Data Fig. 10g, h) and enhanced differentiation of RN2 cells, resulting in decelerated leukemia progression in vivo (Fig. 4g and Extended Data Fig. 10i–s). These data highlight a role for INTS3 loss in broad genetic subtypes of AML. Further efforts to determine how Integrator loss promotes leukemogenesis, and other non-mutational mechanisms mediating INTS3 aberrant splicing, will be critical. To this end, it is important to note that prior work has identified that both Integrator 17,20 as well as SRSF2 21 play a direct role in modulating transcriptional pause-release. 
The striking accumulation of RNAPII at certain mis-spliced loci here is consistent with recent data suggesting that mutant SRSF2 is defective in promoting RNAPII pause-release 22 . Identifying how aberrant splicing mediated by mutant SRSF2 is influenced by altered RNAPII pause release may therefore be enlightening. In addition to modifying splicing in SRSF2 mutant cells, IDH2 mutations themselves were associated with reproducible changes in splicing in hematopoietic cells. Intriguingly, there is a strong correlation between aberrant splicing in IDH2 and IDH1 mutant low-grade gliomas (LGG) (P = 2.2e-16 (binomial proportion test), Extended Data Fig. 10t–w, and Supplementary Table 26–28). A significant number of splicing events dysregulated in IDH2 mutant AML from the TCGA and Leucegene cohorts were differentially spliced in IDH2 mutant versus IDH1/2 WT LGG (P = 1.8e-09 and P = 1.3e-08, respectively; binomial proportion test). These data suggest that IDH1/2 mutations impart a consistent effect on splicing regardless of tumor type. Finally, these results have important translational implications given the substantial efforts to pharmacologically inhibit mutant IDH1/2 as well as mutant splicing factors 23,24 . The frequent co-existence of IDH2 and SRSF2 mutations underscores the enormous therapeutic potential for modulation of splicing in the ~50% of IDH2 mutant leukemia patients who also harbor a spliceosomal gene mutation.

METHODS

Data reporting

The number of mice in each experiment was chosen to provide 90% statistical power with a 5% error level. Otherwise, no statistical methods were used to predetermine sample size. The experiments were not randomized. The investigators were not blinded to allocation during experiments and outcome assessment.

Animals

All animals were housed at Memorial Sloan Kettering Cancer Center (MSK). 
All animal procedures were completed in accordance with the Guidelines for the Care and Use of Laboratory Animals and were approved by the Institutional Animal Care and Use Committees at MSK. 6–8-week-old female CD45.1 C57BL/6 mice were purchased from The Jackson Laboratory (Stock No: 002014). Male and female CD45.2 Srsf2 P95H/+ conditional knock-in mice, Idh2 R140Q/+ conditional knock-in mice, and Tet2 conditional knockout mice (all on C57BL/6 background) were also analyzed and used as bone marrow donors (generation of these mice was as described 6,26,27 ). For BM transplantation assays with IDH2 overexpression, Srsf2 P95H/+ and littermate control mice were crossed to Vav-cre transgenic mice 28 . CBC analysis was performed on PB collected from submandibular bleeding, using a Procyte Dx Hematology Analyzer (IDEXX Veterinary Diagnostics). For all mouse experiments, the mice were monitored closely for signs of disease or morbidity daily and were sacrificed for visible tumor formation at tumor volume > 1 cm³, failure to thrive, weight loss > 10% total body weight, open skin lesions, bleeding, or any signs of infection. In none of the experiments were these limits exceeded.

Bone marrow (BM) transplantation assays

Freshly dissected femurs and tibias were isolated from Mx1-cre, Mx1-cre/Idh2 R140Q/+, Mx1-cre Srsf2 P95H/+, Mx1-cre Idh2 R140Q/+ Srsf2 P95H/+, Mx1-cre Tet2 fl/fl, or Mx1-cre Tet2 fl/fl Srsf2 P95H/+ CD45.2+ mice. BM was flushed with a 3-cc insulin syringe into cold PBS supplemented with 2% bovine serum albumin to generate single-cell suspensions. BM cells were pelleted by centrifugation at 1,500 rpm for 4 min and red blood cells (RBCs) were lysed in ammonium chloride-potassium bicarbonate lysis (ACK) buffer for 3 min on ice. After centrifugation, cells were resuspended in PBS/2% BSA, passed through a 40 μm cell strainer, and counted. 
For competitive transplantation experiments, 0.5 × 10⁶ BM cells from Mx1-cre, Mx1-cre Idh2 R140Q/+, Mx1-cre Srsf2 P95H/+, Mx1-cre Idh2 R140Q/+ Srsf2 P95H/+, Mx1-cre Tet2 fl/fl, or Mx1-cre Tet2 fl/fl Srsf2 P95H/+ CD45.2+ mice were mixed with 0.5 × 10⁶ wild-type (WT) CD45.1+ BM cells and transplanted via tail-vein injection into 8-week-old lethally irradiated (900 cGy) CD45.1+ recipient mice. The CD45.1+:CD45.2+ ratio was confirmed to be approximately 1:1 by flow cytometry analysis pre-transplant. To activate the conditional alleles, mice were treated with 3 doses of polyinosinic:polycytidylic acid (pIpC; 12 mg/kg/day; GE Healthcare) every second day via intra-peritoneal injection. Peripheral blood chimerism was assessed every 4 weeks by flow cytometry. For noncompetitive transplantation experiments, 1 × 10⁶ total BM cells from Mx1-cre, Mx1-cre Idh2 R140Q/+, Mx1-cre Srsf2 P95H/+, Mx1-cre Idh2 R140Q/+ Srsf2 P95H/+, Mx1-cre Tet2 fl/fl, or Mx1-cre Tet2 fl/fl Srsf2 P95H/+ CD45.2+ mice were injected into lethally irradiated (950 cGy) CD45.1+ recipient mice. Peripheral blood chimerism was assessed as described for competitive transplantation experiments. Additionally, at each bleed, whole-blood cell counts were measured on an automated blood analyzer. Animals that were lost due to pIpC toxicity were excluded from analysis.

Retroviral transduction and transplantation of primary hematopoietic cells

Vav-cre Srsf2 +/+ and Vav-cre Srsf2 P95H/+ mice were treated with a single dose of 5-fluorouracil (150 mg/kg) followed by BM harvest from the femurs, tibias and pelvic bones 5 days later. 
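The weight-based doses quoted above (pIpC at 12 mg/kg/day and 5-fluorouracil at 150 mg/kg) translate into per-animal amounts by simple arithmetic. The sketch below shows that conversion. The function names, the 20 g body weight, and the stock concentrations are hypothetical values chosen for illustration and do not come from the text.

```python
def dose_mg(dose_mg_per_kg, body_weight_g):
    """Absolute dose in mg for a weight-based dose and a body weight in grams."""
    if dose_mg_per_kg < 0 or body_weight_g <= 0:
        raise ValueError("dose must be non-negative and body weight positive")
    return dose_mg_per_kg * body_weight_g / 1000.0


def injection_volume_ul(amount_mg, stock_mg_per_ml):
    """Volume in microliters of a stock solution that contains amount_mg."""
    if stock_mg_per_ml <= 0:
        raise ValueError("stock concentration must be positive")
    return amount_mg / stock_mg_per_ml * 1000.0


# Hypothetical 20 g mouse and hypothetical stock concentrations (assumptions).
weight_g = 20.0
pipc_mg = dose_mg(12, weight_g)     # pIpC at 12 mg/kg, per dose
fu_mg = dose_mg(150, weight_g)      # 5-fluorouracil at 150 mg/kg, single dose

print(f"pIpC per dose: {pipc_mg:.2f} mg "
      f"({injection_volume_ul(pipc_mg, 1.0):.0f} uL of an assumed 1 mg/mL stock)")
print(f"5-FU single dose: {fu_mg:.2f} mg "
      f"({injection_volume_ul(fu_mg, 10.0):.0f} uL of an assumed 10 mg/mL stock)")
```

Keeping the body weight in grams matches how mice are usually weighed, and the conversion to kilograms happens in one place so the unit handling is easy to audit.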
RBCs were removed by ACK lysis buffer, and nucleated BM cells were transduced with viral supernatants containing MSCV-IDH2 WT/R140Q/R172K-IRES-GFP for 2 days in RPMI/20% FCS supplemented with mouse stem cell factor (mSCF, 25 ng/mL), mouse Interleukin-3 (mIL3, 10 ng/mL) and mIL6 (10 ng/mL), followed by injection of ~0.5 × 10⁶ cells per recipient mouse via tail vein injection into lethally irradiated (950 cGy) CD45.1+ mice. Transplantation of primary BM cells with TET2 catalytic domain cDNA and anti-Ints3 or Tet3 shRNAs was similarly performed. For secondary transplantation experiments, 8-week-old, lethally (900–950 cGy) or sub-lethally (450–700 cGy) irradiated C57BL/6 recipient mice were injected with 1 × 10⁶ cells from mice with MDS with proliferative features. IDH2 WT+Srsf2 WT and IDH2 WT+Srsf2 P95H mice were sacrificed at day 315 post-transplant to harvest BM for the serial transplantation. All cytokines were purchased from R&D Systems.

Flow cytometry analyses and antibodies

Surface-marker staining of hematopoietic cells was performed by first lysing cells with ACK lysis buffer and washing cells with ice-cold PBS. Cells were stained with antibodies in PBS/2% BSA for 30 minutes on ice. 
For hematopoietic stem/progenitor staining, cells were stained with the following antibodies: B220-APCCy7 (clone: RA3–6B2; purchased from BioLegend; catalog #: 103224; dilution: 1:200); B220-Bv711 (RA3–6B2; BioLegend; 103255; 1:200); CD3-PerCPCy5.5 (17A2; BioLegend; 100208; 1:200); CD3-APC (17A2; BioLegend; 100236; 1:200); CD3-APCCy7 (17A2; BioLegend; 100222; 1:200); Gr1-PECy7 (RB6–8C5; eBioscience; 25-5931-82; 1:500); CD11b-PE (M1/70; eBioscience; 12-0112-85; 1:500); CD11b-APCCy7 (M1/70; BioLegend; 101226; 1:200); CD11c-APCCy7 (N418; BioLegend; 117323; 1:200); NK1.1-APCCy7 (PK136; BioLegend; 108724; 1:200); Ter119-APCCy7 (BioLegend; 116223; 1:200); cKit-APC (2B8; BioLegend; 105812; 1:200); cKit-PerCPCy5.5 (2B8; BioLegend; 105824; 1:100); cKit-Bv605 (ACK2; BioLegend; 135120; 1:200); Sca1-PECy7 (D7; BioLegend; 108102; 1:200); CD16/CD32 (FcγRII/III)-Alexa700 (93; eBioscience; 56-0161-82; 1:200); CD34-FITC (RAM34; BD Biosciences; 553731; 1:200); CD45.1-FITC (A20; BioLegend; 110706; 1:200); CD45.1-PerCPCy5.5 (A20; BioLegend; 110728; 1:200); CD45.1-PE (A20; BioLegend; 110708; 1:200); CD45.1-APC (A20; BioLegend; 110714; 1:200); CD45.2-PE (104; eBioscience; 12-0454-82; 1:200); CD45.2-Alexa700 (104; BioLegend; 109822; 1:200); CD45.2-Bv605 (104; BioLegend; 109841; 1:200); CD48-Bv711 (HM48–1; BioLegend; 103439; 1:200); CD150 (9D1; eBioscience; 12-1501-82; 1:200). DAPI was used to exclude dead cells. For sorting human leukemia cells, cells were stained with a lineage cocktail including CD34-PerCP (8G12; BD Biosciences; 345803; 1:200); CD117-PECy7 (104D2; eBioscience; 25-1178-42; 1:200); CD33-APC (P67.6; BioLegend; 366606; 1:200); HLA-DR-FITC (L243; BioLegend; 307604; 1:200); CD13-PE (L138; BD Biosciences; 347406; 1:200); CD45-APC-H7 (2D1; BD Biosciences; 560178; 1:200). The composition of mature hematopoietic cell lineages in the BM, spleen and peripheral blood was assessed using a combination of CD11b, Gr1, B220, and CD3. 
For the hematopoietic stem and progenitor analysis, a combination of CD11b, CD11c, Gr1, B220, CD3, NK1.1, and Ter119 was stained as lineage-positive cells. All the FACS sorting was performed on FACS Aria, and analysis was performed on an LSRII or LSR Fortessa (BD Biosciences). For western blotting, DNA dot blot assays, and chromatin immunoprecipitation (ChIP) assays, the following antibodies were used: INTS1 (purchased from Bethyl laboratories; catalog #: A300–361A; dilution: 1:1,000), INTS2 (Abcam; ab74982; 1:1,000), INTS3 (Bethyl laboratories; A300–427A; 1:1,000, Abcam; ab70451; 1:1,000), INTS4 (Bethyl laboratories; A301–296A; 1:1,000), INTS5 (Abcam; ab74405; 1:1,000), INTS6 (Abcam; ab57069; 1:1,000), INTS7 (Bethyl laboratories; A300–271A; 1:1,000), INTS8 (Bethyl laboratories; A300–269A; 1:1,000), INTS9 (Bethyl laboratories; A300–412A; 1:1,000), INTS11 (Abcam; ab84719; 1:1,000), Flag-M2 (Sigma-Aldrich; F-1084; 1:1,000), Myc-tag (Cell Signaling; 2276S; 1:1,000), β-actin (Sigma-Aldrich; A-5441; 1:2,000), 5-Hydroxymethylcytosine (5hmC) (Active motif; 39769), RNA polymerase II CTD repeat YSPTSPS (phospho S2) (Abcam; ab5095), RNA polymerase II CTD repeat YSPTSPS (phospho S5) (Abcam; ab5408), and UPF1 (Abcam; ab109363; 1:1,000).

Minigene assay

We cloned an INTS3-WT minigene spanning exons 4 to 5 of human INTS3 into the pcDNA3.1(+) vector (Invitrogen) using the BamHI and XhoI sites. Artificial mutations were engineered into the INTS3-WT minigene using the QuikChange Site-Directed Mutagenesis Kit (Agilent) to generate INTS3-GGNG, INTS3-CCNG, INTS3-WT_CG(−), INTS3-GGNG_CG(−), and INTS3-CCNG_CG(−) minigenes, and the sequences of inserts were verified by Sanger sequencing. Plasmids (1 μg) were transfected using Lipofectamine™ LTX reagent with PLUS™ reagent (Invitrogen) including 0.2 μg of EGFP and 0.8 μg of INTS3 minigene, per well of a 6-well plate. 
Total RNA was extracted 48 hrs after transfection using TRIzol® reagent (Ambion), followed by DNase I treatment (Qiagen). cDNA was synthesized with an oligo-dT primer using ImProm-II™ reverse transcriptase (Promega). Radioactive PCR was done with 32P-α-dCTP, 1.25 units of AmpliTaq® (Invitrogen) and 26 cycles using primer pairs 5’-GCTTGGTACCGAGCTCGGATC-3’ (vector specific forward primer) and 5’-CAGTTCCCGTACCAACCACAC-3’ (reverse primer for INTS3 versions of minigene), or 5’-CAGTTCCATTACCAACCACAC-3’ (reverse primer for INTS3_CG(−) versions of minigene). Products were run on a 5% PAGE and the bands were quantified using a Typhoon FLA 7000 (GE Healthcare). EGFP was used as a control for transfection efficiency and exogenous EGFP was amplified using a vector specific forward primer and reverse primer on EGFP. EGFP products were loaded after we ran the INTS3 products for 20–30 min. Percentages of intron 4 retention were normalized against exogenous EGFP. Cell culture K562 (human chronic myeloid/erythroleukemia cell line) and HL-60 (human promyelocytic leukemia cell line) leukemia cells, K052 (human multilineage leukemia cell line) leukemia cells, TF1 (human erythroleukemia cell line) leukemia cells, MLL-AF9/Nras G12D murine leukemia (RN2) cells 29 , and Ba/F3 (murine pro-B cell line) cells were cultured in RPMI/10% FCS (Fetal Calf Serum, heat inactivated), RPMI/20% FCS, RPMI/10% FCS + human Granulocyte-Macrophage Colony-Stimulating Factor (GM-CSF, R&D Systems; 5 ng/mL), and RPMI/10% FCS + mIL3 (R&D Systems; 1 ng/mL), respectively. None of the cell lines above were listed in the data base of commonly misidentified cell lines maintained by ICLAC and NCBI Biosample. 
MSCV-IDH2 WT/R140Q/R172K-IRES-GFP, MSCV-3xFlag-INTS3-puro, MSCV-IRES-3xFlag-INTS3-mCherry, MSCV-IRES-TET2 catalytic domain cDNA-mCherry (“TET2CD”), and empty vectors of these constructs were used for retroviral overexpression studies and pRRLSIN.cPPT.PGK-mCherry.WPRE-SRSF2 WT/P95H constructs were used for lentiviral overexpression studies. TET2CD cDNA fragment with Myc tag was generated by PCR amplification using pCMVTNT-TET2CD 30 as a template and inserted in the BglII restriction sites of MSCV-IRES-mCherry. Retroviral supernatants were produced by transfecting 293 GPII cells with cDNA constructs and the packaging plasmid VSV.G using XtremeGene9 (Roche) or Polyethylenimine Hydrochloride (Polysciences, Inc.). Lentiviral supernatants were produced by similarly transfecting HEK293T cells with cDNA constructs and the packaging plasmid VSV.G and psPAX2. Virus supernatants were used for transduction in the presence of polybrene (5 μg/mL). GFP+mCherry+ double-positive HL-60 cells and mCherry+ positive K562 cells were FACS-sorted to obtain cells expressing WT/mutant IDH2 and SRSF2 in various combination. Isogenic HL-60 cells transduced with 3xFlag-tagged INTS3 or empty vector were obtained by puromycin selection (1 μg/mL). In order to let the cells fully establish epigenetic changes, they were analyzed after culture for more than 30 days. For in vitro colony-forming assays, single-cell suspension was prepared and 15,000 cells/1.5 mL were plated in triplicates in cytokine supplemented methylcellulose medium (MethoCult™ GF M3434; StemCell Technologies), and colonies were enumerated every week. For the colony-forming assays shown in Extended Data Fig. 3k, IDH2 WT+Srsf2 WT and IDH2 WT+Srsf2 P95H mice were sacrificed at day 315 post-transplant to harvest BM as controls. 
shRNA-mediated silencing shRNAs against human INTS3 (hINTS3), mouse Ints3 (mInts3), and mouse Tet3 (mTet3) were cloned into MLS-E-Cherry and/or MLS-E-GFP vector and those against human UPF1 (hUPF1), mouse Fto (mFto), and mouse Alkbh5 (mAlkbh5) were cloned into LT3GEPIR (pRRL) Lenti-GFP-Puro-Tet-ON all-in-one vector. The antisense sequences were: hINTS3–1: TTTTCGAAACATAACCAGGTTA; hINTS3–2: TAAATATTAGGTACAGAGGCTT; mInts3–1: TTAAAAACAATTTAAAACTCGA; mInts3–2: TACAAATGCAGACTGACAGGAA; mInts3–3: TTCTTATCCTGAAAGGAGGGGA; mInts3–4: TTTAAAACTCGATTATCTTTGC; mInts3–5: TAATCTTACAAGGTCCCGGCCA; mTet3–1: TTATTAAGACCAAACCTGGCTA; mTet3–2: TTAAATGAAGTGTAGGCCATGC; mTet3–3: TTAAATGGAATTTTAAAACTAC; mTet3–4: GCCTGTTAGGCAGATTGTTCT; mTet3–5: GCTCCAACGAGAAGCTATTTG; hUPF1–1: TGGTATTACAGTAAACCACGCA; hUPF1–2: TTGTGATTTAAACTCGTCACCA; mFto-1: TTCTAAGATATAATCCAAGGTG; mFto-2: TCTGGTTTCTGCTGTACTGGTA; mAlkbh5–1: TTGAACTGGAACTTGCAGCCGA; mAlkbh5–2: TTCATCAGCAGCATACCCACTG. mCherry+ or GFP+ cells with shRNAs against hINTS3, mInts3, or mTet3 were FACS-sorted. Semi-quantitative and quantitative RT-PCR and mRNA stability assay Total RNA was isolated using TRIzol reagent (Life Sciences) with standard RNA extraction protocol for snRNA quantification or using an RNeasy Mini or Micro kit (Qiagen) with DNase I treatment (Qiagen). For cDNA synthesis, total RNA was reverse transcribed with EcoDry kits (Random Hexamer or Oligo dT kits; Clontech), SuperScript (Invitrogen), RNA-Quant cDNA synthesis Kit (System Biosciences), or Verso cDNA Synthesis Kit (Thermo Fisher Scientific). 
Primers used in reverse-transcriptase polymerase chain reactions (RT-PCR) were: INTS3 – Fwd1: TGAGTCGTGATGGCATGAAT (exon 4), Rev1: TCTTCACCAGTTCCCGTACC (exon 5; for detection of intron 4 retention), Rev2: CTGCTCTTCAGGACCCACTC (exon 7; for detection of exon 5 skipping); NDUFAF6 – Fwd: GCCTGTGGCCATTGAACTAT, Rev: ACAATGCCTTGTGCTTTTCC; PHF21A – Fwd: TCCATGGCCTGGAACTTTAG, Rev: GCCAGGATGGTGTTCTTCAT; GLYR1 – Fwd: AGGTCAGGCCCAGTTCTCTT, Rev: TCACGTCTAAGCGTCCAGTG; GAPDH – Fwd: GCAAATTCCATGGCACCGTC, Rev: TCGCCCCACTTGATTTTGG. The PCR cycling conditions (33 cycles) chosen were as follows: (1) 30 s at 95 °C (2) 30 s at 60 °C (3) 30 s at 72 °C with a final 5-min extension at 72 °C. Reaction products were analyzed on 2% agarose gels. The bands were visualized by ethidium bromide staining. Quantitative real-time reverse transcriptase PCR (qPCR) analyses were performed on an Applied Biosystems QuantStudio 6 Flex cycler using SYBR Green Master Mix (Roche). The following primers were used: hINTS3 – Fwd2: CTGCAGGATACCTGCCGTA (exon 4), Rev3: CTTTCCCGTTCCTGACAGAG (intron 5; for specific quantification of transcript with intron 4 retention); Fwd1: TGAGTCGTGATGGCATGAAT (exon 4), Rev4: GGCTGTAACATCTCCACCTGA (exon 4–6; for specific quantification of transcript with exon 5 skipping); Fwd3: GGGCAATGCTGAGAGAGAAG (exon 14), Rev5: TGCCTCTGCATTGTCATAGC (exon 15); mInts3 – Fwd: GTGGCTGTTATTGACTCTGCAC, Rev: CAGGTTCCCCATCATCACAT; mFto – Fwd: CACTTGGCTTCCTTACCTGACCCCC, Rev: GGTATGCTGCCGGCCTCTCGG; mAlkbh5 – Fwd: CGGCCTCAGGACATTAAGGA, Rev: TCGCGGTGCATCTAATCTTG; Total U2snRNA – Fwd: CTTCTCGGCCTTTTGGCTAAGAT, Rev: GTACTGCAATACCAGGTCGATGC; Uncleaved U2snRNA – Fwd: ACGTCCTCTATCCG+AGGACAATA, Rev: GCAGGTGCTACCGTCTCTCAC; Total U4snRNA – Fwd: GCAGTATCGTAGCCAATGAGGTCTA, Rev: CCAGTGCCGACTATATTGCAAGTC; Uncleaved U4snRNA – Fwd: CGTAGCCAATGAGGTCTATCCG, Rev: CCTCTGTTGTTCAACTGCAAGAAA; hGAPDH – Fwd: GCAAATTCCATGGCACCGTC, Rev: TCGCCCCACTTGATTTTGG; mGapdh – Fwd: TGGAGAAACCTGCCAAGTATG, Rev: GGAGACAACCTGGTCCTCAG. 
All samples, including the template controls, were assayed in triplicate. The relative number of target transcripts was normalized to the housekeeping gene found in the same sample. The relative quantification of target gene expression was performed with the standard curve or comparative cycle threshold (CT) method. mRNA stability assay was performed as previously described 6 . Briefly, anti-UPF1 shRNA- or control shRNA lentivirus-infected K562 SRSF2P95H knock-in cells were generated by puromycin selection (1 μg/mL) for 7 days and shRNAs against UPF1 were expressed by doxycycline (2 μg/mL) for 2 days. GFP (shRNA)-positive cells were FACS sorted, treated with 2.5 μg/ml Actinomycin D (Life Technologies), and harvested at 0, 2, 4, 8, and 12 hrs. Chromatin immunoprecipitation (ChIP) Cells were crosslinked and collected. Chromatin was broken down into 200–1000 bp fragments using an E220 Focused-ultrasonicator. An antibody was added into the lysate and incubated overnight at 4 °C. Twenty microliters of ChIP-grade Protein A/G Dynabeads was added into each IP tube and incubated for 2 hours. IP samples were washed and crosslinks reversed by adding proteinase K and incubating overnight at 65 °C. DNA was purified with AMPureXP beads and eluted DNA was subjected to qPCR to measure the enrichment. RNA polymerase II antibody (05–623; EMD Millipore, Billerica, MA, USA) was used in this study. 
Primer sequences used for ChIP-PCR were as follows: Intron 3–1 – Fwd: atacccggcccttgctatac, Rev: gcaacttccttagcctgctg; Intron 3–2 – Fwd: ctggcaggtgaaaagcagat, Rev: ggcaggggagagaaaagc; Intron 3–3 – Fwd: agcaggcttttctgcctcat, Rev: tttctttccacaggggtcct; Exon 4 – Fwd: cgggacttagctctggtgag, Rev: cctgagtacggcaggtatcc; Intron 4 – Fwd: ctctgtcaggaacgggaaag, Rev: tgtgagtttgagaagggagcta; Exon 5 – Fwd: acgggaactggtgaagagtg, Rev: ctgggctctcctcctttctt; Intron 5–1 – Fwd: ctccacccccattatctgaa, Rev: aaatgtcagggtctgttctgtg; Intron 5–2 – Fwd: tcggtgacatctgtctgagc, Rev: cagtgggctaatggtgaggt; Intron 5–3 – Fwd: aacactgatgctcctgttttga, Rev: actatgccttgccccaggt; Intron 5–4 – Fwd: gctgttgtcagccacctgta, Rev: tttggcccttgaaaatgaac; Intron 5–5 – Fwd: tgtgttaattctgccccaca, Rev: ggatgtcctgagtcctgcac; Intron 5–6 – Fwd: gtaatgggatggcagtcagg, Rev: cctgatttcaaaaggggaaa; Exon 6 – Fwd: agcaaaggtagcatccacca, Rev: cttgcctccccctctctaac; Intron 6–1 – Fwd: tttgatccagacctccttgg, Rev: gcaggggagaaaaggatacc; Intron 6–2 – Fwd: gggggtacatattgggcttt, Rev: gaaagcctcacctccaaaca; Intron 6–3-CTCF binding site – Fwd: ctcctcccaacgttcacact, Rev: atccgtgcccagagcacta; Intron 6–4 – Fwd: agggggcctttcaactctt, Rev: atggggacaggacgtatttg; Intron 6–5 – Fwd: ttccctgccttccaacag, Rev: tcccagttgctttaaaaggagt. ChIP-seq libraries were prepared as previously described 31 and sequenced by the Integrated Genomics Operation (IGO) at MSK with 50 bp paired-end reads. ChIP-sequencing of primary human AML samples ChIP was performed as previously described 32 using the following antibodies: RNAPolII-Ser2P antibody - ChIP Grade (Abcam ab5095), RNAPolII-Ser5P antibody [4H8] (Abcam ab5408), and anti-HP1γ antibody, clone 42s2 (05–690 from Merck Millipore). Libraries were size selected with AMPure beads (Beckman Coulter) for 200–800 base pair size range and quantified by qPCR using a KAPA Library Quantification Kit. 
ChIP-seq data were generated using the NextSeq platform from Illumina with 2 × 75 bp Hi Output (all samples pooled, and sequenced on four consecutive runs before merger of FASTQ files). Histological analyses Mice were sacrificed and autopsied, and dissected tissue samples were fixed in 4% paraformaldehyde, dehydrated, and embedded in paraffin. Paraffin blocks were sectioned at 4 μm and stained with hematoxylin and eosin (H&E). Images were acquired using an Axio Observer A1 microscope (Carl Zeiss) or scanned using a MIRAX Scanner (Zeiss). Patient Samples Studies were approved by the Institutional Review Boards of Memorial Sloan Kettering Cancer Center (under MSK IRB protocol 06–107), Université Paris-Saclay (under declaration DC-200–725 and authorization AC-2013–1884), and the University of Manchester (institution project approval 12-TISO-04), and conducted in accordance with the Declaration of Helsinki protocol. Written informed consent was obtained from all participants. Manchester samples were retrieved from the Manchester Cancer Research Centre Haematological Malignancy Tissue Biobank, which receives sample donations from all consenting leukemia patients presenting to The Christie Hospital (REC Reference 07/H1003/161+5; HTA license 30004; instituted with approval of the South Manchester Research Ethics Committee). Patient samples were anonymized by the Hematologic Oncology Tissue Bank of MSK, Biobank of Gustave Roussy, and the Manchester Cancer Research Centre Haematological Malignancy Tissue Biobank. Mutational analysis of patient samples Genomic DNA is routinely extracted from mononuclear cell samples submitted to the Manchester Cancer Research Centre Haematological Tissue Biobank. 
Targeted sequencing for recurrent myeloid mutations, using either: (a) a 54 gene panel (TruSight™ Myeloid; Illumina), pooling 96 samples with 5% PhiX onto a single NextSeq high output, 2 × 151 bp sequencing run; VCF files were analyzed using Illumina’s Variant Studio software; (b) a 40 gene panel (Oncomine Myeloid Research Assay; ThermoFisher), processing eight samples per Ion 530 chip on the IonTorrent platform; data analysis performed using the Ion Reporter software; (c) a 27 gene custom panel (48 × 48 Access Array; Fluidigm) sequenced by Leeds HMDS on the MiSeq platform (300v2); or (d) MSK HemePACT 33 targeting all coding regions of 585 genes known to be recurrently mutated in leukemias, lymphomas, and solid tumors. All panels provide sufficient coverage to detect minimum variant allele fraction 5% for all genes, except for the Access Array panel and SRSF2; all samples genotyped by this approach underwent manual Sanger sequencing of SRSF2 exon 1 using the following primers (tagged with Fluidigm Access Array sequencing adaptors CS1/CS2): Fwd: acactgacgacatggttctacacccgtttacctgcggctc, Rev: tacggtagcagagacttggtctccttcgttcgctttcacgacaa. Statistics and reproducibility Statistical significance was determined by (1) unpaired two-sided Student’s t-test after testing for normal distribution, (2) one-way or two-way ANOVA followed by Tukey’s, Sidak’s, or Dunnett’s multiple comparison test, or (3) Kruskal-Wallis tests with uncorrected Dunn’s test where multiple comparisons should be adjusted (unless otherwise indicated). Data were plotted using GraphPad Prism 7 software as mean values, with error bars representing standard deviation. For categorical variables, statistical analysis was done using Fisher’s exact test or Chi-square test (two-sided). Representative WB and PCR results are shown from three or more than three biologically independent experiments. Representative flow cytometry results and cytomorphology are shown from biological replicates (n ≥ 3). 
*P, **P, and ***P represent *P 100 bp. These fragments were then amplified with PCR (15 cycles) and separated by gel electrophoresis (2% agarose). 300-bp DNA fragments were isolated and sequenced on an Illumina HiSeq 2000 (~100M 101 bp reads per sample). Primary samples from the Manchester Cancer Research Centre Haematological Malignancies Biobank with known IDH2/SRSF2 mutation genotype were FACS sorted to enrich for blasts on a FACS Aria III sorter using a panel including the following antibodies (all mouse anti-human): CD34-PerCP (8G12, BD); CD117-PECy7 (104D2, eBioscience); CD33-APC (P67.6, BioLegend); HLA-DR-FITC (L243, BioLegend); CD13-PE (L138, BD); CD45-APC-H7 (2D1, BD). RNA was extracted immediately using a Qiagen Micro RNeasy kit. All RNA samples had RIN values > 8. Poly(A)-selected, strand-specific SureSelect (Agilent) mRNA libraries were prepared using 200 ng RNA according to the manufacturer’s protocol. Libraries were pooled and sequenced (2 × 101 bp paired end) to > 100 million reads per sample on two HiSeq 2500 high throughput runs before retrospective merger of FASTQ files for downstream alignment and splicing analysis as described below. Transcriptional analysis was done using gene set enrichment analysis (GSEA) 34 . Publicly available RNA-sequencing data Unprocessed RNA-sequencing (RNA-seq) reads of TCGA and Leucegene datasets (human AML patients) were downloaded from NCI’s Genomic Data Commons Data Portal (GDC Legacy Archive; TCGA-LAML dataset) and NCBI’s Sequence Read Archive (SRA; accession numbers SRP056295). The TCGA dataset consists of paired-end 2 × 50 bp libraries, with an average read count of 76.92 M. The Leucegene dataset consists of paired-end 2 × 100 bp libraries, with an average read count of 50.40 M per sample. The RNA-seq samples in the Leucegene dataset have 1~3 sequencing runs (~50 M each run), and only one run was used to represent each RNA-seq sample. 
Genome and splice junction annotations Human assembly hg38 (GRCh38) and Ensembl database (human release 87) were used as the reference genome and gene annotation, respectively. RNA-seq reads were aligned by using 2-pass STAR 2.5.2a 35 . Known splice junctions from the gene annotation and new junctions identified from the alignments of the TCGA dataset were combined to create the database of alternative splicing events for splicing analysis. Mutational analysis for the RNA-seq data Samtools (1.3.1) were used to generate variant call format (VCF) files for 7 target genes: IDH1, IDH2, TET2, SF3B1, SRSF2, U2AF1, and ZRSR2 with mpileup parameters (-Bvu). The VCF files were further processed by our in-house scripts to filter out mutations whose VAF was lower than 15%. The filtered VCF files were used for variant effect predictor (version 89.4) to annotate the consequences of the mutations. We defined “control” patient samples as those without mutations in the 7 target genes, IDH2 mutated samples as those with only IDH2 mutations but no mutations in the other 6 target genes, SRSF2 mutated samples as those with only SRSF2 mutations but no mutations in the other 6 target genes, Double-mutant samples as those with both IDH2 and SRSF2 mutations but no mutations in the other 5 target genes, and “Others” as those with mutations in IDH1, TET2, SF3B1, U2AF1, and ZRSR2. Identification and quantification of differential splicing The inclusion ratios of alternative exons or introns were estimated by using PSI-Sigma 25 . Briefly, the new PSI index considers all isoforms in a specific gene region and can report the PSI value of individual exons in a multiple-exon-skipping or more complex splicing event. The database of splicing events was constructed based on both gene annotation and the alignments of RNA-seq reads. 
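The genotype grouping described above (a 15% VAF cut-off followed by assignment to "Control", IDH2 mutant, SRSF2 mutant, Double-mutant, or "Others") can be sketched roughly as follows. This is a minimal illustration, not the authors' in-house script: the function names and the (gene, VAF) input format are hypothetical, and it assumes that any retained call in a target gene counts as a mutation, whereas the original analysis may have restricted calls to specific residues or consequences.

```python
# Hypothetical sketch of the genotype grouping; names and input format are assumptions.
TARGET_GENES = ("IDH1", "IDH2", "TET2", "SF3B1", "SRSF2", "U2AF1", "ZRSR2")
MIN_VAF = 0.15  # calls with a VAF lower than 15% are filtered out


def mutated_genes(calls, min_vaf=MIN_VAF):
    """Return the set of target genes with at least one call at or above min_vaf.

    calls: iterable of (gene_symbol, variant_allele_fraction) pairs.
    """
    return {gene for gene, vaf in calls if gene in TARGET_GENES and vaf >= min_vaf}


def classify_sample(calls, min_vaf=MIN_VAF):
    """Assign a sample to one of the five groups used in the comparison."""
    genes = mutated_genes(calls, min_vaf)
    if not genes:
        return "Control"        # no mutation in any of the 7 target genes
    if genes == {"IDH2"}:
        return "IDH2 mutant"    # IDH2 only
    if genes == {"SRSF2"}:
        return "SRSF2 mutant"   # SRSF2 only
    if genes == {"IDH2", "SRSF2"}:
        return "Double-mutant"  # both, and none of the other 5 genes
    return "Others"             # any mutation in IDH1, TET2, SF3B1, U2AF1, or ZRSR2


if __name__ == "__main__":
    print(classify_sample([("IDH2", 0.42), ("SRSF2", 0.47)]))
```

Under this reading, a sample carrying an SRSF2 mutation together with an IDH1 mutation falls into "Others" rather than into the SRSF2 group.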
A new splicing event not known to the gene annotation is labeled as “Novel” and a splicing event whose reference transcript is known to induce nonsense-mediated decay is labeled as “NMD” in Supplementary Tables. The inclusion ratio of an intron retention isoform is estimated based on the median of 5 counts of intronic reads at the 1st, 25th, 50th, 75th, and 99th percentiles in the intron. A splicing event is reported when both sample-size and statistical criteria are satisfied. The sample-size criterion requires a splicing event to have more than 20 supporting reads in more than 75% of the two populations in the comparison. For example, for a comparison of 130 control versus 6 IDH2 mutant samples, a splicing event would be reported only when having more than 98 controls and 5 IDH2 mutant samples with more than 20 supporting reads. In addition, a splicing event is reported only when it has more than 10% PSI change in the comparison and has a P-value lower than 0.01. To generate Fig. 4f, RNA-seq reads were mapped and PSI values were calculated using junction-spanning reads as previously described 36,37 . All reads mapping to the INTS3 introns (chr1:153,718,433–153,722,231; hg19) were extracted from the bam files and the per-nucleotide coverage was calculated. Data from normal peripheral blood and BM mononuclear cells and CD34+ cord blood cells are combined and shown as normal hematopoietic cells. Motif enrichment and distribution Motif analysis was done by using MEME SUITE 38 . Briefly, the sequences of alternative exons of exon-skipping events were extracted from a given strand of the reference genome. The sequences were used as the input for MEME SUITE to search for motifs. One occurrence per sequence was set to be the expected site distribution. The width of motif was set to 5. The top 1 motif was selected based on the ranking of E-value. 
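The reporting criteria for splicing events given earlier in this section (more than 20 supporting reads in more than 75% of each population, more than 10% PSI change, P-value lower than 0.01) and the five-point intron read estimate can be sketched as below. This is an illustrative sketch and not PSI-Sigma code: all function names are hypothetical, the 75% rule is read as a strict fraction (which gives at least 98 of 130 controls and at least 5 of 6 mutant samples), and the way percentile positions are mapped to nucleotide indices is an assumption.

```python
# Hypothetical sketch of the event-reporting filter; not taken from PSI-Sigma itself.
from statistics import median

MIN_READS = 20        # a sample supports the event with more than 20 reads
MIN_FRACTION = 0.75   # more than 75% of samples in each group must support it
MIN_DELTA_PSI = 10.0  # more than 10% PSI change (in percentage points)
MAX_P_VALUE = 0.01    # P-value must be lower than 0.01


def passes_sample_size(read_counts, min_reads=MIN_READS, min_fraction=MIN_FRACTION):
    """True if more than min_fraction of samples have more than min_reads reads."""
    if not read_counts:
        return False
    supported = sum(1 for reads in read_counts if reads > min_reads)
    return supported / len(read_counts) > min_fraction


def is_reported(reads_a, reads_b, delta_psi, p_value):
    """Apply the sample-size criterion to both groups, then the statistical criteria."""
    return (
        passes_sample_size(reads_a)
        and passes_sample_size(reads_b)
        and abs(delta_psi) > MIN_DELTA_PSI
        and p_value < MAX_P_VALUE
    )


def intron_read_level(coverage, percentiles=(1, 25, 50, 75, 99)):
    """Median of the read counts sampled at five positions along the intron.

    coverage: per-nucleotide read counts across the intron (non-empty list).
    The index mapping (nearest position on a 0..len-1 scale) is an assumption.
    """
    last = len(coverage) - 1
    picks = [coverage[round(last * p / 100)] for p in percentiles]
    return median(picks)


if __name__ == "__main__":
    controls = [25] * 98 + [0] * 32  # 98 of 130 controls supported
    mutants = [30] * 5 + [3]         # 5 of 6 IDH2 mutant samples supported
    print(is_reported(controls, mutants, delta_psi=12.0, p_value=0.005))
```

Taking the median of five positions, rather than the mean over the whole intron, keeps a single localized pile-up of reads from being mistaken for retention of the full intron.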
Heatmap and sample clustering (differential splicing) The heatmaps and sample clustering were done by using MORPHEUS (software.broadinstitute.org/morpheus/). The individual values in the matrix for the analysis were PSI values of a splicing event from a given RNA-seq sample. Splicing events were selected based on three criteria: (1) present in both TCGA and Leucegene datasets; (2) more than 15% PSI changes; and (3) false discovery rate smaller than 0.01. Unsupervised hierarchical clustering was based on one minus Pearson’s correlation (complete linkage). Correlation between global changes in splicing and DNA methylation DNA methylation levels were determined by enhanced reduced representation bisulfite sequencing (eRRBS) while differentially spliced events were obtained from RNA-seq data. In Fig. 3e, overlaps of differentially methylated regions of DNA with differential splicing were obtained by evaluating differential cytosine methylation in 500 bp segments of DNA at genomic coordinates at which differential RNA splicing was observed, comparing AML with the distinct IDH2/SRSF2 genotypes shown (“WT” represents patients without mutations in IDH1/IDH2/Spliceosomal genes). DATA AVAILABILITY STATEMENT The data that support the findings of this study are available from the corresponding author upon reasonable request. RNA-seq, ChIP-seq, and eRRBS data have been deposited in NCBI Sequence Read Archive (SRA) under accession number SRP133673. Gel source data can be found in Supplementary Fig. 1. Other data that support this study’s findings are available from the authors upon reasonable request. Extended Data Extended Data Fig. 1 | Mutant SRSF2-mediated splicing events in acute myeloid leukemia (AML). 
a, Representative Sashimi plots of RNA-seq data from the TCGA showing the poison exon inclusion event in EZH2 (“Control” represents samples that are wild-type (WT) for the following 7 genes: IDH1, IDH2, TET2, SRSF2, SF3B1, U2AF1, and ZRSR2; “IDH2 mutant” refers to patients with an IDH2 mutation and no mutation in the other 6 genes; “SRSF2 mutant” refers to patients with an SRSF2 mutation and no mutation in the other 6 genes; “Double-mutant” refers to patients with an IDH2 and SRSF2 mutation and no mutation in the other 5 genes; “Others” refers to patients with mutations in IDH1, TET2, SF3B1, U2AF1, or ZRSR2; figure made using Integrative Genomics Viewer (IGV 2.3) 39 ). b, ΔPSI (Percent-Spliced-In) values of EZH2 poison exon inclusion (the number of analyzed patients is indicated; the mean ± s.d.; one-way ANOVA with Tukey’s multiple comparison test; Note that patients classified as “Others” include one SRSF2 P95L mutant patient with coexisting IDH1 R132G mutation (TCGA ID: 2990) and one IDH2 R140Q mutant patient with an SF3B1 K666N mutation (TCGA ID: 2973), which were excluded from the analyses shown above. c, d, g, h, i, j, Variant allele frequencies (VAFs) of SRSF2 mutations affecting the Proline 95 residue (c, h, j) and IDH2 mutations affecting IDH2 Arginine 140 or 172 (d, g, i) in TCGA (c, d), Beat-AML (g, h), and Leucegene (i, j) datasets (the mean ± s.d.; a two-sided Student’s t-test). e, f, Heat map based on the ΔPSI of mutant SRSF2-specific splicing events in AML from Beat-AML (e) and Leucegene (f) cohorts. “8aa DEL” represents samples with 8 amino acid deletions in SRSF2 starting from Proline 95, which has similar effects on splicing as point mutations affecting SRSF2 P95. Detailed information of splicing events shown is available in Supplementary Table 1. 
k, Variant allele frequencies (VAFs) of IDH2 (x-axis) and SRSF2 mutations (y-axis) in IDH2/SRSF2 double-mutant AML determined by RNA-seq data from the TCGA, Beat-AML, Leucegene, and our previously unpublished cohorts (Pearson correlation coefficient; P-value (two-tailed) was calculated by Prism7). l, n, Unsupervised hierarchical clustering of DNA methylation levels of all probes (l) or at the promoter probes (n) in the TCGA AML cohort based on IDH2/SRSF2/TET2 genotypes. m, o, DNA methylation levels of AML samples from each genotype are quantified and visualized from l and n as violin plots (the mean represented by the line inside the box and the box expands from the 25th to 75th percentiles with whiskers drawn to 2.5 and 97.5 percentiles; one-way ANOVA with Tukey’s multiple comparison test).**P 10% and P 10% and P 10% and P 10% and P < 0.01 were used as thresholds (n = 849 and n = 433 differentially spliced events, respectively; RNA-seq data were analyzed using PSI-Sigma). v, Percentage of each class of alternative splicing event in IDH2 (left) and IDH1 (right) mutant LGG is shown in pie-chart. w, Venn diagram of numbers of alternatively spliced events from the LGG TCGA dataset based on IDH1/IDH2 mutant genotypes. “Control” represents LGG with wild-type IDH1 and IDH2. *P < 0.05; **P < 0.01; ***P < 0.001. Supplementary Material SI Guide Supplementary Figure 1 Supplementary Tables Supplementary Table 1 | List of mutant SRSF2-specific splicing events in Fig. 1a and Extended Data Fig. 1e, 1f. Supplementary Table 2 | List of IDH1/IDH2/TET2 and spliceosomal mutant AML patients in TCGA, Beat-AML, and Leucegene cohorts. Supplementary Table 3 | Clinical data of AML and MDS/MPN patients from the Leeds Diagnostic Service and Christie Biobank. Supplementary Table 4 | Aberrant Splicing Events in IDH2 single-mutant AML Patients from the TCGA cohort. Supplementary Table 5 | Aberrant Splicing Events in SRSF2 single-mutant AML Patients from the TCGA cohort. 
Supplementary Table 6 | Aberrant Splicing Events in IDH2/SRSF2 Double-mutant AML Patients from the TCGA Cohort. Supplementary Table 7 | Aberrant Splicing Events in IDH2 Single-mutant AML Patients from the Beat-AML Cohort. Supplementary Table 8 | Aberrant Splicing Events in SRSF2 single-mutant AML Patients from the Beat-AML cohort. Supplementary Table 9 | Aberrant Splicing Events in IDH2/SRSF2 Double-mutant AML Patients from the Beat-AML Cohort. Supplementary Table 10 | Aberrant Splicing Events in IDH2 Single-mutant AML Patients from the Leucegene Cohort. Supplementary Table 11 | Aberrant Splicing Events in SRSF2 Single-mutant AML Patients from the Leucegene Cohort. Supplementary Table 12 | Aberrant Splicing Events in IDH2/SRSF2 Double-mutant AML Patients from the Leucegene Cohort. Supplementary Table 13 | Aberrant Splicing Events in IDH2 Single-mutant AML Patients from the Unpublished Collaborative Cohort_1. Supplementary Table 14 | Aberrant Splicing Events in IDH2/SRSF2 Double-mutant AML Patients from the Unpublished Collaborative Cohort_1 (compared to “Control” group). Supplementary Table 15 | Aberrant Splicing Events in IDH2/SRSF2 Double-mutant AML Patients from the Unpublished Collaborative Cohort_1 (compared to “IDH2 single-mutant” group). Supplementary Table 16 | Aberrant Splicing Events in IDH2 Single-mutant AML Patients from the Unpublished Collaborative Cohort_2. Supplementary Table 17 | Aberrant Splicing Events in SRSF2 Single-mutant AML Patients from the Unpublished Collaborative Cohort_2. Supplementary Table 18 | Aberrant Splicing Events in IDH2/SRSF2 Double-mutant AML Patients from the Unpublished Collaborative Cohort_2. Supplementary Table 19 | Characteristics of Manchester patients included in the Unpublished Collaborative Cohort_1 RNA-sequencing dataset. Supplementary Table 20 | Characteristics of Manchester patients included in the Unpublished Collaborative Cohort_2 RNA-sequencing dataset. 
Supplementary Table 21 | Aberrant Splicing Events in TET2/SRSF2 co-mutated AML Patients from the TCGA cohort. Supplementary Table 22 | Aberrant Splicing Events in TET2/SRSF2 co-mutated AML Patients from the Beat-AML cohort. Supplementary Table 23 | Characteristics of Patients whose samples were assayed for WB, eRRBS, and/or ChIP-seq. Supplementary Table 24 | List of aberrant splicing events affecting components of Integrator and SOSS complex in SRSF2 mutant AML. Supplementary Table 25 | Gene sets significantly changed upon INTS3 depletion in IDH2 mutant HL 60 cells. Supplementary Table 26 | Aberrant Splicing Events in IDH2 mutant Low-Grade Glioma Patients from the TCGA. Supplementary Table 27 | Aberrant Splicing Events in IDH1 mutant Low-Grade Glioma Patients from the TCGA. Supplementary Table 28 | List of IDH1/IDH2 mutant Low-Grade Glioma Patients from the TCGA.
            Is Open Access

            The role of alternative splicing in cancer: From oncogenesis to drug resistance


              RNA Splicing and Cancer


                Author and article information

                Journal
                EPMA Journal
                Springer Science and Business Media LLC
                1878-5085
                June 2022
                May 10 2022
                Volume: 13
                Issue: 2
                Pages: 335-350
                Article
                DOI: 10.1007/s13167-022-00279-0
                PMID: 35719132
                © 2022

                https://www.springer.com/tdm

