58
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      Integrative analysis of genome-wide loss of heterozygosity and monoallelic expression at nucleotide resolution reveals disrupted pathways in triple-negative breast cancer

      research-article

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          Loss of heterozygosity (LOH) and copy number alteration (CNA) feature prominently in the somatic genomic landscape of tumors. As such, karyotypic aberrations in cancer genomes have been studied extensively to discover novel oncogenes and tumor-suppressor genes. Advances in sequencing technology have enabled the cost-effective detection of tumor genome and transcriptome mutation events at single-base-pair resolution; however, computational methods for predicting segmental regions of LOH in this context are not yet fully explored. Consequently, whole transcriptome, nucleotide-level resolution analysis of monoallelic expression patterns associated with LOH has not yet been undertaken in cancer. We developed a novel approach for inference of LOH from paired tumor/normal sequence data and applied it to a cohort of 23 triple-negative breast cancer (TNBC) genomes. Following extensive benchmarking experiments, we describe the nucleotide-resolution landscape of LOH in TNBC and assess the consequent effect of LOH on the transcriptomes of these tumors using RNA-seq-derived measurements of allele-specific expression. We show that the majority of monoallelic expression in the transcriptomes of triple-negative breast cancer can be explained by genomic regions of LOH and establish an upper bound for monoallelic expression that may be explained by other tumor-specific modifications such as epigenetics or mutations. Monoallelically expressed genes associated with LOH reveal that cell cycle, homologous recombination and actin-cytoskeletal functions are putatively disrupted by LOH in TNBC. Finally, we show how inference of LOH can be used to interpret allele frequencies of somatic mutations and postulate on temporal ordering of mutations in the evolutionary history of these tumors.

          Related collections

          Most cited references47

          • Record: found
          • Abstract: found
          • Article: found
          Is Open Access

          The Sequence Alignment/Map format and SAMtools

          Summary: The Sequence Alignment/Map (SAM) format is a generic alignment format for storing read alignments against reference sequences, supporting short and long reads (up to 128 Mbp) produced by different sequencing platforms. It is flexible in style, compact in size, efficient in random access and is the format in which alignments from the 1000 Genomes Project are released. SAMtools implements various utilities for post-processing alignments in the SAM format, such as indexing, variant caller and alignment viewer, and thus provides universal tools for processing read alignments. Availability: http://samtools.sourceforge.net Contact: rd@sanger.ac.uk
            Bookmark
            • Record: found
            • Abstract: found
            • Article: found
            Is Open Access

            Fast and accurate short read alignment with Burrows–Wheeler transform

            Motivation: The enormous amount of short reads generated by the new DNA sequencing technologies call for the development of fast and accurate read alignment programs. A first generation of hash table-based methods has been developed, including MAQ, which is accurate, feature rich and fast enough to align short reads from a single individual. However, MAQ does not support gapped alignment for single-end reads, which makes it unsuitable for alignment of longer reads where indels may occur frequently. The speed of MAQ is also a concern when the alignment is scaled up to the resequencing of hundreds of individuals. Results: We implemented Burrows-Wheeler Alignment tool (BWA), a new read alignment package that is based on backward search with Burrows–Wheeler Transform (BWT), to efficiently align short sequencing reads against a large reference sequence such as the human genome, allowing mismatches and gaps. BWA supports both base space reads, e.g. from Illumina sequencing machines, and color space reads from AB SOLiD machines. Evaluations on both simulated and real data suggest that BWA is ∼10–20× faster than MAQ, while achieving similar accuracy. In addition, BWA outputs alignment in the new standard SAM (Sequence Alignment/Map) format. Variant calling and other downstream analyses after the alignment can be achieved with the open source SAMtools software package. Availability: http://maq.sourceforge.net Contact: rd@sanger.ac.uk
              Bookmark
              • Record: found
              • Abstract: found
              • Article: not found

              The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data.

              Next-generation DNA sequencing (NGS) projects, such as the 1000 Genomes Project, are already revolutionizing our understanding of genetic variation among individuals. However, the massive data sets generated by NGS--the 1000 Genome pilot alone includes nearly five terabases--make writing feature-rich, efficient, and robust analysis tools difficult for even computationally sophisticated individuals. Indeed, many professionals are limited in the scope and the ease with which they can answer scientific questions by the complexity of accessing and manipulating the data produced by these machines. Here, we discuss our Genome Analysis Toolkit (GATK), a structured programming framework designed to ease the development of efficient and robust analysis tools for next-generation DNA sequencers using the functional programming philosophy of MapReduce. The GATK provides a small but rich set of data access patterns that encompass the majority of analysis tool needs. Separating specific analysis calculations from common data management infrastructure enables us to optimize the GATK framework for correctness, stability, and CPU and memory efficiency and to enable distributed and shared memory parallelization. We highlight the capabilities of the GATK by describing the implementation and application of robust, scale-tolerant tools like coverage calculators and single nucleotide polymorphism (SNP) calling. We conclude that the GATK programming framework enables developers and analysts to quickly and easily write efficient and robust NGS tools, many of which have already been incorporated into large-scale sequencing projects like the 1000 Genomes Project and The Cancer Genome Atlas.
                Bookmark

                Author and article information

                Journal
                Genome Res
                Genome Res
                GENOME
                Genome Research
                Cold Spring Harbor Laboratory Press
                1088-9051
                1549-5469
                October 2012
                October 2012
                : 22
                : 10
                : 1995-2007
                Affiliations
                [1 ]Department of Molecular Oncology, British Columbia Cancer Agency, Vancouver, British Columbia V5Z 1L3, Canada;
                [2 ]Bioinformatics Training Program, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada;
                [3 ]Department of Computer Science, University of British Columbia, Vancouver, British Columbia V6T 1Z4, Canada;
                [4 ]Genome Sciences Centre, British Columbia Cancer Agency, Vancouver, British Columbia V5Z 1L3, Canada;
                [5 ]Cancer Research UK, Cambridge Research Institute, Li Ka Shing Centre, Cambridge CB2 0RE, United Kingdom;
                [6 ]Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, British Columbia V6T 2B5, Canada
                Author notes
                [7 ]Corresponding author E-mail sshah@ 123456bccrc.ca
                Article
                9518021
                10.1101/gr.137570.112
                3460194
                22637570
                31876f2e-8fec-4f4e-ba6f-b76b8ceb1e0d
                © 2012, Published by Cold Spring Harbor Laboratory Press

                This article is distributed exclusively by Cold Spring Harbor Laboratory Press for the first six months after the full-issue publication date (see http://genome.cshlp.org/site/misc/terms.xhtml). After six months, it is available under a Creative Commons License (Attribution-NonCommercial 3.0 Unported License), as described at http://creativecommons.org/licenses/by-nc/3.0/.

                History
                : 11 January 2012
                : 22 May 2012
                Categories
                Method

                Comments

                Comment on this article