50
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      RNA-Seq analysis in MeV

      research-article

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          Summary: RNA-Seq is an exciting methodology that leverages the power of high-throughput sequencing to measure RNA transcript counts at an unprecedented accuracy. However, the data generated from this process are extremely large and biologist-friendly tools with which to analyze it are sorely lacking. MultiExperiment Viewer (MeV) is a Java-based desktop application that allows advanced analysis of gene expression data through an intuitive graphical user interface. Here, we report a significant enhancement to MeV that allows analysis of RNA-Seq data with these familiar, powerful tools. We also report the addition to MeV of several RNA-Seq-specific functions, addressing the differences in analysis requirements between this data type and traditional gene expression data. These tools include automatic conversion functions from raw count data to processed RPKM or FPKM values and differential expression detection and functional annotation enrichment detection based on published methods.

          Availability: MeV version 4.7 is written in Java and is freely available for download under the terms of the open-source Artistic License version 2.0. The website ( http://mev.tm4.org/) hosts a full user manual as well as a short quick-start guide suitable for new users.

          Contact: johnq@ 123456jimmy.harvard.edu

          Related collections

          Most cited references3

          • Record: found
          • Abstract: found
          • Article: found
          Is Open Access

          Transcript length bias in RNA-seq data confounds systems biology

          Background Several recent studies have demonstrated the effectiveness of deep sequencing for transcriptome analysis (RNA-seq) in mammals. As RNA-seq becomes more affordable, whole genome transcriptional profiling is likely to become the platform of choice for species with good genomic sequences. As yet, a rigorous analysis methodology has not been developed and we are still in the stages of exploring the features of the data. Results We investigated the effect of transcript length bias in RNA-seq data using three different published data sets. For standard analyses using aggregated tag counts for each gene, the ability to call differentially expressed genes between samples is strongly associated with the length of the transcript. Conclusion Transcript length bias for calling differentially expressed genes is a general feature of current protocols for RNA-seq technology. This has implications for the ranking of differentially expressed genes, and in particular may introduce bias in gene set testing for pathway analysis and other multi-gene systems biology analyses. Reviewers This article was reviewed by Rohan Williams (nominated by Gavin Huttley), Nicole Cloonan (nominated by Mark Ragan) and James Bullard (nominated by Sandrine Dudoit).
            Bookmark
            • Record: found
            • Abstract: found
            • Article: found
            Is Open Access

            Using quality scores and longer reads improves accuracy of Solexa read mapping

            Background Second-generation sequencing has the potential to revolutionize genomics and impact all areas of biomedical science. New technologies will make re-sequencing widely available for such applications as identifying genome variations or interrogating the oligonucleotide content of a large sample (e.g. ChIP-sequencing). The increase in speed, sensitivity and availability of sequencing technology brings demand for advances in computational technology to perform associated analysis tasks. The Solexa/Illumina 1G sequencer can produce tens of millions of reads, ranging in length from ~25–50 nt, in a single experiment. Accurately mapping the reads back to a reference genome is a critical task in almost all applications. Two sources of information that are often ignored when mapping reads from the Solexa technology are the 3' ends of longer reads, which contain a much higher frequency of sequencing errors, and the base-call quality scores. Results To investigate whether these sources of information can be used to improve accuracy when mapping reads, we developed the RMAP tool, which can map reads having a wide range of lengths and allows base-call quality scores to determine which positions in each read are more important when mapping. We applied RMAP to analyze data re-sequenced from two human BAC regions for varying read lengths, and varying criteria for use of quality scores. RMAP is freely available for downloading at . Conclusion Our results indicate that significant gains in Solexa read mapping performance can be achieved by considering the information in 3' ends of longer reads, and appropriately using the base-call quality scores. The RMAP tool we have developed will enable researchers to effectively exploit this information in targeted re-sequencing projects.
              Bookmark
              • Record: found
              • Abstract: found
              • Article: found
              Is Open Access

              RSEQtools: a modular framework to analyze RNA-Seq data using compact, anonymized data summaries

              Summary: The advent of next-generation sequencing for functional genomics has given rise to quantities of sequence information that are often so large that they are difficult to handle. Moreover, sequence reads from a specific individual can contain sufficient information to potentially identify and genetically characterize that person, raising privacy concerns. In order to address these issues, we have developed the Mapped Read Format (MRF), a compact data summary format for both short and long read alignments that enables the anonymization of confidential sequence information, while allowing one to still carry out many functional genomics studies. We have developed a suite of tools (RSEQtools) that use this format for the analysis of RNA-Seq experiments. These tools consist of a set of modules that perform common tasks such as calculating gene expression values, generating signal tracks of mapped reads and segmenting that signal into actively transcribed regions. Moreover, the tools can readily be used to build customizable RNA-Seq workflows. In addition to the anonymization afforded by MRF, this format also facilitates the decoupling of the alignment of reads from downstream analyses. Availability and implementation: RSEQtools is implemented in C and the source code is available at http://rseqtools.gersteinlab.org/. Contact: lukas.habegger@yale.edu; mark.gerstein@yale.edu Supplementary information: Supplementary data are available at Bioinformatics online.
                Bookmark

                Author and article information

                Journal
                Bioinformatics
                bioinformatics
                bioinfo
                Bioinformatics
                Oxford University Press
                1367-4803
                1367-4811
                15 November 2011
                5 October 2011
                5 October 2011
                : 27
                : 22
                : 3209-3210
                Affiliations
                1Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA 02115, USA, 2Department of Statistics, University of Oxford, Oxford, UK, 3Department of Cancer Biology, Dana-Farber Cancer Institute and 4Department of Biostatistics, Harvard School of Public Health, Boston, MA 02115, USA
                Author notes
                * To whom correspondence should be addressed.

                Associate Editor: Ivo Hofacker

                Article
                btr490
                10.1093/bioinformatics/btr490
                3208390
                21976420
                95b10447-ea8d-45bd-ab1a-2ccb2e1df9c0
                © The Author(s) 2011. Published by Oxford University Press.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License ( http://creativecommons.org/licenses/by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                : 17 May 2011
                : 20 July 2011
                : 15 August 2011
                Page count
                Pages: 2
                Categories
                Applications Note
                Gene Expression

                Bioinformatics & Computational biology
                Bioinformatics & Computational biology

                Comments

                Comment on this article