      Power Analysis of Single Cell RNA-Sequencing Experiments

      research-article


          Abstract

          Single cell RNA sequencing (scRNA-seq) has become an established and powerful method to investigate transcriptomic cell-to-cell variation, revealing new cell types, and providing insights into developmental processes and transcriptional stochasticity. The array of published scRNA-seq protocols allows one to sequence transcriptomes from minute amounts of starting material. A key question is how these various protocols compare in terms of sensitivity of detection of mRNA molecules, and accuracy of quantification of expression. Here, we present an assessment of sensitivity and accuracy of many published data sets by spike-in standards with uniform data processing, including development of a flexible Unique Molecular Identifier (UMI) counting tool (https://github.com/vals/umis). We computationally compare 15 protocols, and experimentally assess 4 protocols on batch-matched cell populations, as well as investigating the impact of spike-in molecule degradation on two types of spike-ins. Our analysis provides an integrated framework for comparing different scRNA-seq protocols.
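
          The idea behind UMI counting, as mentioned in the abstract, can be illustrated with a minimal sketch. It is not the implementation of the umis tool linked above. The function name `count_umis`, the input layout (one `(cell_barcode, gene, umi)` tuple per aligned read), and the example barcodes are all assumptions made for illustration. The sketch counts distinct UMIs per cell and gene, so that reads arising from amplification of the same molecule are counted once; it does not attempt any correction of sequencing errors in the UMIs.

```python
from collections import defaultdict


def count_umis(reads):
    """Count distinct UMIs per (cell, gene) pair.

    `reads` is an iterable of (cell_barcode, gene, umi) tuples, one per
    aligned read. This layout is a hypothetical one chosen for the example.
    """
    seen = defaultdict(set)
    for cell, gene, umi in reads:
        # A set keeps each UMI once, so amplification duplicates collapse.
        seen[(cell, gene)].add(umi)
    return {key: len(umis) for key, umis in seen.items()}


# Hypothetical toy input: five reads from two cells.
reads = [
    ("cellA", "Actb", "AACG"),
    ("cellA", "Actb", "AACG"),        # duplicate read of the same molecule
    ("cellA", "Actb", "TTGA"),
    ("cellA", "ERCC-00002", "CCAT"),  # a spike-in is counted like any gene
    ("cellB", "Actb", "AACG"),        # same UMI, different cell: separate count
]

counts = count_umis(reads)
for (cell, gene), n in sorted(counts.items()):
    print(cell, gene, n)
```

          Keying on the cell barcode as well as the gene keeps identical UMIs in different cells from being merged.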

          Most cited references (22)


          Quantitative single-cell RNA-seq with unique molecular identifiers.

          Single-cell RNA sequencing (RNA-seq) is a powerful tool to reveal cellular heterogeneity, discover new cell types and characterize tumor microevolution. However, losses in cDNA synthesis and bias in cDNA amplification lead to severe quantitative errors. We show that molecular labels (random sequences that label individual molecules) can nearly eliminate amplification noise, and that microfluidic sample preparation and optimized reagents produce a fivefold improvement in mRNA capture efficiency.

            Synthetic spike-in standards for RNA-seq experiments.

            High-throughput sequencing of cDNA (RNA-seq) is a widely deployed transcriptome profiling and annotation technique, but questions about the performance of different protocols and platforms remain. We used a newly developed pool of 96 synthetic RNAs with various lengths, and GC content covering a 2^20 concentration range as spike-in controls to measure sensitivity, accuracy, and biases in RNA-seq experiments as well as to derive standard curves for quantifying the abundance of transcripts. We observed linearity between read density and RNA input over the entire detection range and excellent agreement between replicates, but we observed significantly larger imprecision than expected under pure Poisson sampling errors. We use the control RNAs to directly measure reproducible protocol-dependent biases due to GC content and transcript length as well as stereotypic heterogeneity in coverage across transcripts correlated with position relative to RNA termini and priming sequence bias. These effects lead to biased quantification for short transcripts and individual exons, which is a serious problem for measurements of isoform abundances, but that can partially be corrected using appropriate models of bias. By using the control RNAs, we derive limits for the discovery and detection of rare transcripts in RNA-seq experiments. By using data collected as part of the model organism and human Encyclopedia of DNA Elements projects (ENCODE and modENCODE), we demonstrate that external RNA controls are a useful resource for evaluating sensitivity and accuracy of RNA-seq experiments for transcriptome discovery and quantification. These quality metrics facilitate comparable analysis across different samples, protocols, and platforms.
              Is Open Access

              Computational assignment of cell-cycle stage from single-cell transcriptome data.

              The transcriptome of single cells can reveal important information about cellular states and heterogeneity within populations of cells. Recently, single-cell RNA-sequencing has facilitated expression profiling of large numbers of single cells in parallel. To fully exploit these data, it is critical that suitable computational approaches are developed. One key challenge, especially pertinent when considering dividing populations of cells, is to understand the cell-cycle stage of each captured cell. Here we describe and compare five established supervised machine learning methods and a custom-built predictor for allocating cells to their cell-cycle stage on the basis of their transcriptome. In particular, we assess the impact of different normalisation strategies and the usage of prior knowledge on the predictive power of the classifiers. We tested the methods on previously published datasets and found that a PCA-based approach and the custom predictor performed best. Moreover, our analysis shows that the performance depends strongly on normalisation and the usage of prior knowledge. Only by leveraging prior knowledge in form of cell-cycle annotated genes and by preprocessing the data using a rank-based normalisation, is it possible to robustly capture the transcriptional cell-cycle signature across different cell types, organisms and experimental protocols.

                Author and article information

                Journal
                101215604
                32338
                Nature Methods (Nat. Methods)
                1548-7091
                1548-7105
                20 February 2017
                06 March 2017
                April 2017
                06 September 2017
                Volume 14, Issue 4, Pages 381-387
                Affiliations
                [1 ]European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
                [2 ]Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, UK
                [3 ]Wellcome Trust – Medical Research Council Cambridge Stem Cell Institute, Cambridge, UK
                [4 ]Department of Haematology, University of Cambridge, Cambridge, UK
                [5 ]Centre of Biological Engineering, University of Minho, Campus de Gualtar, Braga, Portugal
                Author notes
                Correspondence should be addressed to VS (vale@ebi.ac.uk) or SAT (st9@sanger.ac.uk).
                Article
                Manuscript ID: EMS71488
                DOI: 10.1038/nmeth.4220
                PMCID: PMC5376499
                PMID: 28263961

                Users may view, print, copy, and download text and data-mine the content in such documents, for the purposes of academic research, subject always to the full Conditions of use: http://www.nature.com/authors/editorial_policies/license.html#terms

                History
                Categories
                Article

                Life sciences