
      Evaluating the performance of dropout imputation and clustering methods for single-cell RNA sequencing data.


          Abstract

          Recent advances in single-cell RNA sequencing (scRNA-seq) provide exciting opportunities for transcriptome analysis at single-cell resolution. Clustering individual cells is a key step in revealing cell subtypes and inferring cell lineage in scRNA-seq analysis. Although many dedicated algorithms have been proposed, clustering quality remains a computational challenge for scRNA-seq data, one that is exacerbated by inflated zero counts arising from various sources of technical noise. To address this challenge, we assess the combinations of nine popular dropout imputation methods and eight clustering methods on a collection of 10 well-annotated scRNA-seq datasets with different sample sizes. Our results show that (i) imputation algorithms typically improve both the performance of clustering methods and the quality of data visualization using t-Distributed Stochastic Neighbor Embedding (t-SNE); and (ii) the performance of a particular combination of imputation and clustering methods varies with dataset size. For example, the combination of single-cell analysis via expression recovery (SAVER) and Sparse Subspace Clustering (SSC) usually works well on smaller datasets, while the combination of adaptively-thresholded low-rank approximation (ALRA) and single-cell interpretation via multikernel learning (SIMLR) usually achieves the best performance on larger datasets.
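
The keywords for this article list the adjusted Rand index, which suggests that clusterings are scored against the annotated cell labels with that measure. The abstract does not give the scoring procedure, so the following is only a minimal sketch under that assumption. The function name, the toy labels, and the pipeline names are hypothetical illustrations, not the authors' code or data.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two labelings of the same cells."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("labelings must have the same length")
    n = len(labels_true)
    if n < 2:
        # With fewer than two cells there are no pairs to compare.
        return 1.0

    # Contingency counts: cells sharing each (annotated, predicted) label pair.
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Number of cell pairs placed together under each view.
    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())
    total = comb(n, 2)

    expected = sum_true * sum_pred / total  # agreement expected by chance
    max_index = (sum_true + sum_pred) / 2
    if max_index == expected:
        # Degenerate case, e.g. both labelings use a single cluster.
        return 1.0
    return (sum_pairs - expected) / (max_index - expected)


# Toy data (hypothetical): six cells with annotated types, and cluster
# assignments from two made-up imputation + clustering pipelines.
annotated = ["T", "T", "T", "B", "B", "B"]
candidates = {
    "pipeline_a": [0, 0, 0, 1, 1, 1],
    "pipeline_b": [0, 0, 1, 1, 2, 2],
}
scores = {name: adjusted_rand_index(annotated, pred)
          for name, pred in candidates.items()}
best = max(scores, key=scores.get)
```

Because the index is corrected for chance agreement, it depends only on which cells are grouped together, not on the cluster label values, so scores from different imputation and clustering combinations can be compared on the same dataset.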


          Author and article information

          Journal
          Comput Biol Med
          Computers in biology and medicine
          Elsevier BV
          1879-0534
          0010-4825
          Jul 2022
          Volume: 146
          Affiliations
          [1 ] College of Computer Science and Electronic Engineering, Hunan University, Changsha, Hunan, 410082, China.
          [2 ] College of Life Science, Northeast Forestry University, Harbin, Heilongjiang, 150000, China.
          [3 ] School of Science, Dalian Maritime University, Dalian, Liaoning, 116026, China.
          [4 ] Academician Workstation, Changsha Medical University, Changsha, 410219, China.
          [5 ] Geneis Beijing Co., Ltd., Beijing, 100102, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, 266000, China.
          [6 ] Department of Statistics and Data Science, Department of Mathematics, National University of Singapore, Singapore, 117546, Republic of Singapore.
          [7 ] School of Computing Sciences, University of East Anglia, Norwich, NR4 7TJ, UK.
          [8 ] School of Electrical & Information Engineering, Anhui University of Technology, Anhui, 243002, China. Electronic address: wangbing@ustc.edu.
          [9 ] Geneis Beijing Co., Ltd., Beijing, 100102, China; Qingdao Geneis Institute of Big Data Mining and Precision Medicine, Qingdao, 266000, China. Electronic address: yangjl@geneis.cn.
          Article
          PII: S0010-4825(22)00480-2
          DOI: 10.1016/j.compbiomed.2022.105697
          PMID: 35697529
          d3808ef5-1bca-44f9-9de8-705085dd7289

          Keywords: Dropout imputation, Adjusted Rand index, Cell clustering, Single-cell RNA sequencing, t-SNE
