      Is Open Access

      A multi-view latent variable model reveals cellular heterogeneity in complex tissues for paired multimodal single-cell data

      research-article


          Abstract

          Motivation

          Single-cell multimodal assays allow us to simultaneously measure two different molecular features of the same cell, enabling new insights into cellular heterogeneity, cell development and diseases. However, most existing methods suffer from inaccurate dimensionality reduction for the joint-modality data, hindering their discovery of novel or rare cell subpopulations.

          Results

          Here, we present VIMCCA, a computational framework based on variational-assisted multi-view canonical correlation analysis to integrate paired multimodal single-cell data. Our statistical model uses a common latent variable to explain the shared source of variance in two different data modalities. Our approach jointly learns an inference model and two modality-specific non-linear models by leveraging variational inference and deep learning. We apply VIMCCA to four paired multimodal datasets sequenced by different protocols and compare it with 10 existing state-of-the-art algorithms. The results demonstrate that VIMCCA facilitates the integration of various types of joint-modality data, leading to more reliable and accurate downstream analysis. VIMCCA improves our ability to identify novel or rare cell subtypes compared with existing widely used methods. It can also facilitate inferring cell lineage from joint-modality profiles.
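
The model structure described above (one shared latent variable, one inference model, two modality-specific non-linear models) can be illustrated with a minimal NumPy sketch of a single-sample evidence lower bound: the two reconstruction log-likelihoods minus a KL term that keeps the approximate posterior close to a standard normal prior. This is not the scbean implementation. The linear encoder that reads only the first modality, the one-hidden-layer tanh decoders, the unit-variance Gaussian likelihoods, and every name and shape below (`init_params`, `encode`, `decode`, `elbo`, `dh`) are assumptions chosen for illustration.

```python
import numpy as np


def init_params(dx, dy, dz, dh, rng, scale=0.1):
    """Random weights for a hypothetical encoder and two decoders."""
    def w(*shape):
        return scale * rng.standard_normal(shape)
    return {
        "enc_mu": w(dx, dz),              # encoder weights for the latent mean
        "enc_logvar": w(dx, dz),          # encoder weights for the latent log-variance
        "dec_x": (w(dz, dh), w(dh, dx)),  # decoder for modality x
        "dec_y": (w(dz, dh), w(dh, dy)),  # decoder for modality y
    }


def encode(x, p):
    """Assumed inference model q(z | x): a diagonal Gaussian (linear for brevity)."""
    return x @ p["enc_mu"], x @ p["enc_logvar"]


def decode(z, weights):
    """Modality-specific non-linear model: tanh hidden layer, linear output."""
    w1, w2 = weights
    return np.tanh(z @ w1) @ w2


def gaussian_log_lik(data, mean):
    """log N(data | mean, I), summed over features, one value per cell."""
    d = data.shape[1]
    return -0.5 * np.sum((data - mean) ** 2, axis=1) - 0.5 * d * np.log(2 * np.pi)


def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), one value per cell."""
    return 0.5 * np.sum(np.exp(logvar) + mu ** 2 - 1.0 - logvar, axis=1)


def elbo(x, y, p, rng):
    """Single-sample ELBO per cell; also returns the latent means."""
    mu, logvar = encode(x, p)
    eps = rng.standard_normal(mu.shape)
    z = mu + np.exp(0.5 * logvar) * eps  # reparameterisation trick
    rec_x = gaussian_log_lik(x, decode(z, p["dec_x"]))
    rec_y = gaussian_log_lik(y, decode(z, p["dec_y"]))
    return rec_x + rec_y - kl_to_standard_normal(mu, logvar), mu


# Toy stand-ins for two paired modalities: 5 cells, 20 and 8 features.
rng = np.random.default_rng(0)
x = rng.standard_normal((5, 20))
y = rng.standard_normal((5, 8))
params = init_params(dx=20, dy=8, dz=3, dh=16, rng=rng)
per_cell_elbo, z_mean = elbo(x, y, params, rng)
```

In a real setting the weights would be trained by maximising the mean of `per_cell_elbo` with a deep learning framework, and `z_mean` would serve as the joint low-dimensional embedding passed to clustering or trajectory inference.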

          Availability and implementation

          The VIMCCA algorithm has been implemented in our toolkit package scbean (0.5.0), and its code has been archived at https://github.com/jhu99/scbean under the MIT license.

          Supplementary information

          Supplementary data are available at Bioinformatics online.

                Author and article information

                Contributors
                Role: Associate Editor
                Journal
                Bioinformatics
                Oxford University Press
                1367-4803
                1367-4811
                January 2023
                09 January 2023
                Volume: 39
                Issue: 1
                Article ID: btad005
                Affiliations
                School of Computer Science, Northwestern Polytechnical University, Shaanxi 710129, China
                Department of Biostatistics, School of Public Health, Peking University Health Science Center, Beijing 100191, China
                Department of Orthopaedics, Union Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430022, China
                Institut für Informatik, Freie Universität Berlin, 14195 Berlin, Germany
                School of Life Science, Northwestern Polytechnical University, Shaanxi 710072, China
                Author notes

                The authors wish it to be known that, in their opinion, Yuwei Wang and Bin Lian should be regarded as Joint First Authors.

                To whom correspondence should be addressed. Email: jhu@nwpu.edu.cn
                Author information
                https://orcid.org/0000-0002-3351-8020
                Article
                Publisher ID: btad005
                DOI: 10.1093/bioinformatics/btad005
                PMCID: PMC9857983
                PMID: 36622018
                be68c62d-2461-4864-bb2c-de67d34154b9
                © The Author(s) 2023. Published by Oxford University Press.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                03 August 2022
                27 December 2022
                03 January 2023
                06 January 2023
                20 January 2023
                Page count
                Pages: 15
                Funding
                Funded by: National Natural Science Foundation of China, DOI 10.13039/501100001809;
                Award ID: 62072374
                Funded by: National Natural Science Foundation of China, DOI 10.13039/501100001809;
                Award ID: 61772426
                Funded by: Fundamental Research Funds for the Central Universities, DOI 10.13039/501100012226;
                Categories
                Original Paper
                Gene Expression
                AcademicSubjects/SCI01060

                Bioinformatics & Computational biology
