Multi-domain translation between single-cell imaging and sequencing data using autoencoders – ScienceOpen

          Abstract

          The development of single-cell methods for capturing different data modalities including imaging and sequencing has revolutionized our ability to identify heterogeneous cell states. Different data modalities provide different perspectives on a population of cells, and their integration is critical for studying cellular heterogeneity and its function. While various methods have been proposed to integrate different sequencing data modalities, coupling imaging and sequencing has been an open challenge. We here present an approach for integrating vastly different modalities by learning a probabilistic coupling between the different data modalities using autoencoders to map to a shared latent space. We validate this approach by integrating single-cell RNA-seq and chromatin images to identify distinct subpopulations of human naive CD4+ T-cells that are poised for activation. Collectively, our approach provides a framework to integrate and translate between data modalities that cannot yet be measured within the same cell for diverse applications in biomedical discovery.

          Abstract

          Integration of single cell data modalities increases the richness of information about the heterogeneity of cell states, but integration of imaging and transcriptomics is an open challenge. Here the authors use autoencoders to learn a probabilistic coupling and map these modalities to a shared latent space.
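The translation step described above, encoding a cell from one modality into the shared latent space and decoding it with the other modality's decoder, can be sketched in a few lines. This is only an illustrative toy, not the authors' implementation: the `Autoencoder` container, the `translate` helper, the one-dimensional latent space, and the hand-set encoder/decoder maps are all assumptions made for the example. In the article the maps are learned neural networks, and the coupling between modalities is probabilistic rather than fixed.

```python
from dataclasses import dataclass
from typing import Callable, List

Vector = List[float]


@dataclass
class Autoencoder:
    """A modality-specific encoder/decoder pair (hypothetical container)."""
    encode: Callable[[Vector], Vector]  # data space -> shared latent space
    decode: Callable[[Vector], Vector]  # shared latent space -> data space


def translate(source: Autoencoder, target: Autoencoder, x: Vector) -> Vector:
    """Translate x from the source modality to the target modality
    by passing it through the shared latent space."""
    return target.decode(source.encode(x))


# Toy, hand-set maps. Assumption: a 1-D latent variable z; nothing is trained.
# "Imaging" features are (z, 2z); "expression" features are (3z,).
imaging = Autoencoder(
    encode=lambda x: [(x[0] + x[1] / 2.0) / 2.0],  # average two estimates of z
    decode=lambda z: [z[0], 2.0 * z[0]],
)
expression = Autoencoder(
    encode=lambda y: [y[0] / 3.0],
    decode=lambda z: [3.0 * z[0]],
)

# Predict expression features for a cell that was only imaged, and vice versa.
predicted_expression = translate(imaging, expression, [1.0, 2.0])
predicted_imaging = translate(expression, imaging, [3.0])
```

Because both autoencoders share the latent space, translation in either direction is just a composition of one encoder with the other decoder. The hard part, which this sketch leaves out, is training the two autoencoders so that their latent representations actually coincide.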

          Most cited references (25)

          Deep learning.

          Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate structure in large data sets by using the backpropagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.

            Comprehensive Integration of Single-Cell Data

            Single-cell transcriptomics has transformed our ability to characterize cell states, but deep biological understanding requires more than a taxonomic listing of clusters. As new methods arise to measure distinct cellular modalities, a key analytical challenge is to integrate these datasets to better understand cellular identity and function. Here, we develop a strategy to "anchor" diverse datasets together, enabling us to integrate single-cell measurements not only across scRNA-seq technologies, but also across different modalities. After demonstrating improvement over existing methods for integrating scRNA-seq data, we anchor scRNA-seq experiments with scATAC-seq to explore chromatin differences in closely related interneuron subsets and project protein expression measurements onto a bone marrow atlas to characterize lymphocyte populations. Lastly, we harmonize in situ gene expression and scRNA-seq datasets, allowing transcriptome-wide imputation of spatial gene expression patterns. Our work presents a strategy for the assembly of harmonized references and transfer of information across datasets.

              Integrating single-cell transcriptomic data across different conditions, technologies, and species

              Computational single-cell RNA-seq (scRNA-seq) methods have been successfully applied to experiments representing a single condition, technology, or species to discover and define cellular phenotypes. However, identifying subpopulations of cells that are present across multiple data sets remains challenging. Here, we introduce an analytical strategy for integrating scRNA-seq data sets based on common sources of variation, enabling the identification of shared populations across data sets and downstream comparative analysis. We apply this approach, implemented in our R toolkit Seurat (http://satijalab.org/seurat/), to align scRNA-seq data sets of peripheral blood mononuclear cells under resting and stimulated conditions, hematopoietic progenitors sequenced using two profiling technologies, and pancreatic cell 'atlases' generated from human and mouse islets. In each case, we learn distinct or transitional cell states jointly across data sets, while boosting statistical power through integrated analysis. Our approach facilitates general comparisons of scRNA-seq data sets, potentially deepening our understanding of how distinct cell states respond to perturbation, disease, and evolution.

                Author and article information

                Contributors
                cuhler@mit.edu
                Journal
                Nature Communications (Nat Commun)
                Nature Publishing Group UK (London)
                ISSN: 2041-1723
                Published: 4 January 2021
                Volume: 12
                Article number: 31
                Affiliations
                [1] Massachusetts Institute of Technology, Cambridge, MA, USA (GRID grid.116068.8; ISNI 0000 0001 2341 2786)
                [2] Mechanobiology Institute, National University of Singapore, Singapore, Singapore (GRID grid.4280.e; ISNI 0000 0001 2180 6431)
                [3] ETH Zurich and Paul Scherrer Institute, Villigen, Switzerland (GRID grid.5991.4; ISNI 0000 0001 1090 7501)
                [4] FIRC Institute of Molecular Oncology (IFOM), Milano, Italy (GRID grid.7678.e; ISNI 0000 0004 1757 7797)
                Author information
                http://orcid.org/0000-0002-7008-0216
                Article
                Article ID: 20249
                DOI: 10.1038/s41467-020-20249-2
                PMCID: PMC7782789
                PMID: 33397893
                26863999-dd80-4bbc-8ecb-42e39606e301
                © The Author(s) 2021

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

                History
                Received: 16 March 2020
                Accepted: 19 November 2020
                Funding
                Funded by: FundRef https://doi.org/10.13039/100000006, United States Department of Defense | United States Navy | Office of Naval Research (ONR)
                Award ID: N00014-17-1-2147
                Award ID: N00014-18-1-2765
                Funded by: FundRef https://doi.org/10.13039/100000879, Alfred P. Sloan Foundation
                Award ID: Sloan Fellowship
                Funded by: FundRef https://doi.org/10.13039/100000893, Simons Foundation
                Award ID: Simons Investigator Award
                Funded by: FundRef https://doi.org/10.13039/501100001459, Ministry of Education - Singapore (MOE)
                Award ID: Tier-3 Grant
                Funded by: FundRef https://doi.org/10.13039/100000001, National Science Foundation (NSF)
                Award ID: Graduate Research Fellowship
                Funded by: FundRef https://doi.org/10.13039/100007065, Nvidia
                Award ID: Titan Xp
                Categories
                Article
                Uncategorized
                data integration, machine learning, systems biology
