Multi-domain translation between single-cell imaging and sequencing data using autoencoders

Abstract

The development of single-cell methods for capturing different data modalities, including imaging and sequencing, has revolutionized our ability to identify heterogeneous cell states. Different data modalities provide different perspectives on a population of cells, and their integration is critical for studying cellular heterogeneity and its function. While various methods have been proposed to integrate different sequencing data modalities, coupling imaging and sequencing has remained an open challenge. Here we present an approach for integrating vastly different modalities by learning a probabilistic coupling between them, using autoencoders to map each modality to a shared latent space. We validate this approach by integrating single-cell RNA-seq and chromatin images to identify distinct subpopulations of human naive CD4+ T-cells that are poised for activation. Collectively, our approach provides a framework to integrate and translate between data modalities that cannot yet be measured within the same cell, for diverse applications in biomedical discovery.

Abstract

Integration of single-cell data modalities increases the richness of information about the heterogeneity of cell states, but integration of imaging and transcriptomics is an open challenge. Here, the authors use autoencoders to learn a probabilistic coupling and map these modalities to a shared latent space.
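To make the idea of translating through a shared latent space concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: the class `LinearAutoencoder`, the helper `translate`, and the toy arrays `image_features` and `expression` are all hypothetical, and a PCA-based linear autoencoder with whitened codes stands in for trained neural autoencoders. Whitening only gives both modalities latent codes with zero mean and identity covariance; the learned probabilistic coupling described in the abstract is not implemented here, so the latent axes of the two modalities are not actually matched.

```python
import numpy as np


class LinearAutoencoder:
    """PCA-based linear autoencoder with whitened latent codes (illustrative stand-in)."""

    def __init__(self, latent_dim):
        self.latent_dim = latent_dim

    def fit(self, X):
        self.mean_ = X.mean(axis=0)
        # SVD of the centered data yields the principal directions.
        _, S, Vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        self.components_ = Vt[: self.latent_dim]
        # Per-component standard deviation, used to whiten the latent codes.
        self.scale_ = S[: self.latent_dim] / np.sqrt(X.shape[0] - 1)
        return self

    def encode(self, X):
        # Project onto the principal directions, then rescale to unit variance.
        return (X - self.mean_) @ self.components_.T / self.scale_

    def decode(self, Z):
        # Undo the whitening and map back to feature space.
        return (Z * self.scale_) @ self.components_ + self.mean_


def translate(X, source, target):
    """Encode with the source modality's encoder, decode with the target's decoder."""
    return target.decode(source.encode(X))


# Hypothetical toy data: two unpaired modalities measured on different cells.
rng = np.random.default_rng(0)
latent_dim = 3
image_features = rng.normal(size=(200, latent_dim)) @ rng.normal(size=(latent_dim, 20))
expression = rng.normal(size=(300, latent_dim)) @ rng.normal(size=(latent_dim, 50))

image_ae = LinearAutoencoder(latent_dim).fit(image_features)
expression_ae = LinearAutoencoder(latent_dim).fit(expression)

image_codes = image_ae.encode(image_features)
predicted_expression = translate(image_features[:5], image_ae, expression_ae)
print(predicted_expression.shape)  # (5, 50)
```

The point of the sketch is the composition in `translate`: once two modalities share a latent space, moving from one to the other is an encoder from the source followed by a decoder from the target. How well the translated values reflect biology depends entirely on how the two latent distributions are aligned, which this toy example does not attempt.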

Author and article information

Contributors

Caroline Uhler:

ORCID: http://orcid.org/0000-0002-7008-0216

cuhler@mit.edu

Journal

Journal ID (nlm-ta): Nat Commun

Journal ID (iso-abbrev): Nat Commun

Title: Nature Communications

Publisher: Nature Publishing Group UK (London)

ISSN (Electronic): 2041-1723

Publication date (Electronic): 4 January 2021

Publication date PMC-release: 4 January 2021

Publication date Collection: 2021

Volume: 12

Electronic Location Identifier: 31

Affiliations

[1] GRID grid.116068.8, ISNI 0000 0001 2341 2786, Massachusetts Institute of Technology, Cambridge, MA, USA

[2] GRID grid.4280.e, ISNI 0000 0001 2180 6431, Mechanobiology Institute, National University of Singapore, Singapore, Singapore

[3] GRID grid.5991.4, ISNI 0000 0001 1090 7501, ETH Zurich and Paul Scherrer Institute, Villigen, Switzerland

[4] GRID grid.7678.e, ISNI 0000 0004 1757 7797, FIRC Institute of Molecular Oncology (IFOM), Milano, Italy

Article

Publisher ID: 20249

DOI: 10.1038/s41467-020-20249-2

PMC ID: 7782789

PubMed ID: 33397893

SO-VID: 26863999-dd80-4bbc-8ecb-42e39606e301

License:

Open Access: This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

History

Date received: 16 March 2020

Date accepted: 19 November 2020

Funding

Funded by: FundRef https://doi.org/10.13039/100000006, United States Department of Defense | United States Navy | Office of Naval Research (ONR)

Award ID: N00014-17-1-2147

Award ID: N00014-18-1-2765