Editorial: Integrative analysis of single-cell and/or bulk multi-omics sequencing data

Abstract

Introduction

Each type of omics data, including genomics, epigenomics, transcriptomics, proteomics, metabolomics, and metagenomics, mainly provides the profile of one particular layer of a cell or sample (Hasin et al., 2017). Integrative analysis of multi-omics data enables a more comprehensive dissection from different perspectives, which may facilitate a deeper understanding of the underlying molecular functions and mechanisms (Li et al., 2021). With the innovation of sequencing technologies, various single-cell and bulk profiling technologies have been developed and applied to diverse biological and clinical research (Lei et al., 2021; Li et al., 2021; Jiang et al., 2022). Bulk sequencing approaches characterize each sample at the cell-population level, providing the averaged profile of a multitude of cells. By contrast, single-cell sequencing methods can interrogate thousands of cells from a given sample simultaneously at single-cell resolution. Joint analysis of multi-omics data generated from bulk and single-cell sequencing protocols could effectively facilitate the translation of basic science to practical applications (Stuart and Satija, 2019; Leng et al., 2022).

Meanwhile, the sample/cell scale and data size are growing rapidly in biomedical investigation. Thus, novel bioinformatics approaches are urgently needed to integrate distinct types of omics data more efficiently and robustly. Since multi-omics strategies can be more powerful than single-omics ones, combining different types of single-cell or bulk sequencing data for a more comprehensive exploration has become increasingly popular and important (Figure 1). In this Research Topic on Integrative Analysis of Single-Cell and/or Bulk Multi-omics Sequencing Data, we aimed to collect novel findings and methods for analyzing bulk and single-cell multi-omics or multimodal data with a systematic strategy.
In total, 12 original research articles and one case report were published in this Research Topic, covering multi-omics-based cancer dissection, comparison of different data integration methods, and database construction for expression examination in various tissues. Here we concisely summarize and discuss the main results of these studies.

FIGURE 1. Overview of integrative analysis of multi-omics data generated from different bulk and single-cell sequencing technologies.

Studies published in this research topic

Guo et al. found that mutations in TP53 and KRAS were significantly associated with poor prognosis of intrahepatic cholangiocarcinoma (ICC). They further classified ICC patients into subgroups based on the mutation features of TP53 and KRAS, which could benefit the clinical management of ICC.

Johann et al. uncovered, through integrative analysis of genomic, transcriptomic, radiomic, and pathomic data, that mutations in the AKT1 and TP53 signaling pathways were closely associated with pulmonary sclerosing pneumocytoma (PSP). The insights into the underlying etiology and molecular behavior of PSP gained in this study may benefit corresponding therapy.

Gao et al. constructed an effective prognostic model for breast cancer using the genes differentially expressed among distinct glycosylation patterns. Their results highlight the value of risk score characterization based on glycosylation patterns for predicting the overall survival and immune infiltration of breast cancer patients.

Hao et al. identified two subgroups of lung adenocarcinoma (LUAD), characterized by MYC signaling inhibition and activation, through joint analysis of genomics, transcriptomics, and single-cell sequencing data from multiple cohorts. The two LUAD subgroups differed significantly in prognosis, genomic variations, immune microenvironment, and clinical features. Additionally, Jiang et al. built and validated a model for predicting the prognosis of LUAD by integrating bulk and single-cell RNA-seq data. They also detected two distinct subtypes of LUAD patients that differed in prognosis and immune characteristics.

Sun et al. systematically analyzed the transcriptome of synovial sarcoma in terms of gene expression, alternative splicing, gene fusion, and circular RNAs. Their integrative analysis provided new insights into the transcriptomic profile and the underlying molecular mechanism of synovial sarcoma.

Wang et al. constructed a clinical diagnostic map and a cluster prediction model for glioblastoma based on methylation, expression, and single-cell sequencing data. Their classification method could potentially promote the analysis of methylation heterogeneity in the promoter CpG islands of glioblastoma.

Zhao et al. revealed high cellular heterogeneity in both malignant and immune cells of diffuse large B-cell lymphoma (DLBCL). They provided novel insights into the transcriptional dynamics of the tumor microenvironment in DLBCL.

Zhang et al. established a prognostic model based on eight genes (DEFB1, AICDA, TYK2, CCR7, SCARB1, ULBP2, STC2, and LGR5) for predicting the overall survival of head and neck squamous cell carcinoma (HNSCC) patients. The low-risk and high-risk HNSCC groups showed higher and lower immune scores, respectively; thus, this eight-gene signature has the potential to be used in the clinical management of HNSCC.

Li et al. classified hepatocellular carcinoma patients into high-necroptosis and low-necroptosis groups, which differed significantly in survival time. They found that the high-necroptosis patients showed enriched expression of immune checkpoint-related genes and could benefit from certain immunotherapy.

Wang et al. uncovered four dysregulated oncogenic signaling pathways and identified related potential prognostic biomarkers for pan-cancer through systematic analysis of the TCGA multi-omics data. Their results could facilitate a better understanding of the function of oncogenic signaling pathways in human pan-cancer.

Kujawa et al. systematically evaluated the influence of six different data integration methods on single-cell analysis. They found that ComBat-seq (Zhang et al., 2020), limma (Leek et al., 2012), and MNN (Haghverdi et al., 2018) could effectively reduce batch effects while preserving the differences between distinct biological conditions.

Deng et al. constructed a gene expression omnibus database named ECO (https://heomics.shinyapps.io/ecodb/) for mouse endothelial cells based on the sequencing data of 203 samples from 71 different conditions. ECO enables researchers to conveniently explore endothelial expression profiles of diverse tissues under certain genetic modifications, disease models, and other stimulations in vivo.

Summary and perspectives

The studies published in this Research Topic reported meaningful results and offered new insights into the corresponding biomedical research. The cost of sequencing technologies is gradually decreasing, which facilitates multi-omics investigations. Bulk and single-cell protocols have their own advantages and limitations. Compared to single-cell sequencing methods, bulk approaches do not need living cells and the experimental procedures are usually simpler (Li et al., 2021). Profiling large numbers of samples is more affordable with bulk strategies, but bulk data cannot effectively provide information on cellular heterogeneity. Single-cell sequencing allows a better understanding of cell-to-cell variations and molecular dynamics at single-cell resolution.
However, existing single-cell technologies for generating different types of omics data still suffer from lower capture efficiency and higher technical noise than traditional bulk protocols (Mustachio and Roszik, 2022; Wen and Tang, 2022). Bulk and single-cell approaches are therefore complementary, and combining bulk and single-cell data is valuable for obtaining both cell-population and single-cell level perspectives (Li et al., 2021). For example, the proportions of cell subtypes in large-scale bulk data can be deconvoluted with the cell-type-specific signatures inferred from the single-cell data of a small number of samples (Aibar et al., 2017; Wang et al., 2019; Zaitsev et al., 2019; Decamps et al., 2020; Lin et al., 2022). Biomarkers identified in single-cell sequencing data can be further correlated with patient outcomes to assess their potential clinical value, using corresponding bulk data from public databases such as The Cancer Genome Atlas (TCGA) (Weinstein et al., 2013). Collectively, joint analysis of bulk and single-cell multi-omics data can help us gain a more comprehensive and systematic view of biological and clinical samples. The innovation of omics profiling technologies and of related machine learning methods for integrating different types of data will make multi-omics exploration more feasible and easier. We hope the studies published in this Research Topic will inspire biomedical researchers to better understand the benefit and value of multi-omics strategies.
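
To make the deconvolution idea mentioned above more concrete, the following is a minimal sketch and not the algorithm of any of the cited tools. It assumes a small signature matrix of mean expression per cell type (as might be derived from single-cell clusters) and models a bulk sample as a non-negative mixture of those signatures. The function name `deconvolve`, the cell-type labels, and all numeric values are hypothetical and chosen only for illustration; the solver is a plain projected gradient descent on the least-squares objective.

```python
"""Toy bulk deconvolution: estimate cell-type proportions from a signature matrix."""


def deconvolve(signature, bulk, n_iter=2000):
    """Return non-negative cell-type proportions that sum to 1.

    signature: rows are genes, columns are cell types (mean expression per
               type, e.g. derived from single-cell clusters).
    bulk:      one expression value per gene for a single bulk sample.
    Solves min_p ||signature @ p - bulk||^2 subject to p >= 0 by projected
    gradient descent, then rescales p so that it sums to 1.
    """
    n_genes, n_types = len(signature), len(signature[0])
    # The squared Frobenius norm bounds the largest eigenvalue of S^T S,
    # so its reciprocal is a safe (if conservative) step size.
    step = 1.0 / sum(v * v for row in signature for v in row)
    p = [1.0 / n_types] * n_types  # start from equal proportions
    for _ in range(n_iter):
        residual = [
            sum(row[k] * p[k] for k in range(n_types)) - b
            for row, b in zip(signature, bulk)
        ]
        gradient = [
            sum(signature[g][k] * residual[g] for g in range(n_genes))
            for k in range(n_types)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        p = [max(0.0, p[k] - step * gradient[k]) for k in range(n_types)]
    total = sum(p)
    return [v / total for v in p] if total > 0 else p


# Made-up signature matrix: two marker-like genes per hypothetical cell type.
cell_types = ["type_A", "type_B", "type_C"]
signature = [
    [9.0, 1.0, 0.5],
    [8.0, 0.5, 1.0],
    [1.0, 9.0, 0.5],
    [0.5, 8.0, 1.0],
    [0.5, 1.0, 9.0],
    [1.0, 0.5, 8.0],
]
true_proportions = [0.6, 0.3, 0.1]
# Simulate a noise-free bulk sample as the proportion-weighted mixture.
bulk = [sum(s * w for s, w in zip(row, true_proportions)) for row in signature]

estimated = deconvolve(signature, bulk)
for name, value in zip(cell_types, estimated):
    print(f"{name}: {value:.3f}")
```

On this noise-free toy input the estimate should land very close to the simulated proportions. Real bulk data are noisy and signature genes are rarely this clean, so dedicated deconvolution methods add marker selection, normalization, and more robust regression on top of this basic mixture model.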

Most cited references 17

The Cancer Genome Atlas Pan-Cancer analysis project.

John N Weinstein, Eric Collisson, Gordon Mills … (2014)

The Cancer Genome Atlas (TCGA) Research Network has profiled and analyzed large numbers of human tumors to discover molecular aberrations at the DNA, RNA, protein and epigenetic levels. The resulting rich data provide a major opportunity to develop an integrated picture of commonalities, differences and emergent themes across tumor lineages. The Pan-Cancer initiative compares the first 12 tumor types profiled by TCGA. Analysis of the molecular aberrations and their functional roles across tumor types will teach us how to extend therapies effective in one cancer type to others with a similar genomic profile.

The sva package for removing batch effects and other unwanted variation in high-throughput experiments.

Jeffrey Leek, W Johnson, Hilary Parker … (2012)

Heterogeneity and latent variables are now widely recognized as major sources of bias and variability in high-throughput experiments. The most well-known source of latent variation in genomic experiments are batch effects-when samples are processed on different days, in different groups or by different people. However, there are also a large number of other variables that may have a major impact on high-throughput measurements. Here we describe the sva package for identifying, estimating and removing unwanted sources of variation in high-throughput experiments. The sva package supports surrogate variable estimation with the sva function, direct adjustment for known batch effects with the ComBat function and adjustment for batch and latent variables in prediction problems with the fsva function.

SCENIC: Single-cell regulatory network inference and clustering

Sara Aibar, Carmen Bravo Gonzalez-Blas, Thomas Moerman … (2017)

Although single-cell RNA-seq is revolutionizing biology, data interpretation remains a challenge. We present SCENIC for the simultaneous reconstruction of gene regulatory networks and identification of cell states. We apply SCENIC to a compendium of single-cell data from tumors and brain, and demonstrate that the genomic regulatory code can be exploited to guide the identification of transcription factors and cell states. SCENIC provides critical biological insights into the mechanisms driving cellular heterogeneity.

Author and article information

Contributors

Geng Chen: URI : https://loop.frontiersin.org/people/652880/overview

Rongshan Yu: URI : https://loop.frontiersin.org/people/661994/overview

Xingdong Chen: URI : https://loop.frontiersin.org/people/651437/overview

Journal

Journal ID (nlm-ta): Front Genet

Journal ID (iso-abbrev): Front Genet

Journal ID (publisher-id): Front. Genet.

Title: Frontiers in Genetics

Publisher: Frontiers Media S.A.

ISSN (Electronic): 1664-8021

Publication date (Electronic): 04 January 2023

Publication date Collection: 2022

Volume: 13

Electronic Location Identifier: 1121999

Affiliations

[1] Stemirna Therapeutics Co., Ltd., Shanghai, China

[2] Department of Computer Science, School of Informatics, Xiamen University, Xiamen, China

[3] State Key Laboratory of Genetic Engineering, Human Phenome Institute, School of Life Sciences, Fudan University, Shanghai, China

Author notes

Edited and reviewed by: Quan Zou, University of Electronic Science and Technology of China, China

*Correspondence: Geng Chen, chengeng66666@ 123456outlook.com

This article was submitted to Computational Genomics, a section of the journal Frontiers in Genetics

Article

Publisher ID: 1121999

DOI: 10.3389/fgene.2022.1121999

PMC ID: 9845394

License:

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.
