      CCAST: A Model-Based Gating Strategy to Isolate Homogeneous Subpopulations in a Heterogeneous Population of Single Cells

          Abstract

          A model-based gating strategy is developed for sorting cells and analyzing populations of single cells. The strategy, named CCAST (Clustering, Classification and Sorting Tree), isolates homogeneous subpopulations from a heterogeneous population of single cells using a data-derived decision tree representation that can be applied to cell sorting. Because CCAST does not rely on expert knowledge, it removes human bias and variability when determining the gating strategy. It combines any clustering algorithm with silhouette measures to identify underlying homogeneous subpopulations, then applies recursive partitioning techniques to generate a decision tree that defines the gating strategy. CCAST produces an optimal strategy for cell sorting by automating the selection of gating markers, the corresponding gating thresholds and the gating sequence, all of which are typically defined manually. Although CCAST is optimized for cell sorting, it can also be applied to the identification and analysis of homogeneous subpopulations in heterogeneous single-cell data. We apply CCAST to single-cell data from both breast cancer cell lines and normal human bone marrow. On the SUM159 breast cancer cell line data, CCAST indicates at least five distinct cell states based on two surface markers (CD24 and EPCAM) and provides a gating strategy for sorting that produces more homogeneous subpopulations than previously reported. When applied to normal bone marrow data, CCAST reveals an efficient strategy for gating T-cells without prior knowledge of the major T-cell subtypes and the markers that best define them. On the normal bone marrow data, CCAST also reveals two major mature B-cell subtypes, namely CD123+ and CD123- cells, which were not revealed by manual gating but show distinct intracellular signaling responses. More generally, the CCAST framework could be used on other biological and non-biological high-dimensional data types that are mixtures of unknown homogeneous subpopulations.
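The abstract mentions combining a clustering algorithm with silhouette measures to judge how homogeneous the resulting subpopulations are. As a rough illustration only, and not the authors' implementation, the sketch below computes the standard silhouette width s(i) = (b - a) / max(a, b) for each cell, where a is the mean distance to cells in the same cluster and b is the smallest mean distance to any other cluster. The function name `silhouette_widths`, the two-marker toy values and both labelings are assumptions made up for this example.

```python
from math import dist
from statistics import mean


def silhouette_widths(points, labels):
    """Return one silhouette width per point (requires at least two clusters)."""
    clusters = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    widths = []
    for i, lab in enumerate(labels):
        own = [j for j in clusters[lab] if j != i]
        if not own:
            widths.append(0.0)  # common convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = mean(dist(points[i], points[j]) for j in own)
        # b: smallest mean distance to the members of any other cluster
        b = min(
            mean(dist(points[i], points[j]) for j in members)
            for other, members in clusters.items()
            if other != lab
        )
        denom = max(a, b)
        widths.append(0.0 if denom == 0 else (b - a) / denom)
    return widths


# Invented (marker 1, marker 2) intensities for six cells: two tight groups.
points = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.15),
          (2.0, 2.1), (2.1, 2.0), (2.05, 2.05)]

good_labels = [0, 0, 0, 1, 1, 1]  # labeling that follows the two groups
bad_labels = [0, 1, 0, 1, 0, 1]   # labeling that mixes the two groups

widths_good = silhouette_widths(points, good_labels)
widths_bad = silhouette_widths(points, bad_labels)
mean_good = mean(widths_good)
mean_bad = mean(widths_bad)
```

A labeling that follows the true groups should give a mean silhouette close to 1, while the mixed labeling scores noticeably lower. A higher mean is the sense in which silhouette measures can favor more homogeneous subpopulations.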

          Author Summary

          Sorting out homogeneous subpopulations in a heterogeneous population of single cells enables downstream characterization of specific cell types, such as cell-type-specific genomic profiling. This study proposes a data-driven gating strategy, CCAST, for sorting out homogeneous subpopulations from a heterogeneous population of single cells without relying on expert knowledge, thereby removing human bias and variability. In a fully automated manner, CCAST identifies the relevant gating markers, gating hierarchy and partitions that isolate homogeneous cell subpopulations. CCAST is optimized for cell sorting but can also be applied to the identification and analysis of homogeneous subpopulations. CCAST is shown to identify more homogeneous breast cancer subpopulations in SUM159 than prior sorting strategies. When applied to normal bone marrow single-cell data, CCAST proposes an efficient strategy for gating out T-cells without relying on expert knowledge; on B-cells, it reveals a new characterization of mature B-cell subtypes not revealed by manual gating.
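The summary describes automatically choosing gating markers, thresholds and the gating hierarchy. The following is a minimal sketch of that idea under stated assumptions, not the CCAST software. It takes cells that already carry cluster labels and recursively picks the marker and threshold that best separate the labels. Gini impurity is used here as a stand-in splitting criterion, since the source only says "recursive partitioning techniques". The function names (`gini`, `best_gate`, `build_gates`), the marker values and the population labels "A", "B" and "C" are all hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of cluster labels (0.0 means pure)."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_gate(cells, labels, markers):
    """Return (score, marker, threshold) minimising weighted Gini, or None."""
    best = None
    n = len(cells)
    for m in markers:
        values = sorted({cell[m] for cell in cells})
        # Candidate thresholds are midpoints between adjacent distinct values.
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2
            left = [lab for cell, lab in zip(cells, labels) if cell[m] <= thr]
            right = [lab for cell, lab in zip(cells, labels) if cell[m] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, m, thr)
    return best


def build_gates(cells, labels, markers):
    """Recursively split until every node holds a single population."""
    if len(set(labels)) == 1:
        return {"population": labels[0], "n": len(labels)}
    gate = best_gate(cells, labels, markers)
    if gate is None:  # no marker can separate the remaining cells
        majority = Counter(labels).most_common(1)[0][0]
        return {"population": majority, "n": len(labels)}
    _, m, thr = gate
    low = [(c, l) for c, l in zip(cells, labels) if c[m] <= thr]
    high = [(c, l) for c, l in zip(cells, labels) if c[m] > thr]
    return {
        "marker": m,
        "threshold": thr,
        "low": build_gates([c for c, _ in low], [l for _, l in low], markers),
        "high": build_gates([c for c, _ in high], [l for _, l in high], markers),
    }


# Invented intensities for six cells; labels stand in for a clustering result.
cells = [
    {"CD24": 0.1, "EPCAM": 0.2}, {"CD24": 0.2, "EPCAM": 0.3},
    {"CD24": 0.15, "EPCAM": 2.0}, {"CD24": 0.25, "EPCAM": 2.2},
    {"CD24": 2.0, "EPCAM": 2.1}, {"CD24": 2.1, "EPCAM": 1.9},
]
labels = ["A", "A", "B", "B", "C", "C"]

tree = build_gates(cells, labels, ["CD24", "EPCAM"])
```

On these toy values the first gate falls on CD24, which isolates population "C", and the second gate falls on EPCAM, which separates "A" from "B". The nested dictionary therefore encodes a marker order and thresholds that could be read off as a gating sequence.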

                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS Comput Biol
                PLoS Computational Biology
                Public Library of Science (San Francisco, USA)
                1553-734X
                1553-7358
                July 2014
                31 July 2014
                Volume: 10
                Issue: 7
                Article number: e1003664
                Affiliations
                [1]Department of Radiology, Center for Cancer Systems Biology, Stanford University, Stanford, California, United States of America
                University of California, San Diego, United States of America
                Author notes

                The authors have declared that no competing interests exist.

                Conceived and designed the experiments: BA SKP. Performed the experiments: BA MTD SKP. Analyzed the data: BA SKP. Contributed reagents/materials/analysis tools: BA MTD XZ SKP. Wrote the paper: BA SKP. Designed the software used in the analysis: BA.

                Article
                PCOMPBIOL-D-13-01516
                10.1371/journal.pcbi.1003664
                4117418
                25078380
                Copyright © 2014

                This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 26 August 2013
                Accepted: 25 April 2014
                Page count
                Pages: 14
                Funding
                This work was funded by NCI U54 CA149145 grant and the Gary M. Glazer Gift Fund. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Biology and Life Sciences
                Computational Biology

                Quantitative & Systems biology
