Introduction

Bacterial pathogens have long been recognized to undergo phenotypic variation (reviewed in [1]). Historically, interest in this phenomenon has been fueled by the observation that phenotypic variants can differ in pathogenesis characteristics, such as increased or decreased virulence, or adaptation to a particular anatomic site. Extensive work has been directed at elucidating the molecular genetic events that contribute to phenotypic variation, with antigenic variation being the best-studied category. With few exceptions, most studies have focused on analysis of a distinct phenotype such as adhesin production or lipooligosaccharide structural modification. Several molecular mechanisms have been documented to contribute to phenotypic variation, the most common being slipped-strand mispairing events that result in phase-variable expression of the associated gene [1].

The group A streptococci (GAS) cause many distinct human infections [2]. Disease manifestations range from mild infections such as pharyngitis (“strep throat”) and impetigo, to extensive tissue destruction in the case of necrotizing fasciitis (the “flesh-eating” syndrome). Postinfection sequelae such as rheumatic fever and glomerulonephritis can also occur. The mechanisms that enable GAS to cause diverse diseases are unknown, although both bacterial and host-specific components are thought to be involved [3].

Associated morphologic and virulence variation in GAS has been known for almost 90 y [4,5]. Classic studies identified GAS phenotypic variation during invasive and upper respiratory tract infections [4,6]. More recently, correlations have been reported between the source of GAS clinical isolates and their ability to invade human epithelial cells or secrete high concentrations of virulence factors such as streptococcal pyrogenic exotoxin A, B, and C (SpeA, SpeB, and SpeC), or streptolysin O (SLO) [7–9].
Such correlations have been observed for multiple GAS serotypes, including clonal contemporary serotype M1 GAS [10]. The idea that GAS phenotypic heterogeneity contributes to distinct disease manifestations is supported by the identification of inherited alterations in virulence factor production when GAS is passaged in human blood ex vivo or through mice [5,11–14].

Virulence factor production by GAS is regulated by stand-alone transcription factors and two-component signal transduction systems (TCSs) [15]. Thirteen TCSs have been described in GAS, of which the CovRS system (also known as CsrRS) is the best characterized. CovRS is a negative regulatory TCS that directly or indirectly influences expression of 10% to 15% of GAS genes, including several virulence factors [16–21].

Despite these advances, we have an imprecise understanding of the contribution of phenotypic variation to host–pathogen interactions in GAS, and the molecular mechanism(s) controlling this heterogeneity. Recently, genome-wide investigative strategies have been used successfully to provide new information about GAS population genetics, evolution, and pathogenesis [22]. Inasmuch as phenotypic variation in GAS may be a key component of the pathogen life cycle, we chose to investigate this phenomenon using genome-wide analytic strategies, including transcriptome profiling and genome resequencing. Here we report genome, transcriptome, and partial secretome differences that distinguish GAS isolated from invasive and pharyngeal infections and permit a heretofore unattainable understanding of phenotypic variation in a microbial pathogen.

Results

Transcriptome-Based Grouping of Serotype M1 GAS Strains

The transcriptomes of nine contemporary (post-1987) serotype M1 GAS strains grown to early exponential phase in Todd-Hewitt broth with yeast extract (THY) were analyzed with an Affymetrix expression microarray.
These nine strains included six from patients with pharyngitis and three from invasive disease episodes and were selected from approximately 2,000 genetically characterized serotype M1 strains [10]. Two very distinct transcriptome clusters were identified based on analysis of the microarray data (Figure 1A). The three invasive isolates formed one cluster termed an invasive transcriptome profile (ITP), and the six pharyngitis isolates formed a second cluster termed a pharyngeal transcriptome profile (PTP). The data imply that GAS strains cultured from patients with pharyngeal and invasive disease have distinct transcriptomes, which are retained upon in vitro growth. Analysis of differential gene expression between the two transcriptome profiles identified 89 genes that were statistically significant (t-test followed by a false discovery rate correction, Q [...]

[...] >2-fold by ITP strains are colored red. Virulence factors/regulators transcribed >2-fold by PTP strains are colored blue. The emm gene, encoding the important virulence factor M protein, is highlighted yellow for reference. (155 KB PPT)

Figure S2 Schematic of Experiment Leading to Isolation of Mouse-Passaged GAS Derivatives
PTP GAS (blue box, nonmucoid) or ITP GAS (red box, mucoid) were injected subcutaneously into mice. Five days after infection mice were euthanized and GAS isolated from spleens and skin lesions. ITP GAS were isolated from the spleens and skin lesions of all infected mice. GAS recovered from skin lesions of mice infected with PTP GAS had an approximately 1:1 ratio of ITP to PTP GAS. (9.2 MB PPT)

Figure S3 ITP Strains Secrete Increased NADase Activity Compared to PTP Strains
NADase titers are shown on the y-axis, with different GAS strains shown on the x-axis. Color coding is as described for Figure 2B. The experiment was performed in duplicate and results identical to those shown were obtained on both occasions.
NEG, negative controls. (29 KB PPT)

Figure S4 Correlation of Microarray Data between ITP/PTP GAS Isolated from Clinical Sources and following Mouse Passage
The fold changes in transcript levels (ITP relative to PTP) of 24 virulence-related genes from the clinical GAS microarray (Figure 1) and the mouse-passaged GAS microarray (Figure 2) were log-transformed and plotted against each other to evaluate their correlation. (44 KB PPT)

Protocol S1 Comparative Genomic Resequencing (27 KB DOC)

Table S1 Serotype M1 Group A Streptococcus Isolates Studied (95 KB DOC)

Accession Numbers

Expression microarray data have been deposited at the Gene Expression Omnibus database at the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/geo) and are accessible through accession numbers GSE3899 and GSE3900. The GenBank (http://www.ncbi.nlm.nih.gov) accession number for the whole genome sequence of strain MGAS5005 is CP000017.
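
The comparison described for Figure S4 (ITP-relative-to-PTP fold changes from two microarray experiments, log-transformed and compared gene by gene) can be illustrated with a minimal sketch. The text does not state the logarithm base or the correlation statistic, so base 2 and the Pearson coefficient are assumptions here; the gene labels and expression values are hypothetical placeholders, not data from the study.

```python
import math


def log2_fold_change(itp_mean, ptp_mean):
    """Return log2(ITP / PTP). Base 2 is an assumption, not stated in the text."""
    if itp_mean <= 0 or ptp_mean <= 0:
        raise ValueError("expression values must be positive")
    return math.log2(itp_mean / ptp_mean)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical (ITP mean, PTP mean) signal pairs for illustration only.
clinical = {"gene_a": (800.0, 100.0), "gene_b": (50.0, 400.0), "gene_c": (300.0, 150.0)}
passaged = {"gene_a": (640.0, 160.0), "gene_b": (100.0, 400.0), "gene_c": (200.0, 200.0)}

# Compare only genes measured in both experiments, in a fixed order.
genes = sorted(set(clinical) & set(passaged))
clinical_lfc = [log2_fold_change(*clinical[g]) for g in genes]
passaged_lfc = [log2_fold_change(*passaged[g]) for g in genes]

r = pearson_r(clinical_lfc, passaged_lfc)
print(f"Pearson r across {len(genes)} genes: {r:.3f}")
```

Log-transforming first makes a 2-fold increase and a 2-fold decrease symmetric around zero, so up- and down-regulated genes contribute equally to the correlation. A rank-based statistic could be substituted if the fold changes are not roughly linearly related.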