      Annotated Whole-Genome Multilocus Sequence Typing Schema for Scalable High-Resolution Typing of Streptococcus pyogenes

      Journal of Clinical Microbiology
      American Society for Microbiology


          Abstract

          Streptococcus pyogenes is a major human pathogen with high genetic diversity, largely created by recombination and horizontal gene transfer, making it difficult to use single nucleotide polymorphism (SNP)-based genome-wide analyses for surveillance. Using a gene-by-gene approach on 208 complete genomes of S. pyogenes, a novel whole-genome multilocus sequence typing (wgMLST) schema was developed, comprising 3,044 target loci. The schema was used for core-genome MLST (cgMLST) analyses of previously published data sets and 265 newly sequenced draft genomes with other molecular and phenotypic typing data. Clustering based on cgMLST data supported the genetic heterogeneity of many emm types and correlated poorly with pulsed-field gel electrophoresis macrorestriction profiling, superantigen gene profiling, and MLST sequence type, highlighting the limitations of older typing methods. While 763 loci were present in all isolates of a data set representative of S. pyogenes genetic diversity, the proposed schema allows scalable cgMLST analysis, which can include more loci for an increased resolution when typing closely related isolates. The cgMLST and PopPUNK clusters were broadly consistent in this diverse population. The cgMLST analyses presented results comparable to those of SNP-based methods in the identification of two recently emerged sublineages of emm 1 and emm 89 and the clarification of the genetic relatedness among isolates recovered in outbreak contexts. The schema was thoroughly annotated and made publicly available on the chewie-NS online platform (https://chewbbaca.online/species/1/schemas/1), providing a framework for high-resolution typing and analyzing the genetic variability of loci of particular biological interest.
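          The idea of scalable cgMLST mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline and does not use the chewBBACA or chewie-NS API; the toy allelic profiles, the function names (select_loci, allelic_distance, distance_matrix) and the min_presence parameter are all hypothetical and chosen only for illustration. The sketch assumes each isolate is described by one allele identifier per locus, with None marking a locus that was not called.

```python
from itertools import combinations

# Hypothetical toy allelic profiles: isolate -> {locus: allele id}.
# None means the locus was not called in that isolate.
profiles = {
    "isolate_A": {"locus1": 1, "locus2": 4, "locus3": 2, "locus4": None},
    "isolate_B": {"locus1": 1, "locus2": 4, "locus3": 3, "locus4": 7},
    "isolate_C": {"locus1": 2, "locus2": 5, "locus3": None, "locus4": 7},
}


def select_loci(profiles, min_presence=1.0):
    """Return the loci called in at least `min_presence` of the isolates.

    min_presence=1.0 keeps only loci present in every isolate (a strict
    core genome); lower values admit more loci.
    """
    n_isolates = len(profiles)
    all_loci = sorted({locus for p in profiles.values() for locus in p})
    selected = []
    for locus in all_loci:
        called = sum(p.get(locus) is not None for p in profiles.values())
        if called / n_isolates >= min_presence:
            selected.append(locus)
    return selected


def allelic_distance(a, b, loci):
    """Count loci where both isolates have a call and the alleles differ."""
    return sum(
        1
        for locus in loci
        if a.get(locus) is not None
        and b.get(locus) is not None
        and a[locus] != b[locus]
    )


def distance_matrix(profiles, loci):
    """Pairwise allelic distances keyed by (isolate, isolate) tuples."""
    return {
        (x, y): allelic_distance(profiles[x], profiles[y], loci)
        for x, y in combinations(sorted(profiles), 2)
    }


strict_loci = select_loci(profiles, min_presence=1.0)
relaxed_loci = select_loci(profiles, min_presence=0.6)

print("strict loci:", strict_loci)
print("strict distances:", distance_matrix(profiles, strict_loci))
print("relaxed loci:", relaxed_loci)
print("relaxed distances:", distance_matrix(profiles, relaxed_loci))
```

          In this toy data, isolate_A and isolate_B are identical at the two loci present in every isolate, and only become distinguishable once loci present in a subset of isolates are included. This mirrors, in a simplified way, why including more loci can increase resolution among closely related isolates. Ignoring loci with a missing call in either isolate is one possible convention, assumed here for simplicity.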

          Most cited references (49)

          Reference sequence (RefSeq) database at NCBI: current status, taxonomic expansion, and functional annotation

          The RefSeq project at the National Center for Biotechnology Information (NCBI) maintains and curates a publicly available database of annotated genomic, transcript, and protein sequence records (http://www.ncbi.nlm.nih.gov/refseq/). The RefSeq project leverages the data submitted to the International Nucleotide Sequence Database Collaboration (INSDC) against a combination of computation, manual curation, and collaboration to produce a standard set of stable, non-redundant reference sequences. The RefSeq project augments these reference sequences with current knowledge including publications, functional features and informative nomenclature. The database currently represents sequences from more than 55 000 organisms (>4800 viruses, >40 000 prokaryotes and >10 000 eukaryotes; RefSeq release 71), ranging from a single record to complete genomes. This paper summarizes the current status of the viral, prokaryotic, and eukaryotic branches of the RefSeq project, reports on improvements to data access and details efforts to further expand the taxonomic representation of the collection. We also highlight diverse functional curation initiatives that support multiple uses of RefSeq data including taxonomic validation, genome annotation, comparative genomics, and clinical testing. We summarize our approach to utilizing available RNA-Seq and other data types in our manual curation process for vertebrate, plant, and other species, and describe a new direction for prokaryotic genomes and protein name management.

            The global burden of group A streptococcal diseases.

            The global burden of disease caused by group A streptococcus (GAS) is not known. We review recent population-based data to estimate the burden of GAS diseases and highlight deficiencies in the available data. We estimate that there are at least 517,000 deaths each year due to severe GAS diseases (eg, acute rheumatic fever, rheumatic heart disease, post-streptococcal glomerulonephritis, and invasive infections). The prevalence of severe GAS disease is at least 18.1 million cases, with 1.78 million new cases each year. The greatest burden is due to rheumatic heart disease, with a prevalence of at least 15.6 million cases, with 282,000 new cases and 233,000 deaths each year. The burden of invasive GAS diseases is unexpectedly high, with at least 663,000 new cases and 163,000 deaths each year. In addition, there are more than 111 million prevalent cases of GAS pyoderma, and over 616 million incident cases per year of GAS pharyngitis. Epidemiological data from developing countries for most diseases is poor. On a global scale, GAS is an important cause of morbidity and mortality. These data emphasise the need to reinforce current control strategies, develop new primary prevention strategies, and collect better data from developing countries.

              MLST revisited: the gene-by-gene approach to bacterial genomics.

              Multilocus sequence typing (MLST) was proposed in 1998 as a portable sequence-based method for identifying clonal relationships among bacteria. Today, in the whole-genome era of microbiology, the need for systematic, standardized descriptions of bacterial genotypic variation remains a priority. Here, to meet this need, we draw on the successes of MLST and 16S rRNA gene sequencing to propose a hierarchical gene-by-gene approach that reflects functional and evolutionary relationships and catalogues bacteria 'from domain to strain'. Our gene-based typing approach using online platforms such as the Bacterial Isolate Genome Sequence Database (BIGSdb) allows the scalable organization and analysis of whole-genome sequence data.

                Author and article information

                Journal: Journal of Clinical Microbiology (J Clin Microbiol)
                Publisher: American Society for Microbiology
                ISSN: 0095-1137, 1098-660X
                Published: June 15 2022
                Volume: 60
                Issue: 6
                Affiliations
                [1] Instituto de Microbiologia, Instituto de Microbiologia Molecular, Faculdade de Medicina, Universidade de Lisboa, Lisbon, Portugal
                Article
                DOI: 10.1128/jcm.00315-22
                PMID: 35531659
                Record ID: 98c35998-6674-419b-931b-aead348384b6
                © 2022

                https://doi.org/10.1128/ASMCopyrightv2

                https://journals.asm.org/non-commercial-tdm-license
