Expanding the genetic toolbox for the obligate human pathogen Streptococcus pyogenes

Abstract

Genetic tools form the basis for the study of molecular mechanisms. Despite many recent advances in the field of genetic engineering in bacteria, genetic toolsets remain scarce for non-model organisms, such as the obligate human pathogen Streptococcus pyogenes. To overcome this limitation and enable the straightforward investigation of gene functions in S. pyogenes, we have developed a comprehensive genetic toolset. By adapting and combining different tools previously applied in other Gram-positive bacteria, we have created new replicative and integrative plasmids for gene expression and genetic manipulation, constitutive and inducible promoters, and fluorescence reporters for S. pyogenes. The new replicative plasmids feature low- and high-copy replicons combined with different resistance cassettes and a standardized multiple cloning site for rapid cloning procedures. We designed site-specific integrative plasmids and verified their integration by nanopore sequencing. To minimize the effect of plasmid integration on bacterial physiology, we screened publicly available RNA-sequencing datasets for transcriptionally silent sites. We validated this approach by designing the integrative plasmid pSpy0K6 targeting the transcriptionally silent gene SPy_1078. Analysis of the activity of different constitutive promoters indicated a wide variety of strengths, with the lactococcal promoter P₂₃ showing the strongest activity and the synthetic promoter P_xylS2 showing the weakest activity. Further, we assessed the functionality of three inducible regulatory elements, including a zinc- and an IPTG-inducible promoter as well as an erythromycin-inducible riboswitch, that showed low-to-no background expression and high inducibility. Additionally, we demonstrated the applicability of two codon-optimized fluorescent proteins, mNeonGreen and mKate2, as reporters in S. pyogenes. To this end, we adapted the chemically defined medium RPMI4Spy, which showed reduced autofluorescence and enabled efficient signal detection in plate reader assays and fluorescence microscopy. Finally, we developed a plasmid-based system for genome engineering in S. pyogenes featuring the counterselection marker pheS*, which enabled the scarless deletion of the sagB gene. This new toolbox simplifies previously laborious genetic manipulation procedures and lays the foundation for new methodologies to study gene functions in S. pyogenes, leading to a better understanding of its virulence mechanisms and physiology.
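The screening of RNA-sequencing datasets for transcriptionally silent sites mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `find_silent_genes`, the 1.0 TPM cutoff, the gene labels, and all expression values are assumptions invented for illustration. The idea shown is only that a candidate integration site should stay at or below a chosen expression threshold in every dataset considered.

```python
from typing import Dict, List


def find_silent_genes(expression: Dict[str, List[float]],
                      max_tpm: float = 1.0) -> List[str]:
    """Return genes whose expression is at or below max_tpm in every dataset.

    expression maps a gene label to its TPM values, one per dataset.
    Genes without any measurement are skipped because silence cannot be judged.
    The 1.0 TPM default is an illustrative assumption, not a published cutoff.
    """
    return sorted(
        gene
        for gene, values in expression.items()
        if values and max(values) <= max_tpm
    )


# Hypothetical TPM values across three datasets (invented, not real data).
example = {
    "geneA": [0.2, 0.0, 0.5],       # low everywhere: candidate site
    "geneB": [120.0, 95.3, 101.7],  # highly expressed: excluded
    "geneC": [0.4, 3.1, 0.1],       # expressed in one condition: excluded
}

silent = find_silent_genes(example)
print(silent)
```

Requiring the maximum across datasets to stay under the cutoff, rather than the mean, is a deliberately conservative choice in this sketch: a gene that is induced in a single condition is rejected even if it is silent elsewhere.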

Related collections

iGEM

Author and article information

Contributors

Nina Lautenschläger

Katja Schmidt

Carolin Schiffer, URI: https://loop.frontiersin.org/people/721720/overview

Thomas F. Wulff

Karin Hahnke

Knut Finstermeier

Moïse Mansour, URI: https://loop.frontiersin.org/people/2711448/overview

Alexander K. W. Elsholz, URI: https://loop.frontiersin.org/people/843607/overview

Emmanuelle Charpentier, URI: https://loop.frontiersin.org/people/2662957/overview

Journal

Journal ID (nlm-ta): Front Bioeng Biotechnol

Journal ID (iso-abbrev): Front Bioeng Biotechnol

Journal ID (publisher-id): Front. Bioeng. Biotechnol.

Title: Frontiers in Bioengineering and Biotechnology

Publisher: Frontiers Media S.A.

ISSN (Electronic): 2296-4185

Publication date (Electronic): 07 June 2024

Publication date Collection: 2024

Volume: 12

Electronic Location Identifier: 1395659

Affiliations

[1] Max Planck Unit for the Science of Pathogens, Berlin, Germany

[2] Institut für Biologie, Humboldt-Universität zu Berlin, Berlin, Germany

Author notes

Edited by: Yaojun Tong, Shanghai Jiao Tong University, China

Reviewed by: Yunzi Luo, Tianjin University, China

Raquel Quatrini, Science for Life Foundation, Chile

*Correspondence: Emmanuelle Charpentier, research@emmanuelle-charpentier.org

ORCID: Nina Lautenschläger, orcid.org/0000-0002-6374-8118; Carolin Schiffer, orcid.org/0000-0002-4674-0894; Thomas F. Wulff, orcid.org/0000-0001-7166-0899; Knut Finstermeier, orcid.org/0009-0004-3420-1131; Moïse Mansour, orcid.org/0000-0003-1028-6113; Emmanuelle Charpentier, orcid.org/0000-0002-0254-0778

Article

Publisher ID: 1395659

DOI: 10.3389/fbioe.2024.1395659

PMC ID: 11190166

PubMed ID: 38911550

SO-VID: 7a4abfb9-0787-4cc1-850c-f97883faec98

License:

This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

History

Date received: 04 March 2024

Date accepted: 06 May 2024

Funding

The author(s) declare that financial support was received for the research, authorship, and/or publication of this article. This research project was supported by the Max Planck Society (EC) and the German Research Foundation (DFG, Leibniz Prize to EC).

Custom metadata

section-at-acceptance: Synthetic Biology

Keywords: streptococcus pyogenes, plasmid collection, inducible promoter, genetic engineering, reporter gene, scarless gene deletion, synthetic biology
