      Roles of the components of the cag-pathogenicity island encoded type IV secretion system in Helicobacter pylori

          Most cited references (96)

          Biological diversity of prokaryotic type IV secretion systems.

          Type IV secretion systems (T4SS) translocate DNA and protein substrates across prokaryotic cell envelopes generally by a mechanism requiring direct contact with a target cell. Three types of T4SS have been described: (i) conjugation systems, operationally defined as machines that translocate DNA substrates intercellularly by a contact-dependent process; (ii) effector translocator systems, functioning to deliver proteins or other macromolecules to eukaryotic target cells; and (iii) DNA release/uptake systems, which translocate DNA to or from the extracellular milieu. Studies of a few paradigmatic systems, notably the conjugation systems of plasmids F, R388, RP4, and pKM101 and the Agrobacterium tumefaciens VirB/VirD4 system, have supplied important insights into the structure, function, and mechanism of action of type IV secretion machines. Information on these systems is updated, with emphasis on recent exciting structural advances. An underappreciated feature of T4SS, most notably of the conjugation subfamily, is that they are widely distributed among many species of gram-negative and -positive bacteria, wall-less bacteria, and the Archaea. Conjugation-mediated lateral gene transfer has shaped the genomes of most if not all prokaryotes over evolutionary time and also contributed in the short term to the dissemination of antibiotic resistance and other virulence traits among medically important pathogens. How have these machines adapted to function across envelopes of distantly related microorganisms? A survey of T4SS functioning in phylogenetically diverse species highlights the biological complexity of these translocation systems and identifies common mechanistic themes as well as novel adaptations for specialized purposes relating to the modulation of the donor-target cell interaction.

            Helicobacter exploits integrin for type IV secretion and kinase activation.

            Integrins are important mammalian receptors involved in normal cellular functions as well as pathogenesis of chronic inflammation and cancer. We propose that integrins are exploited by the gastric pathogen and type-1 carcinogen Helicobacter pylori for injection of the bacterial oncoprotein cytotoxin-associated gene A (CagA) into gastric epithelial cells. Virulent H. pylori express a type-IV secretion pilus that injects CagA into the host cell; CagA then becomes tyrosine-phosphorylated by Src family kinases. However, the identity of the host cell receptor involved in this process has remained unknown. Here we show that the H. pylori CagL protein is a specialized adhesin that is targeted to the pilus surface, where it binds to and activates integrin alpha5beta1 receptor on gastric epithelial cells through an arginine-glycine-aspartate motif. This interaction triggers CagA delivery into target cells as well as activation of focal adhesion kinase and Src. Our findings provide insights into the role of integrins in H.-pylori-induced pathogenesis. CagL may be exploited as a new molecular tool for our further understanding of integrin signalling.
              Is Open Access

              A Global Overview of the Genetic and Functional Diversity in the Helicobacter pylori cag Pathogenicity Island

Introduction

Helicobacter pylori persistently infects more than one half of all humans, and can cause ulcer disease, gastric cancer, and MALT lymphoma [1]. The H. pylori cag pathogenicity island (cagPAI) is an intriguing virulence module of this obligate host-associated bacterium [2]–[4]. H. pylori strains that possess a functional cagPAI are particularly frequently associated with severe sequelae, notably gastric atrophy and cancer [4]–[7]. The cagPAI is ∼37 kb long, and contains ∼28 genes [3]. These genes encode multiple structural components of a bacterial type IV secretion system (t4ss) as well as the 128 kDa effector protein, CagA [7]. After H. pylori has adhered to a host cell, the Cag t4ss translocates CagA into that cell. CagA is subsequently phosphorylated by host cell kinases and interacts with multiple targets (e.g. SHP-2, Grb2, FAK), profoundly altering host cellular functions [8], [9]. The alterations induced by the cagPAI are thought to ultimately contribute to malignant transformation [4], [10], and CagA has been designated a bacterial oncoprotein [11].

H. pylori has a high mutation rate, which has resulted in extensive genetic diversity [12], and also recombines frequently with other H. pylori [13]. H. pylori isolates have been subdivided into distinct biogeographic populations and subpopulations with specific geographical distributions that reflect ancient human migrations [14]–[16]. The global population structure of H. pylori is now well understood based on multilocus haplotypes from seven housekeeping genes. However, very little is known about the biogeographic variation of virulence factors, such as the cagPAI, nor has the impact of genetic variation on disease outcome and host adaptation been adequately addressed. Previous analyses on the basis of comparative genome hybridization have demonstrated marked differences between biogeographic populations with respect to the cagPAI [17].
Microarray analysis of 56 globally representative strains of H. pylori revealed that the cagPAI was present in almost all strains from some biogeographic populations and subpopulations in Africa and Asia, while it was variably present in other populations [17]. The cagPAI was lacking in all isolates of hpAfrica2, which is distantly related to the other populations [17]. Currently, nine complete cagPAI sequences are publicly available [2], [18]–[22], whose isolates belong to hpEurope (7 sequences), hspWAfrica (1) and hspEAsia (1) (see Results), and no sequence data is available for the cagPAI in the other six populations and subpopulations where the cagPAI is present.

Here we analyze complete cagPAI sequences from 38 isolates representing all known H. pylori populations and subpopulations and compare their genetic polymorphisms with measures of functional expression. Our data show that the cagPAI has shared a long evolutionary history with the H. pylori core genome, and displays a remarkable global conservation of gene content, structure and function, with minor exceptions. We provide evidence that the cagPAI was acquired by ancestral H. pylori in a single event that occurred before modern humans migrated out of Africa. Sequence comparisons identified domains in multiple components of the t4ss that are likely to be under diversifying selection, and these findings can guide future research into the function of t4ss components.

Results

Distribution of the cagPAI in a global collection of H. pylori

In order to define the occurrence of the cagPAI in H. pylori, we screened a globally representative collection of H. pylori isolates from 53 different geographical or ethnic sources [15], [16] (Figure 1). 877 isolates were tested for the presence of the cagPAI by a PCR approach.
Strains were classified as cagPAI-positive if we succeeded in separate PCR amplifications for the 5′ and 3′ ends of the cagPAI, or as cagPAI-negative if we succeeded in amplifying an empty site with primers from the flanking regions. The cagPAI was present in at least 95% of strains assigned to the hpAfrica1 (hspWAfrica plus hspSAfrica), hpEastAsia (hspEAsia, hspMaori) and hpAsia2 populations. In contrast, none of the hpAfrica2 strains possessed the cagPAI, and it was only variably present in strains from the populations hpEurope (225/330 strains; 58%), hpNEAfrica (58/72; 81%), and hpSahul (32/49; 65%) or the hspAmerind subpopulation of hpEastAsia (5/18; 28%).

Figure 1 (10.1371/journal.pgen.1001069.g001). Distribution of the cag pathogenicity island in a global collection of H. pylori strains from different populations.
(A) Neighbor joining (NJ) tree of neutral genetic relatedness of H. pylori strains, including information about the presence or absence of the cagPAI. The NJ tree was calculated from concatenated sequences of seven housekeeping genes (length 3406 bp) from 877 isolates of H. pylori [16] plus 9 additional isolates from which either cagPAI sequences [20] or whole genome sequences had been published (indicated by arrows; [2], [18]–[22]). Each strain was scored for presence (filled triangles) or absence (empty circles) of the cagPAI based on the results of PCR reactions that span the ends of the cagPAI. Population assignments based on Bayesian analyses [15], [16] are indicated by the color coding of symbols that correspond to the labels next to the tree; red symbols indicate all strains whose cagPAI sequences are now available, including the 29 strains that have been newly selected for cagPAI sequence analysis.
(B) Geographic sources of strains whose cagPAI sequences are now available. Each dot indicates the source of isolation of one of the 38 cagPAI sequences that were analyzed. The dots are color-coded by population or subpopulation as in (A).
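The PCR scoring rule stated before Figure 1 can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the handling of strains that fit neither rule (returned as `None`) is an assumption because the text does not describe it, and positive scoring is given precedence if all three products are obtained. The counts used for the percentages are the ones quoted in the text for hpNEAfrica, hpSahul and hspAmerind.

```python
from typing import Optional


def classify_cagpai(five_prime: bool, three_prime: bool, empty_site: bool) -> Optional[str]:
    """Score one strain from three PCR outcomes (True = product obtained).

    Hypothetical helper: both cagPAI ends amplified -> positive;
    empty-site product from flanking primers -> negative.
    """
    if five_prime and three_prime:
        return "cagPAI-positive"
    if empty_site:
        return "cagPAI-negative"
    # Assumption: anything else is left unscored; the text does not say
    # how such strains were treated.
    return None


def prevalence(positive: int, total: int) -> int:
    """Percentage of cagPAI-positive strains, rounded to the nearest integer."""
    return round(100 * positive / total)


# Counts quoted in the text (positive strains, strains tested).
counts = {"hpNEAfrica": (58, 72), "hpSahul": (32, 49), "hspAmerind": (5, 18)}

for population, (pos, total) in counts.items():
    print(f"{population}: {pos}/{total} = {prevalence(pos, total)}%")
```

Run as written, the loop reproduces the rounded percentages given in the text for these three groups (81%, 65% and 28%).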
Based on their multilocus sequence typing (MLST) haplotypes, seven strains with published cagPAI sequences belong to the hpEurope population (NCTC11638 from Australia [2]; 26695 from England [18]; and DU23, DU52, Ca52, Ca73 [20] and HPAG1 [21] from Sweden). J99 from the U.S.A. [22] belongs to hpAfrica1, and F32 [19] from Japan belongs to the hspEAsia population of hpEastAsia. None of these published cagPAI sequences were from strains of the hpNEAfrica, hpSahul, or hpAsia2 populations, from the hpEastAsia subpopulations hspAmerind or hspMaori, or from the hpAfrica1 subpopulation hspSAfrica, although those populations are also potentially important for our understanding of the evolutionary history of H. pylori. We therefore selected 29 strains from our global strain collection to supplement these nine published cagPAI sequences and provide a globally representative sample of cagPAI diversity (Figure 1). These strains included all known biogeographic populations, except for the cag-negative hpAfrica2. The entire cagPAI, approximately 37 kilobasepairs in length, was sequenced and annotated from each of the 29 strains, either after shot-gun cloning of overlapping long-range PCR products or via direct amplification of multiple, smaller PCR products.

Conserved synteny and low macrodiversity in the cagPAI

The 38 complete cagPAI sequences were compared by pairwise sequence alignments and by a multiple alignment in Kodon relative to the cagPAI from J99 used as a scaffold sequence (Figure 2). The general pattern of gene content and gene order (signifying macrodiversity) was similar in most sequences, with only limited variation due to changed synteny or deletions. Synteny changes resulted from genomic rearrangements, horizontal genetic exchange (e.g. replacement of HP0521 by HP0521B), possibly in conjunction with IS (insertion sequence) element insertion, or gene inversions, such as for HP0535.
Insertions, deletions, point mutations, frameshift mutations or disruption through insertion elements (Figure S1) were also observed in some of the cagPAI sequences, some of which should have resulted in pseudogenes. We therefore tested all strains for their ability to induce interleukin-8 (IL-8) in gastric epithelial cells (Figure 2, Figure 3), as an indicator of PAI function [23]. Most of the strains containing a cagPAI were able to induce IL-8, indicating that many of the mutations did not drastically reduce the general function of the cagPAI (Table 1).

Figure 2 (10.1371/journal.pgen.1001069.g002). Conservation of the cagPAI genetic organization across H. pylori biogeographic populations.
The sequences were aligned in KODON using the cagPAI of strain J99 as a scaffold sequence. Individual isolates are grouped according to biogeographic (sub-)populations. The continuity of the cagPAI was disrupted in isolates PAL3414, V225 and HUI1769, and fragments found in secondary locations are displayed in grey-shaded boxes on separate lines. The two cagPAI sequences from reference strains J99 and 26695 were extracted from whole genomes. Genes essential for a basic function of the cagPAI type IV secretion system (IL-8 induction; [3]) are labeled with an asterisk (*). Activity of the Cag t4ss (IL-8 secretion; + or −) was monitored during experimental infection of AGS cells with H. pylori. Obs., observed IL-8 secretion; exp., IL-8 secretion expected from the cagPAI sequence; red, genes in forward orientation; blue, genes in reverse orientation; light blue, shorter gene version; white, different gene HP0521B [20] in this locus; yellow, pseudogenes; black, IS elements; green, cagPAI insertion sites. Diamonds: frameshift mutations leading to pseudogenes. Δ followed by numbers 1 through 10 indicate different deletions (manifestation of macrodiversity) and are consecutively numbered as mentioned in the text and Table 1.
a,b,c,d: strains not functionally tested in this study possess functional cagPAIs according to the following references: a [20]; b [21]; c [2]; d [19].

Figure 3 (10.1371/journal.pgen.1001069.g003). Variability of Cag t4ss function in H. pylori strains from different biogeographic populations.
(A) IL-8 induction in human gastric epithelial cells by diverse H. pylori strains from different biogeographic populations. IL-8 secretion induced at 20 h post infection by live H. pylori in gastric epithelial cells (AGS, shown here, and MKN28, data not shown) was determined as a read-out for Cag t4ss activity. The two strains J99 and 26695A, for which entire genome sequences are available, were included as positive controls. CagA EPIYA motifs for each strain are indicated on top of the graph. Exceptions in the genetic integrity of some of the islands and other explanations for an observed loss of functionality are indicated above the single bars. Colored bars designate the population assignments of strains. Coincubation experiments were performed independently at least three times for each strain, with similar results, and one representative experiment, performed in triplicates for each strain, is shown. IL-8 secretion is depicted in relative values, as a multiple of the negative control (mock), which was set to 1.
(B,C) Assessing underlying causes of loss of function of cagPAIs in some H. pylori strains.
(B) CagA translocation assays performed after infection of AGS cells with the two selected H. pylori strains D3A and M49. These displayed loss of cagPAI-related activity in IL-8 release assays. Both strains were unable to translocate CagA into human gastric epithelial cells. Strains SU2, N6, and 26695A wild type (wt) were used as positive controls for CagA translocation. Strains SU2Δcag and 26695AΔcag (isogenic cagPAI deletion mutants to SU2 and 26695A) were included as negative controls.
(C) Transcript amounts of single cagPAI genes.
30 strains (4 strains shown here – for complete results see Table S3) were studied using semiquantitative RT PCR for each gene with known function in the Cag t4ss (refer to Table 2 for gene names). Two strains with loss of t4ss function, CC72C, and M49, are shown. TAI196 and 26695A are depicted as positive controls. TAI196, a strain with a high propensity to induce IL-8, shows relatively high transcript amounts for the majority of genes. Strains CC42C and L72 (not shown) which have pseudogenes and lost the ability to induce IL-8, showed low or undetectable transcript amounts for some genes including the pseudogenes. M49 displayed low transcript amounts for a number of essential genes of the t4ss located predominantly in the right half of the cagPAI (genes HP0528, and HP0537 to HP0544).

Table 1 (10.1371/journal.pgen.1001069.t001). Genetic macro- and minidiversity variants (gene order and orientation, gene identity, insertion elements) within the H. pylori cagPAI with regard to population assignments.
Frequency | Type | Occurrence (population or strain) | Gene and/or position in J99 | IL-8 induction
Frequent | Mini-IS606a | hspWAfrica (3/4), hpEurope (7/12), hspAmerind (3/3) | 41–232 (192 bp) | +
Frequent | Mini-IS606b | hpAsia2 (6/6), hspEAsia (4/6), hspMaori (3/3) | 21300 (140 bp) | +
Frequent | Mini-IS606c | hpAfrica1 (6/6), hpEurope (8/12), hpAsia2 (6/6), hpNEAfrica (1/1), hspAmerind (3/3) | 36969–37098 (130 bp) | +
Frequent | Mini-IS606d | hspEAsia (6/6), hspMaori (3/3), hpEurope (3/12) | 36969–37098 (303 bp) | +
Frequent | Inversion | hpEastAsia (11/12)¹ | HP0535 | +
Frequent | Deletion 2 (Δ2) | hpEastAsia (11/12) | HP0521, 843–1467 | +
Frequent | Shortened gene | hpAsia2 (6/6), hspEAsia (1/6), hpEurope (2/12), hspWAfrica (1/4)² | HP0521 | +
Frequent | Rearrangement | hspAmerind (2/3) | HP0536 – HP0547, 21182–36740 | +
Frequent | Mini-IS605 | hpEurope (5/12), hpNEAfrica (1/1), hpAsia2 (1/6) | 37003 | +
Frequent | Replacement | hpEurope (5/12), hpNEAfrica (1/1)³ | HP0521B, 797–1392 | +
Rare | Frameshift | CC42C | HP0524, 5166 (+1, 7C → 8C) | −
Rare | Frameshift | CC42C | HP0527, 12851 (+1, 3A → 4A) | −
Rare | Frameshift | CC42C | HP0529, 16416 (+1, 2T → 3T) | −
Rare | Frameshift | CC42C | HP0537, 22392 (−1, 7A → 6A) | −
Rare | Frameshift | HPAG1 | HP0527, 11326 (−1, 6A → 5A) | +⁴
Rare | Frameshift | HPAG1 | HP0544, 30118 (−1, 3A → 2A) | +⁴
Rare | Frameshift | L72 | HP0547, 34034 (−1, 7A → 6A) | −
Rare | Stop codon | L72 | HP0530, 16932, CGA → TGA | −
Rare | Mini-IS605 | MOR3457 | 17505 | +
Rare | Mini-IS606e | 26695 | 36969–37003 (35 bp) | +
Rare | IS605 | NCTC11638, HUI1769⁵ | 20345 | +
Rare | IS606 | Ca52 | 3605 | +
Rare | IS606 | CC42C | 30450–33503 | −
Rare | IS607 | RE12001⁶ | 37718 | +
Rare | IS608 | HUI1769 | 32724 | +
Rare | Rearrangement | NCTC11638⁷ | HP0535 – HP0549, 20345 | +
Rare | Rearrangement | DU52:2, PAL3414 | HP0547 – HP0549, 33360 | +
Rare | Deletion 1 (Δ1) | Ca52 | 618–1467 | +
Rare | Deletion 3 (Δ3) | CC42C | 30450–33503 | −
Rare | Deletion 4 (Δ4) | HUI1692 | 21182–33406 | −
Rare | Deletion 5 (Δ5) | V225, HUI1769 | 21182–32492 | +⁸
Rare | Deletion 6 (Δ6) | HUI1692 | 33593–34247 | +
Rare | Deletion 7 (Δ7) | V225 | 33596–34318 | +
Rare | Deletion 8 (Δ8) | V225 | 33450–34247 | +
Rare | Deletion 9 (Δ9) | HUI1769 | 32669–33116 | +
Rare | Deletion 10 (Δ10) | HUI1769 | 33692–34100 | +

a, b, c, d, e represent different genetic variants of mini-IS606; mini-IS606 variants c, d and e were collectively referred to as “remnant IS606* within the cag right end segment” by Kersulyte et al. [26].
¹ Also in 8/11 strains from Japan [19]. The inversion encompasses a total of 1230 bp that are present in hpAsia2 and consists of HP0535 plus 483 bp of upstream and 381 bp of downstream flanking non-coding DNA. The homologous stretch in J99 contains flanking non-coding DNA stretches of 50 bp upstream and 160 bp downstream that are replaced by 490 bp and 460 bp, respectively, in hpAsia2 strain KAZ3173 (see Figure S1).
² 357 bp versus 659 bp for HP0521 in J99.
³ Also in 34/63 strains from Sweden [20].
⁴ IL-8 induction is according to data published by Oh et al. [21]. However, HP0527 and HP0544 possess frameshift mutations that would normally prevent induction of IL-8.
⁵ Found in 1/95 additional strains from a global survey (this study) and 11/40 strains from Italy [2].
⁶ Found in 1/95 additional strains from a global survey (this study).
⁷ Also found in 4/40 strains from Italy [2].
⁸ Deletion would prevent IL-8 induction. IL-8 induction is observed because of the presence of HP0536 – HP0547 in another genomic location.

Fixed and transient variants in cagPAI sequence organization

Most new mutations are deleterious, whether associated with single nucleotide polymorphisms, mobile elements or genomic rearrangements, and will be removed by purifying selection. However, mutations without a drastic effect on fitness, so-called neutral or nearly neutral mutations, can remain as rare variants within a population for long time periods. The vast majority of such mutations remain at low frequency until they are (usually) lost due to genetic drift. Rare neutral mutations can become more frequent over time, or even become fixed, also due to genetic drift [24]. Still other mutations are under positive selection. These rapidly become frequent or fixed due to Darwinian selection.
In isolated clonal populations, Muller's ratchet can even result in some deleterious mutations rising to high frequency [25] and the same is true of extreme bottlenecks, which can fix deleterious mutations immediately. These basic evolutionary principles indicate that the demographies of rare versus frequent mutations differ and should be examined separately.

Frequent variants

A number of frequent cagPAI macrodiversity variants were found, some of which were present in all isolates of at least one sub-population, or almost all isolates (Table 1). These included insertion events due to one of three variants of IS606 [26] or of a mini-IS605 insertion [27], [28], an inversion of gene HP0535 plus its flanking non-coding DNA, a deletion of either the complete HP0521 ORF (Δ2; Figure 2) or part of that ORF, or the replacement of HP0521 by the unrelated ORF HP0521B (Figure 2, Table 1). Additionally, most of the 3′ (right) half of the cagPAI is lacking in all three hspAmerind strains due to one of two similar 11.2 kb deletions with distinct 3′ ends (Δ4, Δ5; Figure 2). These large deletions terminate within HP0546, and are associated with a second (intergenic) deletion of 410 bp or a 620 bp deletion that terminates within the N-terminal part of HP0547 (cagA). In strains V225 and HUI1769, a copy of the deleted segment plus the HP0546 and HP0547 ORFs have translocated to a separate, currently unidentified, location of the chromosome, leaving a shortened version of HP0546 at the original location (Figure 2). It is interesting to note that IL-8 induction was not eliminated by any of these frequent mutations (Figure 2, Figure 3, Table 1), suggesting that they are not deleterious to cagPAI function, and might be neutral or even under positive selection.

Rare variants

Rare variants were present in only one or two strains, are probably transient, and will tend to disappear during genetic drift [29].
The rare variants included frameshift mutations in multiple ORFs within three single isolates (CC42C, HPAG1 and L72) and IS elements (mini-IS605, IS605, IS606, IS607 or IS608 [26]) that have integrated at distinct locations in 7 other isolates (Table 1; Figure S1). Our dataset consisted of only 38 isolates, and it was possible that these rare mutations might be more widely distributed. We therefore screened 95 other globally representative strains for the presence of IS605, IS606, IS607 or IS608 at those locations, but only identified two additional strains with IS element insertions, one each for IS605 (MOR3055 – hspWAfrica) and IS607 (BASQ9523 – hpEurope) (data not shown). Thus, strains carrying these particular insertion mutations really are rare. We also found two rare, distinct genomic rearrangements (Table 1). One of these was in strain NCTC11638 from Australia and has been reported previously [2]. It splits the cagPAI between ORFs HP0534 and HP0535 into two segments, one of which is translocated elsewhere in the genome, and is distinct from the split of the cagPAI in the hspAmerind strains. Previous analyses identified the same rearrangement in 4/40 strains from Italy [2], but it was not found in any of the other 38 cagPAI sequences analyzed here nor in any of the 95 other, globally representative strains that we investigated by PCR. The other rearrangement separated HP0547 (cagA) through HP0549 plus flanking DNA from the rest of the cagPAI. It has been previously described for two hpEurope strains from Sweden and one from Australia [20]. We found the same pattern in a fourth hpEurope strain isolated in Palestine (PAL3414). Both of these rearrangements were present in less than 5% of isolates. The 17 rare mutations were identified in a total of 12 isolates. Only three of those, CC42C, HUI1692 and L72, did not induce IL-8, indicating that the majority of the rare sequence changes also did not cause a severe loss of cagPAI function. 
This observation is compatible with most of the rare mutations being selectively neutral or near-neutral.

Genomic decay

Three overlapping small deletions (Δ1, Δ2, Δ3) that removed the HP0521 ORF were found in all but one hpEastAsia isolate, one hpEurope isolate and the hpSahul strain (Figure 2; Table 1), but those did not abolish cagPAI function (see above). Eight other deletions were found in four individual strains (Figure 2). Two of these isolates were unable to induce IL-8: CC42C (hspSAfrica) contains multiple frameshift mutations and an insertion of IS606 as well as deletion Δ11, which removes part of cagA (HP0547). Δ4 and Δ6 deleted half of the cagPAI in hspAmerind strain HUI1692. The cagPAI is clearly decaying in both CC42C and HUI1692. In contrast, although deletions Δ5 and Δ7–Δ10 also removed large parts of the cagPAI in hspAmerind strains V225 and HUI1769, these deletions occurred in a segment that has been duplicated to a separate location (see above) and these two isolates remain able to induce IL-8. Thus, with one exception (Δ1), these deletions are rare and seem to be associated with accelerated decay of non-functional cagPAI genes. In addition, the cagPAI in non IL-8-inducing strain L72 also contained one frameshift and one premature stop codon in a coding region, and seems to be undergoing decay.

Signatures of selection within individual cagPAI genes

Darwinian selection for variation in coding regions can also be exerted at the nucleotide or protein level. We therefore analyzed sequence polymorphisms (microdiversity) in individual cagPAI genes for traces of such selection (Materials and Methods). Similar to housekeeping genes [30], almost all alleles of each cagPAI ORF were unique to one isolate among the 38 strains. Exceptionally, we identified duplicates of a single allelic sequence in six genes; in each case, the strains possessing the duplicate alleles were from a common population (Table S4).
Occasional duplicate alleles within populations have also been described for housekeeping genes [30] and are considered to represent homologous recombination. Again, similar to housekeeping genes, most cagPAI genes seemed to be under purifying selection because their Ka/Ks ratios were ≤0.2 (Table 2). However, five genes (HP0534-0535, HP0538, HP0546-0547) showed signs of positive or diversifying selection because their overall Ka/Ks ratios were greater than 0.2; of these, cagA (HP0547) had the highest proportion of non-synonymous polymorphisms (Ka/Ks = 0.45). However, Ka/Ks ratios are relatively insensitive indicators of Darwinian selection, which can act at the level of single protein epitopes or conformational domains. We therefore used a Bayesian method (PAML/CODEML [31]) to search MLST and cagPAI genes for codons that might be under diversifying selection (indicated by ω >1). Only two of the seven MLST housekeeping genes (trpC, yphC) contained an appreciable frequency (3.9%; 5.3%) of codons with posterior probabilities of ω >1 being above 0.95 (Table 2). In contrast, >5.3% of the codons matched this criterion in 10 of the 28 cagPAI ORFs (Table 2), including four of the five ORFs with high overall Ka/Ks ratios (HP0535, HP0538, HP0546, HP0547).

Table 2 (10.1371/journal.pgen.1001069.t002). Sequence diversity, Ka/Ks ratios, and codons under diversifying selection in cagPAI and housekeeping genes (37 strains).

Gene no. in strain 26695 | Gene name | Component of type IV secretion system | Mean sequence diversity (π) | Ka | Ks | Ratio Ka/Ks | No. of codons | Codons under diversifying selection (ω>1) (PAML), number | Codons under diversifying selection (ω>1) (PAML), % | r*
HP0520§ | cagζ | u | 0.030 | 0.016 | 0.084 | 0.190 | 115 | 10 | 8.70 | 0.36
HP0522 | cagδ | u | 0.047 | 0.019 | 0.147 | 0.131 | 481 | 9 | 1.87 | 0.72
HP0523§ | cagγ | VirB1 | 0.089 | 0.036 | 0.279 | 0.127 | 169 | 9 | 5.33 | 0.45
HP0524 | cagβ | VirD4 | 0.041 | 0.007 | 0.164 | 0.045 | 748 | 4 | 0.53 | 0.64
HP0525 | cagα | VirB11 | 0.025 | 0.005 | 0.124 | 0.044 | 330 | 2 | 0.61 | 0.71
HP0526 | cagZ | u | 0.021 | 0.010 | 0.065 | 0.148 | 199 | 8 | 4.02 | 0.64
HP0527§ | cagY | VirB10 | 0.049 | 0.017 | 0.097 | 0.171 | 2797 | 433 | 15.48 | 0.62
HP0528 | cagX | VirB9 | 0.024 | 0.006 | 0.092 | 0.068 | 522 | 10 | 1.92 | 0.74
HP0529 | cagW | VirB6 | 0.025 | 0.008 | 0.080 | 0.102 | 536 | 17 | 3.17 | 0.25
HP0530 | cagV | VirB8 | 0.024 | 0.006 | 0.093 | 0.066 | 252 | 9 | 3.57 | 0.50
HP0531 | cagU | u | 0.032 | 0.012 | 0.095 | 0.123 | 218 | 4 | 1.83 | 0.60
HP0532 | cagT | VirB7 | 0.026 | 0.006 | 0.101 | 0.061 | 280 | 9 | 3.21 | 0.61
HP0534 | cagS | u | 0.025 | 0.015 | 0.070 | 0.210 | 199 | 4 | 2.01 | 0.68
HP0535§ | cagQ | u | 0.061 | 0.039 | 0.153 | 0.254 | 101 | 10 | 9.90 | 0.38
HP0536§ | cagP | u | 0.031 | 0.011 | 0.079 | 0.138 | 117 | 7 | 5.98 | 0.43
HP0537 | cagM | u | 0.026 | 0.008 | 0.097 | 0.078 | 376 | 4 | 1.06 | 0.52
HP0538§ | cagN | u | 0.034 | 0.021 | 0.081 | 0.263 | 306 | 34 | 11.11 | 0.57
HP0539§ | cagL | VirB5 | 0.032 | 0.016 | 0.087 | 0.185 | 237 | 21 | 8.86 | 0.17
HP0540§ | cagI | u | 0.032 | 0.017 | 0.087 | 0.196 | 381 | 23 | 6.04 | 0.40
HP0541 | cagH | u | 0.027 | 0.010 | 0.087 | 0.110 | 370 | 10 | 2.70 | 0.26
HP0542 | cagG | u | 0.029 | 0.010 | 0.097 | 0.102 | 143 | 0 | 0.00 | 0.57
HP0543 | cagF | u | 0.029 | 0.014 | 0.095 | 0.143 | 268 | 10 | 3.73 | 0.52
HP0544 | cagE | VirB3/VirB4 | 0.026 | 0.005 | 0.103 | 0.049 | 984 | 9 | 0.91 | 0.62
HP0545 | cagD | u | 0.039 | 0.016 | 0.122 | 0.134 | 209 | 5 | 2.39 | 0.38
HP0546§ | cagC | VirB2 | 0.051 | 0.031 | 0.112 | 0.277 | 116 | 7 | 6.03 | 0.33
HP0547§ | cagA | effector | 0.088 | 0.067 | 0.150 | 0.448 | 1389 | 381 | 27.43 | 0.40
Merged cagPAI genes | | | 0.040 | 0.012 | 0.115 | 0.106 | - | - | - | 0.65
HP1134 | atpA | | 0.021 | 0.002 | 0.111 | 0.016 | 209 | 1 | 0.48 | 0.61
HP0177 | efp | | 0.032 | 0.001 | 0.141 | 0.007 | 136 | 0 | 0.00 | 0.54
HP0142 | mutY | | 0.058 | 0.018 | 0.198 | 0.089 | 140 | 0 | 0.00 | 0.61
HP0620 | ppa | | 0.028 | 0.004 | 0.117 | 0.036 | 132 | 0 | 0.00 | 0.48
HP1279 | trpC | | 0.069 | 0.030 | 0.204 | 0.149 | 152 | 6 | 3.95 | 0.57
HP0071 | ureI | | 0.029 | 0.007 | 0.102 | 0.066 | 195 | 0 | 0.00 | 0.40
HP0834 | yphC | | 0.042 | 0.015 | 0.140 | 0.107 | 170 | 9 | 5.29 | 0.64
Merged hk genes | | | 0.041 | 0.011 | 0.141 | 0.076 | - | - | - | -

Mantel test (r*) between matrices of individual cagPAI genes versus concatenated housekeeping genes.
r*: Pearson correlation coefficient of p-distance matrices from individual genes versus concatenated housekeeping genes. R values for the housekeeping genes were calculated from matrices of concatenated sequences jackknifing from the respective gene.
§: cag genes predicted to be under diversifying selection (p>95% in ≥5.3% of codons in PAML).
u = genes of partly or completely undefined function.
Total number of codons per gene refers to alignment length used for PAML.

We also tested eleven cagPAI ORFs, including nine with high frequencies of codons under selection according to PAML, and two with lower frequencies (HP0524, HP0525) with a second Bayesian program, OmegaMap [32], [33], which unlike PAML also takes into account the occurrence of recombination (ρ) between different alleles (Table S5). OmegaMap detected fewer codons with high probabilities of positive selection, but the codons that it identified often overlapped with codons that had been identified as being under positive selection by PAML (Table S5). Finally, we employed a sliding window along codons of PAML posterior probabilities of ω to identify clusters of sites with signs of diversifying selection (Figure 4).

The combination of three forms of analysis (criteria: Ka/Ks >0.2, or likelihood of at least 95% for ω >1 in ≥5.3% of codons, or at least two clusters of two or more adjacent amino acids (aa) predicted under diversifying selection in PAML) identified 13 cagPAI genes that are likely to have evolved under diversifying selection: HP0520, HP0522, HP0523, HP0527, HP0528, HP0534, HP0535, HP0536, HP0538, HP0539, HP0540, HP0546 and HP0547. Of these, functions or structural contributions are known only for HP0523 (virB1), HP0527 (virB10), HP0539 (virB5), HP0546 (virB2) and HP0547 (cagA) [7], [34]–[38].
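The three-way screen described above can be partly retraced from the Table 2 values. The sketch below is an illustration, not the authors' pipeline: it applies the first two criteria (Ka/Ks > 0.2, or ≥5.3% of codons with a posterior probability above 0.95 for ω>1) to the Ka/Ks and percentage columns copied from Table 2, and adds a hypothetical helper, `count_clusters`, for the third criterion. The per-codon posterior probabilities needed for that third criterion are not given in the text, so the list passed to the helper is invented purely to show the logic.

```python
# (gene, Ka/Ks ratio, % of codons with P(omega > 1) > 0.95), copied from Table 2.
TABLE2 = [
    ("HP0520", 0.190, 8.70), ("HP0522", 0.131, 1.87), ("HP0523", 0.127, 5.33),
    ("HP0524", 0.045, 0.53), ("HP0525", 0.044, 0.61), ("HP0526", 0.148, 4.02),
    ("HP0527", 0.171, 15.48), ("HP0528", 0.068, 1.92), ("HP0529", 0.102, 3.17),
    ("HP0530", 0.066, 3.57), ("HP0531", 0.123, 1.83), ("HP0532", 0.061, 3.21),
    ("HP0534", 0.210, 2.01), ("HP0535", 0.254, 9.90), ("HP0536", 0.138, 5.98),
    ("HP0537", 0.078, 1.06), ("HP0538", 0.263, 11.11), ("HP0539", 0.185, 8.86),
    ("HP0540", 0.196, 6.04), ("HP0541", 0.110, 2.70), ("HP0542", 0.102, 0.00),
    ("HP0543", 0.143, 3.73), ("HP0544", 0.049, 0.91), ("HP0545", 0.134, 2.39),
    ("HP0546", 0.277, 6.03), ("HP0547", 0.448, 27.43),
]

# Criterion 1: overall Ka/Ks ratio above 0.2.
high_ratio = {gene for gene, ratio, _ in TABLE2 if ratio > 0.2}

# Criterion 2: at least 5.3% of codons flagged by PAML.
many_sites = {gene for gene, _, pct in TABLE2 if pct >= 5.3}


def count_clusters(posteriors, cutoff=0.95, min_len=2):
    """Criterion 3 helper (hypothetical): count runs of at least `min_len`
    adjacent codons whose posterior probability of omega > 1 exceeds `cutoff`."""
    clusters = 0
    run = 0
    for p in posteriors:
        if p > cutoff:
            run += 1
        else:
            if run >= min_len:
                clusters += 1
            run = 0
    if run >= min_len:  # close a run that reaches the end of the gene
        clusters += 1
    return clusters


candidates = sorted(high_ratio | many_sites)
print(len(high_ratio), len(many_sites), len(candidates))
print(candidates)

# Invented posterior probabilities, only to demonstrate the cluster count.
example_posteriors = [0.99, 0.97, 0.10, 0.96, 0.20, 0.98, 0.99, 0.99]
print(count_clusters(example_posteriors))
```

With the table values as transcribed, the first criterion selects five genes and the second ten, and their union contains eleven of the thirteen genes listed in the text. The remaining two (HP0522 and HP0528) would have to come from the cluster criterion, which depends on the per-codon data shown in Figure 4.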
The percentage of codons with high likelihood of positive selection was highest in cagA (26.9%), followed by cagY (15.5%) and a gene of unknown function, cagQ (HP0535; 9.9%) (Table 2). In addition to a high frequency of putative codons under diversifying selection, HP0527 (cagY) and HP0547 (cagA) also exhibited variable gene lengths. This was due to variable numbers of repetitive modules within the genes, as previously reported [35], [39]. In the CagA protein, the number of phosphorylation sites (C-terminal EPIYA repeat motifs) differed, as did the types of these repeats (Figure 3). As previously described [39], the third EPIYA motif of CagA was type D in most (13/17) Asian strains whereas type D was not found in isolates from any other population. This reflected the preponderance of type D EPIYA in isolates assigned to the hpEastAsia and hpAsia2 populations. If the EPIYA type D motif were ancestral in Asian populations, this finding might reflect horizontal acquisition of cagA by the four exceptional Asian strains from Western strains. Homologous recombination involving the cagPAI has also been reported in isolates from Mestizos in Peru [40] and might reflect selection due to functional differences that are related to ethnic specificity.

10.1371/journal.pgen.1001069.g004
Figure 4. Sliding window map of maximum likelihood analysis of codons under diversifying selection for complete cagPAIs and housekeeping genes. Codons calculated by CODEML (model M3) to have a high likelihood (p>95%) of being under diversifying selection in each gene of the cagPAI or housekeeping genes of all analyzed strains are highlighted by black symbols.

Comparison of cagPAI and housekeeping gene phylogeny

We next asked whether the phylogeny of cagPAI genes was similar to that of housekeeping genes. Concatenated sequences of the cagPAI genes yielded a tree (Figure 5B) that is very similar to the tree based on a concatenate of the seven MLST housekeeping genes (Figure 5A).
Similarly, matrices of pairwise genetic distances of the concatenated cagPAI genes were highly correlated with corresponding matrices of pairwise distances of concatenated housekeeping genes (R = 0.65, p 1), followed by Naive Empirical Bayes (NEB) and Bayes Empirical Bayes (BEB) analyses of posterior probabilities. Sites with a posterior probability P>0.95 of ω>1 under the CODEML codon substitution models M3 (discrete) or M8 (beta and ω) were considered to be under positive or diversifying selection.

The likelihood of codons being under diversifying selection in the presence of recombination was further analyzed using OmegaMap (V 0.5; [32]). This software uses a Bayesian modeling algorithm to calculate the probability that codons evolve under diversifying selection (ω>1) in the presence of recombination (ρ). Because it models recombination explicitly, this method has a low false-positive rate. The settings used in the program were: norders = 100, thinning = 100, rhoprior = inverse, omegaprior = inverse, block length = 3 and 100,000 or 250,000 iterations. After each calculation, 5,000 iterations were discarded as the burn-in phase. The model type used for both ω and ρ was “variable”. Three repetitions of the calculations with different settings were initially performed for control genes with defined structural properties and some known function (e.g. HP0546), to rule out large variation in the results due to inadequate settings. Pseudogenes were excluded from the dataset.

Housekeeping genes and population structure

Fragments of the housekeeping genes atpA, efp, mutY, ppa, trpC, ureI, and yphC were amplified and both strands were sequenced from independent PCR products as described [55]. Alternatively, comparable sequences were extracted from the published genomes (26695, HPAG1, J99). These sequences were assigned to populations and subpopulations by STRUCTURE [14].
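The comparison of pairwise distance matrices used in this study (the Mantel statistic r* of Table 2) can be sketched as follows. This is a minimal illustration in pure Python, assuming symmetric distance matrices supplied as nested lists over the same strains in the same order. The function names, the permutation count, and the one-sided p-value convention are assumptions for illustration and are not taken from the authors' implementation.

```python
import math
import random


def upper_triangle(matrix):
    """Flatten the strict upper triangle of a square matrix into a list."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("distances must not all be identical")
    return cov / math.sqrt(vx * vy)


def mantel(d1, d2, permutations=999, seed=0):
    """Return (r, p): the Pearson correlation of the off-diagonal distances
    and a one-sided permutation p-value obtained by jointly shuffling the
    rows and columns of d1."""
    n = len(d1)
    y = upper_triangle(d2)
    r_obs = pearson(upper_triangle(d1), y)
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        permuted = [[d1[order[i]][order[j]] for j in range(n)] for i in range(n)]
        if pearson(upper_triangle(permuted), y) >= r_obs:
            hits += 1
    # add one to numerator and denominator so p is never exactly zero
    return r_obs, (hits + 1) / (permutations + 1)


# Toy example: four strains with made-up distances (not data from the study)
toy = [[0, 1, 3, 7], [1, 0, 2, 6], [3, 2, 0, 4], [7, 6, 4, 0]]
r, p = mantel(toy, toy, permutations=99)
print(r, p)
```

Rows and columns are permuted together because the entries of a distance matrix are not independent observations, so an ordinary significance test of the correlation would not be appropriate.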
Functional assays of the cagPAI t4ss

An IL-8 induction assay using the human gastric epithelial carcinoma cell line AGS (isolated from an adenocarcinoma of a Caucasian patient) was performed for all strains of the sequencing project. Strain 26695A [60] was used as a reference. Cells were cultured in RPMI 1640 medium (buffered with 25 mM HEPES, supplemented with 10% heat-inactivated fetal bovine serum; medium and serum: Biochrom, Berlin, Germany). Details of the bacterial culture conditions are given in Text S1. Cell infection experiments for IL-8 secretion measurement were performed on subconfluent cell layers (70%–90% confluence) in 24-well tissue culture plates. Cells were washed three times and preincubated in fresh medium with serum for 30 min prior to infection. The infection was started (MOI of 50) by adding exponentially growing bacteria that had been resuspended in cell culture medium (RPMI 1640, 25 mM HEPES, 10% heat-inactivated serum). To synchronize the infection, the incubation plates were centrifuged at 500 x g, 20°C, for 3 min. The coincubation was carried out for 20 h. Non-infected cells (mock coincubated) were used as negative control. Supernatants were harvested, cleared of cell debris by centrifugation, immediately frozen and stored at −20°C until use. Release of IL-8 into the cell supernatants was quantified using the BD OptEIA IL-8 enzyme-linked immunosorbent assay kit (BD Pharmingen; San Diego, USA) according to the manufacturer's instructions, using appropriate dilutions. The assays were performed in triplicate and the means and standard deviations of at least six independent coincubations were calculated. Adherence of the strains was tested in a high throughput assay, but no correlation was found between adherence and IL-8 induction (data not shown).

To study CagA translocation, AGS cells were cultured in six-well plates and infected with H. pylori at a multiplicity of infection (MOI) of 100.
After 4 h of coincubation, non-adherent bacteria were removed by washing twice with PBS-Dulbecco (pH 7.4; Biochrom, Berlin, Germany). Cells were harvested with a cell scraper and resuspended in 1 ml PBS (pH 7.4; Biochrom, Berlin, Germany). After centrifugation (250 x g, 4°C, 5 min), cells were resuspended in 300 µl of modified RIPA buffer (20 mM Tris-HCl [pH 7.5], 150 mM NaCl, 1 mM EDTA, 1 mM EGTA, 1% Triton X-100, 2.5 mM sodium pyrophosphate, 1 mM β-glycerol phosphate, 1 mM sodium orthovanadate, 1 protease inhibitor tablet per 10 ml buffer (Complete, Roche, Mannheim, Germany), 1 mM PMSF). During lysis, cells were incubated on ice for 30 min. Lysates were cleared by centrifugation (10 min, 21,900 x g, 4°C) and the pellets were carefully separated from the supernatants. The pellet fraction was resuspended in 100 µl RIPA buffer and the fractions were immediately frozen at −80°C. To determine the amount of protein, a BCA protein assay was performed using the BCA Protein Assay kit (Pierce, Rockford, IL, USA) according to the manufacturer's instructions.

Western blot analysis of CagA translocation

Equal amounts of cleared cell lysates (see above; corresponding to 10 µg of protein) of infected cells were resuspended in 5 x SDS loading buffer (0.31 M Tris-HCl, pH 6.8, 37.5% glycerol, 10% SDS, 0.05% bromophenol blue, 20% β-mercaptoethanol) and boiled for 10 min. For determination of molecular mass, BenchMark pre-stained Protein Ladder (Invitrogen, Karlsruhe, Germany) was used. Samples were separated on 10.4% denaturing SDS-polyacrylamide gels and transferred to nitrocellulose membranes (Protran BA 85, Whatman, Dassel, Germany) by semi-dry blotting. Membranes were blocked with 5% non-fat dried milk in TBS-T (20 mM Tris-HCl, 13.7 mM NaCl, 0.1% Tween 20, pH 7.4) for 1 h and subsequently incubated with specific primary antibody. Anti-CagA-antibody (Rabbit anti-H.
pylori Cag antigen IgG fraction [polyclonal], Austral Biologicals, San Ramon, USA) was used at a dilution of 1/1,000 for the detection of CagA protein. To detect phosphorylated CagA, PY99-antibody (Santa Cruz Biotechnology, Heidelberg, Germany) was used (dilution 1/250). Goat-anti-Rabbit-HRP antibody (dilution 1/10,000, Jackson Immunoresearch Laboratories, Suffolk, Great Britain) or Goat-anti-mouse-HRP-antibody (dilution 1/5,000, Dianova, Hamburg, Germany) were used as secondary antibodies. Signal detection was performed with Enhanced SuperSignal West chemiluminescence substrate (Pierce, Rockford, IL, USA), and detection was on X-ray film (Hyperfilm, Amersham Biosciences, Buckinghamshire, UK).

Supporting Information

Figure S1 Distribution of IS and mini IS elements and repetitive sequences in diverse cagPAIs. Repetitive sequences and sites where insertion (IS) elements and mini IS elements have integrated are indicated by symbols. Green: cagPAI insertion site containing repetitive sequence; red rectangles: mini IS606 insertions; blue triangles: mini IS605 insertion sites. Mini-IS607 and mini IS608 elements were not identified. a,b,c,d,e: different genetic variants of IS606 insertion elements. (0.14 MB PDF)
Table S1 List of primers. (0.04 MB XLS)
Table S2 Primer list for transcript analyses of cagPAI genes. (0.02 MB XLS)
Table S3 Transcript table for selected cag genes with a role in cag t4ss function (IL-8 induction) and for cagA. (0.02 MB XLS)
Table S4 List of all identical alleles in single cag genes of the 38 analyzed cagPAIs. (0.03 MB DOC)
Table S5 Congruence between PAML (CODEML model M8) and OmegaMap analyses for probabilities of diversifying selection of sites in H. pylori cagPAI genes. (0.04 MB XLS)
Text S1 Supplementary Materials and Methods. (0.02 MB DOC)

                Author and article information

                Contributors
                Journal
                Future Microbiology
                Informa UK Limited
                1746-0913
                1746-0921
                September 21 2024
                August 22 2024
                September 21 2024
                : 19
                : 14
                : 1253-1267
                Affiliations
                [1 ]Department of Gastroenterology, Lanzhou University Second Hospital, Lanzhou, People's Republic of China
                [2 ]Department of Gastroenterology, Key Laboratory of Digestive Diseases of Lanzhou University Second Hospital, Lanzhou, People's Republic of China
                [3 ]Department of Gastroenterology, The Second People's Hospital of Lanzhou City, Lanzhou, People's Republic of China
                Article
                10.1080/17460913.2024.2383514
                11633423
                39171625
                db33232c-2613-4e25-9999-a8a56fc7ec13
                © 2024
                History
