Model systems are a cornerstone of microbiology. However, despite the field's heavy reliance on laboratory models, these systems are typically not analyzed systematically to improve their relevance. This limitation is a primary obstacle to understanding microbial physiology in natural environments. We provide a proof of concept for generalizable approaches to model improvement, using transcriptomic data from the pathogen Pseudomonas aeruginosa in sputum from patients with cystic fibrosis. We quantitatively improve experimental model systems by 1) combining two models that accurately recapitulate different sets of genes and 2) leveraging publicly available data to identify a condition (low zinc) that improves the accuracy of target genes. These rational frameworks are broadly applicable and have the potential to reshape how we understand the role of microbes across ecosystems.
Laboratory models are critical to basic and translational microbiology research. Models serve multiple purposes, from providing tractable systems for studying cell biology to enabling the investigation of otherwise inaccessible clinical and environmental ecosystems. Although the need for improved model systems is widely recognized, rational approaches for achieving this goal are lacking. We recently developed a framework for assessing the accuracy of microbial models by quantifying how closely the expression of each gene in a given model matches its expression in the natural environment. The accuracy of a model is defined as the percentage of genes that are similarly expressed in the natural environment and in the model. Here, we leverage this framework to develop and validate two generalizable approaches for improving model accuracy, and as a proof of concept, we apply these approaches to improve models of Pseudomonas aeruginosa infecting the cystic fibrosis (CF) lung. First, we identify two models, an in vitro synthetic CF sputum medium model (SCFM2) and an epithelial cell model, that accurately recapitulate different gene sets. By combining these models, we developed the epithelial cell-SCFM2 model, which improves the accuracy of over 500 genes. Second, to improve the accuracy of specific genes, we mined publicly available transcriptome data and identified zinc limitation as a cue that is present in the CF lung and absent in SCFM2. Induction of zinc limitation in SCFM2 resulted in accurate expression of 90% of P. aeruginosa genes. These approaches provide generalizable, quantitative frameworks for microbiological model improvement that can be applied to any system of interest.
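The accuracy definition above (the percentage of genes similarly expressed in the natural environment and in the model) can be illustrated with a minimal Python sketch. The criterion for "similarly expressed" is an assumption made here for illustration: a gene counts as accurate when its mean expression in the model lies within a chosen number of standard deviations (`z_cutoff`, set to 2 by default) of its mean expression in the natural samples. The function name, the gene labels, and the expression values are all hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def accuracy_score(in_situ, model, z_cutoff=2.0):
    """Percentage of shared genes expressed similarly in situ and in a model.

    in_situ, model: dicts mapping a gene ID to a list of normalized
    expression values (one value per sample; at least two in situ samples).
    ASSUMPTION: "similar" means the model mean lies within `z_cutoff`
    standard deviations of the in situ mean.
    """
    shared = [gene for gene in in_situ if gene in model]
    if not shared:
        raise ValueError("no genes shared between the two data sets")

    accurate = 0
    for gene in shared:
        ref_mean = mean(in_situ[gene])
        ref_sd = stdev(in_situ[gene])  # sample standard deviation
        model_mean = mean(model[gene])
        if ref_sd == 0:
            # No in situ variation: require an exact match of the means.
            similar = model_mean == ref_mean
        else:
            similar = abs(model_mean - ref_mean) / ref_sd <= z_cutoff
        accurate += similar

    return 100.0 * accurate / len(shared)


# Hypothetical toy data: two genes, three in situ samples, two model samples.
in_situ = {"geneA": [1.0, 2.0, 3.0], "geneB": [10.0, 12.0, 14.0]}
model = {"geneA": [2.5, 3.5], "geneB": [30.0, 31.0]}

print(accuracy_score(in_situ, model))
```

In this toy data set, geneA falls within the cutoff and geneB does not, so half of the shared genes are counted as accurate. Under the same assumption, comparing the per-gene results of two models would show which gene sets each model recapitulates.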