      Is Open Access

      Simulated Design–Build–Test–Learn Cycles for Consistent Comparison of Machine Learning Methods in Metabolic Engineering

      research-article


          Abstract

          Combinatorial pathway optimization is an important tool in metabolic flux optimization. Simultaneous optimization of a large number of pathway genes often leads to combinatorial explosions. Strain optimization is therefore often performed using iterative design–build–test–learn (DBTL) cycles. The aim of these cycles is to develop a product strain iteratively, every time incorporating learning from the previous cycle. Machine learning methods provide a potentially powerful tool to learn from data and propose new designs for the next DBTL cycle. However, due to the lack of a framework for consistently testing the performance of machine learning methods over multiple DBTL cycles, evaluating the effectiveness of these methods remains a challenge. In this work, we propose a mechanistic kinetic model-based framework to test and optimize machine learning for iterative combinatorial pathway optimization. Using this framework, we show that gradient boosting and random forest models outperform the other tested methods in the low-data regime. We demonstrate that these methods are robust to training set biases and experimental noise. Finally, we introduce an algorithm for recommending new designs using machine learning model predictions. We show that when the number of strains to be built is limited, starting with a large initial DBTL cycle is favorable over building the same number of strains for every cycle.
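
          The simulated DBTL loop described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' framework: `simulated_titer` is a toy response surface standing in for the mechanistic kinetic model, the learner is a simple additive per-gene model rather than gradient boosting or random forest, and the recommendation step is plain greedy selection of the top predicted unbuilt designs. The names `N_GENES`, `N_LEVELS`, `OPTIMUM`, `fit_additive`, and `run_dbtl` are hypothetical.

```python
import itertools
import random

# Hypothetical setup: 4 pathway genes, 5 expression levels each.
N_GENES, N_LEVELS = 4, 5
OPTIMUM = (3, 1, 4, 2)  # assumed best level per gene in the toy model

# Full combinatorial design space: N_LEVELS ** N_GENES designs.
DESIGN_SPACE = list(itertools.product(range(N_LEVELS), repeat=N_GENES))


def simulated_titer(design, rng, noise=0.02):
    """Toy stand-in for a kinetic model: titer drops away from OPTIMUM."""
    titer = 1.0
    for level, best in zip(design, OPTIMUM):
        titer /= 1.0 + 0.5 * (level - best) ** 2
    return titer + rng.gauss(0.0, noise)  # simulated experimental noise


def fit_additive(data):
    """Learn step: mean titer plus an average effect per (gene, level)."""
    mean = sum(y for _, y in data) / len(data)
    sums = {}
    for design, y in data:
        for gene, level in enumerate(design):
            total, count = sums.get((gene, level), (0.0, 0))
            sums[(gene, level)] = (total + (y - mean), count + 1)
    effects = {key: total / count for key, (total, count) in sums.items()}

    def predict(design):
        # Unseen (gene, level) pairs contribute no effect.
        return mean + sum(effects.get((g, l), 0.0) for g, l in enumerate(design))

    return predict


def run_dbtl(cycle_sizes, seed=0):
    """Run simulated DBTL cycles; returns {design: measured titer}."""
    rng = random.Random(seed)
    built = {}
    # Cycle 1: no model yet, so build a random sample of designs.
    for design in rng.sample(DESIGN_SPACE, cycle_sizes[0]):
        built[design] = simulated_titer(design, rng)
    # Later cycles: learn from all data, then build the top predictions.
    for size in cycle_sizes[1:]:
        predict = fit_additive(list(built.items()))
        candidates = [d for d in DESIGN_SPACE if d not in built]
        candidates.sort(key=predict, reverse=True)
        for design in candidates[:size]:
            built[design] = simulated_titer(design, rng)
    return built


if __name__ == "__main__":
    # Same total budget of 64 strains, split differently across cycles.
    for sizes in ([48, 8, 8], [16, 16, 16, 16]):
        result = run_dbtl(sizes, seed=1)
        print(sizes, "best measured titer:", round(max(result.values()), 3))
```

          Running the two budget splits side by side mirrors the kind of comparison the abstract describes (one large initial cycle versus equal-sized cycles), although results from this toy surface say nothing about the paper's findings.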


                Author and article information

                Journal
                ACS Synthetic Biology (ACS Synth Biol)
                American Chemical Society
                ISSN: 2161-5063
                24 August 2023
                15 September 2023
                Volume: 12
                Issue: 9
                Pages: 2588-2599
                Affiliations
                Delft Bioinformatics Lab, Delft University of Technology Van Mourik, Delft 2628 XE, The Netherlands
                Department of Science and Research, Joep Schmitz - dsm-firmenich, Science & Research, P.O. Box 1, 2600 MA Delft, The Netherlands
                [§] Infectious Disease and Microbiome Program, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, United States
                Author notes
                [*] Email: t.abeel@tudelft.nl. Phone: +31 15 27 85114.
                Author information
                https://orcid.org/0009-0001-2887-0193
                https://orcid.org/0000-0002-7205-7431
                Article
                DOI: 10.1021/acssynbio.3c00186
                PMC ID: 10510747
                PMID: 37616156
                © 2023 The Authors. Published by American Chemical Society

                Permits the broadest form of re-use including for commercial purposes, provided that author attribution and integrity are maintained (https://creativecommons.org/licenses/by/4.0/).

                History
                : 30 March 2023
                Funding
                Funded by: Firmenich, doi 10.13039/100018220;
                Award ID: NA
                Funded by: DSM, doi 10.13039/501100022615;
                Award ID: NA
                Categories
                Research Article

                Molecular biology
                combinatorial pathway optimization, machine learning, DBTL cycles, metabolic engineering, automated recommendation
