10
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      Machine learning discovery of high-temperature polymers

      research-article

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Summary

          To formulate a machine learning (ML) model to establish the polymer's structure-property correlation for glass transition temperature T g , we collect a diverse set of nearly 13,000 real homopolymers from the largest polymer database, PoLyInfo. We train the deep neural network (DNN) model with 6,923 experimental T g values using Morgan fingerprint representations of chemical structures for these polymers. Interestingly, the trained DNN model can reasonably predict the unknown T g values of polymers with distinct molecular structures, in comparison with molecular dynamics simulations and experimental results. With the validated transferability and generalization ability, the ML model is utilized for high-throughput screening of nearly one million hypothetical polymers. We identify more than 65,000 promising candidates with T g > 200°C, which is 30 times more than existing known high-temperature polymers (∼2,000 from PoLyInfo). The discovery of this large number of promising candidates will be of significant interest in the development and design of high-temperature polymers.

          Graphical abstract

          Highlights

          • Large datasets for polymer's glass transition temperature are collected

          • Transferability of ML models depends on feature representations

          • Molecular dynamics models and experimental results validate the formulated ML model

          • Extensive promising candidates for high-temperature polymers are screened by ML model

          The bigger picture

          The design and development of high-temperature polymers has been an experimentally driven and trial-and-error process guided by experience, intuition, and conceptual insights. However, such an Edisonian approach is often costly, slow, biased toward certain chemical space domains, and limited to relatively small-scale studies, which may easily miss promising compounds. To overcome this challenge, we formulate a data-driven machine learning (ML) approach, integrated with high-fidelity molecular dynamics simulations, for quantitatively predicting the glass transition temperature of a polymer from its chemical structure and rapid screening of promising candidates for high-temperature polymers. Our work demonstrates that ML is a powerful method for the prediction and rapid screening of high-temperature polymers, particularly with growing large sets of experimental and computational data for polymeric materials.

          Abstract

          Polymers with outstanding high-temperature properties have been identified as promising materials for aerospace, electronics, and automotive applications. However, the current design and development of high-temperature polymers has been an experimentally driven and trial-and-error process guided by experience, intuition, and conceptual insights. Therefore, we formulate a machine learning model that can quantitatively predict the glass transition temperature of a polymer from its chemical structure, such that more promising high-temperature polymers can be efficiently filtered out through high-throughput screening.

          Related collections

          Most cited references103

          • Record: found
          • Abstract: not found
          • Article: not found

          SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules

            Bookmark
            • Record: found
            • Abstract: found
            • Article: not found

            Extended-connectivity fingerprints.

            Extended-connectivity fingerprints (ECFPs) are a novel class of topological fingerprints for molecular characterization. Historically, topological fingerprints were developed for substructure and similarity searching. ECFPs were developed specifically for structure-activity modeling. ECFPs are circular fingerprints with a number of useful qualities: they can be very rapidly calculated; they are not predefined and can represent an essentially infinite number of different molecular features (including stereochemical information); their features represent the presence of particular substructures, allowing easier interpretation of analysis results; and the ECFP algorithm can be tailored to generate different types of circular fingerprints, optimized for different uses. While the use of ECFPs has been widely adopted and validated, a description of their implementation has not previously been presented in the literature.
              Bookmark
              • Record: found
              • Abstract: found
              • Article: not found

              Automatic Chemical Design Using a Data-Driven Continuous Representation of Molecules

              We report a method to convert discrete representations of molecules to and from a multidimensional continuous representation. This model allows us to generate new molecules for efficient exploration and optimization through open-ended spaces of chemical compounds. A deep neural network was trained on hundreds of thousands of existing chemical structures to construct three coupled functions: an encoder, a decoder, and a predictor. The encoder converts the discrete representation of a molecule into a real-valued continuous vector, and the decoder converts these continuous vectors back to discrete molecular representations. The predictor estimates chemical properties from the latent continuous vector representation of the molecule. Continuous representations of molecules allow us to automatically generate novel chemical structures by performing simple operations in the latent space, such as decoding random vectors, perturbing known chemical structures, or interpolating between molecules. Continuous representations also allow the use of powerful gradient-based optimization to efficiently guide the search for optimized functional compounds. We demonstrate our method in the domain of drug-like molecules and also in a set of molecules with fewer that nine heavy atoms.
                Bookmark

                Author and article information

                Contributors
                Journal
                Patterns (N Y)
                Patterns (N Y)
                Patterns
                Elsevier
                2666-3899
                26 March 2021
                09 April 2021
                26 March 2021
                : 2
                : 4
                : 100225
                Affiliations
                [1 ]Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA
                [2 ]Polymer Program, Institute of Materials Science, University of Connecticut, Storrs, CT 06269, USA
                Author notes
                []Corresponding author yingli@ 123456engr.uconn.edu
                [3]

                These authors contributed equally

                [4]

                Lead contact

                Article
                S2666-3899(21)00039-8 100225
                10.1016/j.patter.2021.100225
                8085602
                33982020
                28ab1f0c-835e-428e-8059-636d986a7a10
                © 2021 The Authors

                This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

                History
                : 15 December 2020
                : 21 January 2021
                : 2 March 2021
                Categories
                Article

                machine learning,polymer,glass transition temperature,high-throughput screening,feature representation

                Comments

                Comment on this article