Machine learning discovery of high-temperature polymers

Summary

To formulate a machine learning (ML) model that captures the structure-property correlation of polymers for the glass transition temperature $T_{g}$, we collect a diverse set of nearly 13,000 real homopolymers from the largest polymer database, PoLyInfo. We train a deep neural network (DNN) model on 6,923 experimental $T_{g}$ values, using Morgan fingerprint representations of the chemical structures of these polymers. Interestingly, the trained DNN model can reasonably predict the unknown $T_{g}$ values of polymers with distinct molecular structures, as confirmed by comparison with molecular dynamics simulations and experimental results. With its transferability and generalization ability validated, the ML model is used for high-throughput screening of nearly one million hypothetical polymers. We identify more than 65,000 promising candidates with $T_{g}$ > 200°C, roughly 30 times the number of known high-temperature polymers (∼2,000 in PoLyInfo). The discovery of this large number of promising candidates will be of significant interest for the design and development of high-temperature polymers.
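
The screening step described in the Summary can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `screen_candidates` and `enrichment_factor`, the candidate labels, and the toy predicted values are all hypothetical, and the predictor is a stand-in for a trained model that maps a chemical structure to a predicted $T_{g}$ in °C. Only the 200°C cutoff and the counts of 65,000 and 2,000 come from the text.

```python
from typing import Callable, Iterable, Iterator, Tuple

# Screening cutoff quoted in the Summary (Tg > 200 °C).
TG_THRESHOLD_C = 200.0


def screen_candidates(
    candidates: Iterable[str],
    predict_tg_c: Callable[[str], float],
    threshold_c: float = TG_THRESHOLD_C,
) -> Iterator[Tuple[str, float]]:
    """Yield (structure, predicted Tg) for candidates strictly above the cutoff.

    `predict_tg_c` is a hypothetical stand-in for a trained model.
    """
    for structure in candidates:
        tg = predict_tg_c(structure)
        if tg > threshold_c:
            yield structure, tg


def enrichment_factor(n_screened_hits: int, n_known: int) -> float:
    """Ratio of screened hits to already-known high-temperature polymers."""
    return n_screened_hits / n_known


if __name__ == "__main__":
    # Made-up labels and values, for illustration only (not real predictions).
    toy_predictions = {
        "candidate_A": 250.0,
        "candidate_B": 120.0,
        "candidate_C": 200.0,  # exactly at the cutoff, so it is not kept
    }
    hits = list(screen_candidates(toy_predictions, toy_predictions.__getitem__))
    print(hits)

    # Counts quoted in the Summary: >65,000 candidates vs. ~2,000 known polymers.
    print(enrichment_factor(65_000, 2_000))
```

Writing the filter as a generator keeps memory use flat when the candidate pool is on the order of a million structures. With the rounded counts from the Summary, the ratio is 32.5, consistent with the stated "roughly 30 times".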

Highlights

• Large datasets for the glass transition temperature of polymers are collected

• Transferability of ML models depends on feature representations

• Molecular dynamics models and experimental results validate the formulated ML model

• A large number of promising candidates for high-temperature polymers are screened by the ML model

The bigger picture

The design and development of high-temperature polymers has been an experimentally driven, trial-and-error process guided by experience, intuition, and conceptual insights. However, such an Edisonian approach is often costly, slow, biased toward certain domains of chemical space, and limited to relatively small-scale studies, so it may easily miss promising compounds. To overcome this challenge, we formulate a data-driven machine learning (ML) approach, integrated with high-fidelity molecular dynamics simulations, for quantitatively predicting the glass transition temperature of a polymer from its chemical structure and for rapidly screening promising candidates for high-temperature polymers. Our work demonstrates that ML is a powerful method for the prediction and rapid screening of high-temperature polymers, particularly as large sets of experimental and computational data for polymeric materials continue to grow.

Abstract

Polymers with outstanding high-temperature properties have been identified as promising materials for aerospace, electronics, and automotive applications. However, the design and development of high-temperature polymers has so far been an experimentally driven, trial-and-error process guided by experience, intuition, and conceptual insights. Therefore, we formulate a machine learning model that can quantitatively predict the glass transition temperature of a polymer from its chemical structure, so that more promising high-temperature polymers can be efficiently identified through high-throughput screening.

Author and article information

Contributors

Ying Li

Journal

Journal ID (nlm-ta): Patterns (N Y)

Journal ID (iso-abbrev): Patterns (N Y)

Title: Patterns

Publisher: Elsevier

ISSN (Electronic): 2666-3899

Publication date PMC-release: 26 March 2021

Publication date Collection: 09 April 2021

Publication date (Electronic): 26 March 2021

Volume: 2

Issue: 4

Electronic Location Identifier: 100225

Affiliations

[1] Department of Mechanical Engineering, University of Connecticut, Storrs, CT 06269, USA

[2] Polymer Program, Institute of Materials Science, University of Connecticut, Storrs, CT 06269, USA

Author notes

[∗] Corresponding author: yingli@engr.uconn.edu

[3] These authors contributed equally

[4] Lead contact

Article

Publisher Item ID: S2666-3899(21)00039-8 Publisher ID: 100225

DOI: 10.1016/j.patter.2021.100225

PMC ID: 8085602

PubMed ID: 33982020

SO-VID: 28ab1f0c-835e-428e-8059-636d986a7a10

License:

This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).

History

Date received: 15 December 2020

Date revision received: 21 January 2021

Date accepted: 2 March 2021
