Recent applications of deep learning and machine intelligence on in silico drug discovery: methods, tools and databases


Abstract

The identification of interactions between drugs/compounds and their targets is crucial for the development of new drugs. In vitro screening experiments (i.e. bioassays) are frequently used for this purpose; however, experimental approaches alone are insufficient to explore novel drug-target interactions, mainly because they are labour intensive, costly and time consuming. A computational field known as ‘virtual screening’ (VS) has emerged in the past decades to aid experimental drug discovery studies by statistically estimating unknown bio-interactions between compounds and biological targets. VS methods use the physico-chemical and structural properties of compounds and/or target proteins, along with experimentally verified bio-interaction information, to generate predictive models. Lately, sophisticated machine learning techniques have been applied in VS to improve predictive performance.

The objective of this study is to examine and discuss the recent applications of machine learning techniques in VS, including deep learning, which became highly popular after giving rise to epochal developments in the fields of computer vision and natural language processing. The past 3 years have witnessed an unprecedented number of research studies on the application of deep learning in biomedicine, including computational drug discovery. In this review, we first describe the main instruments of VS methods, including compound and protein features (i.e. representations and descriptors), frequently used libraries and toolkits for VS, bioactivity databases and gold-standard data sets for system training and benchmarking. We subsequently review recent VS studies with a strong emphasis on deep learning applications. Finally, we discuss the present state of the field, including the current challenges, and suggest future directions. We believe that this survey will provide insight to researchers working in the field of computational drug discovery in terms of comprehending and developing novel bio-prediction methods.
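As a rough illustration of the idea stated in the abstract (compound features combined with experimentally verified interactions to estimate unknown ones), the following is a minimal sketch of one simple ligand-based approach: ranking candidate compounds by their Tanimoto similarity to known actives of a target. It is not a method taken from the review. The fingerprints are made-up sets of "on" bit indices, and all names (`tanimoto`, `score_compound`, `rank_library`, the compound labels) are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def tanimoto(a: Set[int], b: Set[int]) -> float:
    """Tanimoto (Jaccard) similarity of two fingerprints given as sets of on-bit indices."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def score_compound(query: Set[int], known_actives: Dict[str, Set[int]]) -> float:
    """Score a candidate by its highest similarity to any known active of the target."""
    return max((tanimoto(query, fp) for fp in known_actives.values()), default=0.0)


def rank_library(
    library: Dict[str, Set[int]], known_actives: Dict[str, Set[int]]
) -> List[Tuple[str, float]]:
    """Return (compound, score) pairs sorted from most to least promising."""
    scores = {name: score_compound(fp, known_actives) for name, fp in library.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy data: invented fingerprints, not real molecules or bioactivity records.
known_actives = {"active_1": {1, 2, 3, 4}, "active_2": {2, 3, 5, 8}}
library = {"cand_a": {1, 2, 3, 9}, "cand_b": {10, 11, 12}, "cand_c": {2, 3, 5, 8}}

ranking = rank_library(library, known_actives)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In practice, the fingerprints would come from a cheminformatics toolkit and the known actives from a bioactivity database, and the similarity ranking would typically be replaced by a trained machine learning model.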


Author and article information

Journal

Journal ID (nlm-ta): Brief Bioinform

Journal ID (iso-abbrev): Brief. Bioinformatics

Journal ID (publisher-id): bib

Title: Briefings in Bioinformatics

Publisher: Oxford University Press

ISSN (Print): 1467-5463

ISSN (Electronic): 1477-4054

Publication date (Print): September 2019

Publication date (Electronic): 31 July 2018

Publication date PMC-release: 31 July 2018

Volume: 20

Issue: 5

Pages: 1878-1912

Affiliations

[1] Department of Computer Engineering, Middle East Technical University, Ankara, Turkey

[1a] Department of Computer Engineering, İskenderun Technical University, Hatay, Turkey

[2] Cancer System Biology Laboratory (CanSyL), Graduate School of Informatics, Middle East Technical University, Ankara, Turkey

[3] European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL–EBI), Cambridge, Hinxton, UK

[4] Cancer System Biology Laboratory (CanSyL), Graduate School of Informatics, Middle East Technical University, Ankara, Turkey and European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL–EBI), Cambridge, Hinxton, UK

Author notes

Corresponding author: Tunca Doğan, Cancer System Biology Laboratory (CanSyL), Graduate School of Informatics, Middle East Technical University, Ankara, 06800, Turkey. E-mail: tuncadogan@gmail.com

Article

Publisher ID: bby061

DOI: 10.1093/bib/bby061

PMC ID: 6917215

PubMed ID: 30084866

SO-VID: a0f75990-7a1a-4797-bfbb-a1a59a4c02ab

License:

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

History

Date received: 25 January 2018

Date revision received: 25 May 2018

Page count

Pages: 36

Funding

Funded by: Turkish Ministry of Development

Funded by: KanSiL

Award ID: KanSil_2016K121540

Funded by: Newton/Katip Celebi Institutional Links

Funded by: TUBITAK

Funded by: Turkey and British Council

Award ID: 116E930

Funded by: European Molecular Biology Laboratory 10.13039/100013060
