      LSTM-driven drug design using SELFIES for target-focused de novo generation of HIV-1 protease inhibitor candidates for AIDS treatment

      research-article
      PLOS ONE
      Public Library of Science


          Abstract

          The battle against viral drug resistance highlights the need for innovative approaches to replace time-consuming and costly traditional methods. Deep generative models offer automation potential, especially in the fight against human immunodeficiency virus (HIV), as they can synthesize diverse molecules effectively. This paper proposes “LSTM-ProGen”, an LSTM-based deep generative model tailored explicitly for the de novo design of drug candidate molecules that interact with a specific target protein (HIV-1 protease). LSTM-ProGen employs a long short-term memory (LSTM) architecture to generate novel molecules with target specificity against HIV-1 protease. The training process involves fine-tuning LSTM-ProGen on a diverse range of compounds sourced from the ChEMBL database. The model was optimized to meet specific requirements over multiple iterations, to enhance its predictive capabilities and ensure that it generates molecules exhibiting favorable target interactions. Training is assessed with an array of performance evaluation metrics, such as drug-likeness properties. Our evaluation includes extensive in silico analysis using molecular docking and PCA-based visualization to explore the chemical space that the new molecules cover compared with those in the training set. These evaluations reveal that a subset of 12 de novo molecules generated by LSTM-ProGen exhibit a striking ability to interact with the target protein, rivaling or even surpassing the efficacy of native ligands. Extended versions of LSTM-ProGen, with further refinement, hold promise as versatile tools for designing efficacious and customized drug candidates tailored to specific targets, thus accelerating drug development and facilitating the discovery of new therapies for various diseases.
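The abstract does not describe how molecules are prepared for the LSTM, so the following is only a minimal sketch of one common way to turn SELFIES strings into integer sequences for next-token training. It assumes that a SELFIES string is a concatenation of bracketed symbols; the special tokens `<pad>`, `<start>`, and `<end>`, the function names, and the two example molecules are illustrative assumptions, not details taken from the paper.

```python
import re

# A SELFIES string is a run of bracketed symbols such as "[C][=C][Ring1]".
TOKEN_PATTERN = re.compile(r"\[[^\[\]]+\]")

# Hypothetical special tokens; the paper's actual vocabulary is not given here.
PAD, START, END = "<pad>", "<start>", "<end>"


def tokenize(selfies_string):
    """Split a SELFIES string into its bracketed symbols."""
    tokens = TOKEN_PATTERN.findall(selfies_string)
    # Reject input that contains anything outside bracketed symbols.
    if "".join(tokens) != selfies_string:
        raise ValueError(f"not a clean SELFIES string: {selfies_string!r}")
    return tokens


def build_vocab(selfies_strings):
    """Map every symbol seen in the corpus to an integer id."""
    symbols = sorted({tok for s in selfies_strings for tok in tokenize(s)})
    return {tok: idx for idx, tok in enumerate([PAD, START, END] + symbols)}


def encode(selfies_string, vocab, max_len):
    """Encode one molecule as a fixed-length list of token ids."""
    ids = [vocab[START]] + [vocab[t] for t in tokenize(selfies_string)] + [vocab[END]]
    if len(ids) > max_len:
        raise ValueError("sequence longer than max_len")
    return ids + [vocab[PAD]] * (max_len - len(ids))


def next_token_pairs(ids):
    """Inputs and targets for next-token prediction (targets shifted by one)."""
    return ids[:-1], ids[1:]


# Illustrative molecules: ethanol and benzene written in SELFIES.
corpus = ["[C][C][O]", "[C][=C][C][=C][C][=C][Ring1][=Branch1]"]
vocab = build_vocab(corpus)
encoded = encode(corpus[0], vocab, max_len=8)
inputs, targets = next_token_pairs(encoded)
print(encoded, inputs, targets)
```

A recurrent model trained on such pairs learns to predict each symbol from the preceding ones; sampling symbol by symbol from `<start>` until `<end>` then yields new SELFIES strings, which can be decoded back to molecules with a SELFIES library.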


                Author and article information

                Contributors
                Roles: Conceptualization, Data curation, Investigation, Methodology, Software, Validation, Writing – original draft, Writing – review & editing
                Roles: Conceptualization, Methodology, Project administration, Supervision, Validation, Writing – review & editing
                Role: Editor
                Journal
                Title: PLOS ONE
                Publisher: Public Library of Science (San Francisco, CA, USA)
                ISSN: 1932-6203
                Published: 21 June 2024
                Volume 19, Issue 6, e0303597
                Affiliations
                [001] 1 Department of Computer Engineering, Istanbul Medipol University, Istanbul, Turkey
                [002] 2 Department of Computer Science, University of Calgary, Alberta, Canada
                [003] 3 Department of Health Informatics, University of Southern Denmark, Odense, Denmark
                Redesign Science, UNITED STATES
                Author notes

                Competing Interests: The authors have declared that no competing interests exist.

                Author information
                https://orcid.org/0000-0001-6657-9738
                Article
                PONE-D-24-04274
                10.1371/journal.pone.0303597
                11192380
                38905197
                f551de28-fe0f-4f25-b0cd-cb8ca0628479
                © 2024 Albrijawi, Alhajj

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 1 February 2024
                Accepted: 26 April 2024
                Page count
                Figures: 16, Tables: 5, Pages: 30
                Funding
                The authors received no specific funding for this work.
                Categories
                Research Article
                Biology and Life Sciences > Microbiology > Medical Microbiology > Microbial Pathogens > Viral Pathogens > Immunodeficiency Viruses > HIV > HIV-1
                Medicine and Health Sciences > Pathology and Laboratory Medicine > Pathogens > Microbial Pathogens > Viral Pathogens > Immunodeficiency Viruses > HIV > HIV-1
                Biology and Life Sciences > Organisms > Viruses > Viral Pathogens > Immunodeficiency Viruses > HIV > HIV-1
                Biology and Life Sciences > Organisms > Viruses > Immunodeficiency Viruses > HIV > HIV-1
                Biology and Life Sciences > Organisms > Viruses > RNA viruses > Retroviruses > Lentivirus > HIV > HIV-1
                Biology and Life Sciences > Microbiology > Medical Microbiology > Microbial Pathogens > Viral Pathogens > Retroviruses > Lentivirus > HIV > HIV-1
                Medicine and Health Sciences > Pathology and Laboratory Medicine > Pathogens > Microbial Pathogens > Viral Pathogens > Retroviruses > Lentivirus > HIV > HIV-1
                Biology and Life Sciences > Organisms > Viruses > Viral Pathogens > Retroviruses > Lentivirus > HIV > HIV-1
                Biology and Life Sciences > Biochemistry > Enzymology > Enzymes > Proteases
                Biology and Life Sciences > Biochemistry > Proteins > Enzymes > Proteases
                Biology and Life Sciences > Biochemistry > Enzymology > Enzyme Inhibitors > Protease Inhibitors
                Biology and Life Sciences > Microbiology > Medical Microbiology > Microbial Pathogens > Viral Pathogens > Immunodeficiency Viruses > HIV
                Medicine and Health Sciences > Pathology and Laboratory Medicine > Pathogens > Microbial Pathogens > Viral Pathogens > Immunodeficiency Viruses > HIV
                Biology and Life Sciences > Organisms > Viruses > Viral Pathogens > Immunodeficiency Viruses > HIV
                Biology and Life Sciences > Organisms > Viruses > Immunodeficiency Viruses > HIV
                Biology and Life Sciences > Organisms > Viruses > RNA viruses > Retroviruses > Lentivirus > HIV
                Biology and Life Sciences > Microbiology > Medical Microbiology > Microbial Pathogens > Viral Pathogens > Retroviruses > Lentivirus > HIV
                Medicine and Health Sciences > Pathology and Laboratory Medicine > Pathogens > Microbial Pathogens > Viral Pathogens > Retroviruses > Lentivirus > HIV
                Biology and Life Sciences > Organisms > Viruses > Viral Pathogens > Retroviruses > Lentivirus > HIV
                Physical Sciences > Chemistry > Computational Chemistry > Molecular Docking
                Medicine and Health Sciences > Pharmacology > Drug Research and Development > Drug Discovery
                Physical Sciences > Chemistry > Chemical Physics > Molecular Structure
                Physical Sciences > Physics > Chemical Physics > Molecular Structure
                Computer and Information Sciences > Data Management > Data Visualization > Infographics > Graphs
                Custom metadata
                All datasets used in this study are included in a GitHub repository ( https://github.com/taleb-hub/LstmProGen.git), which contains all the datasets used to train the model. The main source database, ChEMBL, can be accessed at https://www.ebi.ac.uk/chembl/ (drug explorer: https://www.ebi.ac.uk/chembl/web_components/explore/drugs/). The datasets underpinning the findings of this research are primarily drawn from the 14K drug dataset within ChEMBL, which serves as the cornerstone of the analysis; to obtain it, access the ChEMBL database and locate the 14K drug dataset. The secondary dataset originates from the same source; however, users must apply filters to isolate inhibitors targeting the five specified viruses (HCV, EIAV, WNV, and SARS-CoV-2). The analysis also integrates data on established inhibitors of the HIV-1 protease: after accessing the 14K drug dataset, users should further refine the results to identify inhibitors specific to HIV-1. For further queries concerning data availability and access procedures, please contact the authors by email.
