      Multi-modality machine learning predicting Parkinson’s disease

      research-article
      NPJ Parkinson's Disease
      Nature Publishing Group UK
      Risk factors, Predictive medicine, Genomics, Predictive markers


          Abstract

          Personalized medicine promises individualized disease prediction and treatment, and the convergence of machine learning (ML) with available multimodal data is key to delivering it. We build upon previous work to deliver multimodal predictions of Parkinson’s disease (PD) risk and systematically develop a model using GenoML, an automated ML package, to make improved multi-omic predictions of PD, validated in an external cohort. We investigated top features, constructed hypothesis-free disease-relevant networks, and investigated drug–gene interactions. We performed automated ML on multimodal data from the Parkinson’s Progression Markers Initiative (PPMI). After selecting the best-performing algorithm, we used all PPMI data to tune the selected model, which was then validated in the Parkinson’s Disease Biomarkers Program (PDBP) dataset. Our initial model showed an area under the curve (AUC) of 89.72% for the diagnosis of PD. The tuned model achieved an AUC of 85.03% on the external PDBP data. Optimizing thresholds for classification increased the diagnosis prediction accuracy and other metrics. Finally, networks were built to identify gene communities specific to PD. Combining data modalities outperforms the single biomarker paradigm. The University of Pennsylvania Smell Identification Test (UPSIT) and the polygenic risk score (PRS) contributed most to the predictive power of the model, but their accuracy is supplemented by many smaller-effect transcripts and risk SNPs. Our model is best suited to identifying large groups of individuals to monitor within a health registry or biobank and to prioritize for further testing. This approach allows complex predictive models to be reproducible and accessible to the community, with the package, code, and results publicly available.
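The abstract reports AUC values and notes that optimizing classification thresholds improved accuracy. The sketch below illustrates both ideas on made-up toy data. It is not the GenoML implementation: the function names, the toy labels and scores, and the use of Youden's J as the threshold criterion are all assumptions chosen for illustration, since the abstract does not state which criterion was used.

```python
from typing import Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random case outscores a random control.

    Ties between a case and a control count as half a win.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def youden_j(labels: Sequence[int], scores: Sequence[float], threshold: float) -> float:
    """Youden's J = sensitivity + specificity - 1, predicting 'case' when score >= threshold."""
    n_cases = sum(1 for y in labels if y == 1)
    n_controls = len(labels) - n_cases
    true_pos = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    true_neg = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    return true_pos / n_cases + true_neg / n_controls - 1.0


def best_threshold(labels: Sequence[int], scores: Sequence[float]) -> Tuple[float, float]:
    """Scan every observed score as a candidate cut-off; return (threshold, J) with the largest J.

    When several thresholds tie, the lowest one is kept.
    """
    best = (0.5, -1.0)
    for t in sorted(set(scores)):
        j = youden_j(labels, scores, t)
        if j > best[1]:
            best = (t, j)
    return best


# Hypothetical toy data: 0 = control, 1 = PD case; scores are predicted probabilities.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
scores = [0.10, 0.20, 0.25, 0.40, 0.30, 0.45, 0.70, 0.90]

auc = roc_auc(labels, scores)
default_j = youden_j(labels, scores, 0.5)
tuned_threshold, tuned_j = best_threshold(labels, scores)

print(f"AUC: {auc:.4f}")
print(f"Youden's J at the default 0.5 cut-off: {default_j:.2f}")
print(f"Tuned cut-off: {tuned_threshold:.2f} (Youden's J: {tuned_j:.2f})")
```

AUC does not depend on any cut-off, which is why it is reported separately from threshold-dependent metrics such as accuracy. On this toy data the default 0.5 cut-off misses two of the four cases, while the tuned cut-off recovers them at the cost of one false positive. In practice a threshold should be chosen on training or validation data and only then applied to an external cohort.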


                Author and article information

                Contributors
                faraz@datatecnica.com
                mike@datatecnica.com
                Journal
                NPJ Parkinsons Dis
                NPJ Parkinson's Disease
                Nature Publishing Group UK (London)
                2373-8057
                1 April 2022
                2022
                Volume: 8
                Article number: 35
                Affiliations
                [1] GRID grid.94365.3d, ISNI 0000 0001 2297 5165, Laboratory of Neurogenetics, National Institute on Aging, National Institutes of Health, Bethesda, MD, USA
                [2] GRID grid.83440.3b, ISNI 0000 0001 2190 1201, Department of Clinical and Movement Neurosciences, UCL Queen Square Institute of Neurology, London, UK
                [3] GRID grid.83440.3b, ISNI 0000 0001 2190 1201, UCL Movement Disorders Centre, University College London, London, UK
                [4] GRID grid.94365.3d, ISNI 0000 0001 2297 5165, Center for Alzheimer’s and Related Dementias, National Institutes of Health, Bethesda, MD, USA
                [5] GRID grid.511118.d, Data Tecnica International LLC, Glen Echo, MD, USA
                [6] GRID grid.424247.3, ISNI 0000 0004 0438 0426, German Center for Neurodegenerative Diseases (DZNE), Tübingen, Germany
                [7] GRID grid.224260.0, ISNI 0000 0004 0458 8737, School of Nursing, Virginia Commonwealth University, Richmond, VA, USA
                [8] GRID grid.224260.0, ISNI 0000 0004 0458 8737, Geriatric Pharmacotherapy Program, School of Pharmacy, Virginia Commonwealth University, Richmond, VA, USA
                [9] GRID grid.35403.31, ISNI 0000 0004 1936 9991, Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
                [10] GRID grid.42505.36, ISNI 0000 0001 2156 6853, Institute of Translational Genomics, University of Southern California, Los Angeles, CA, USA
                [11] GRID grid.250942.8, ISNI 0000 0004 0507 3225, Neurogenomics Division, Translational Genomics Research Institute (TGen), Phoenix, AZ, USA
                [12] GRID grid.261112.7, ISNI 0000 0001 2173 3359, Khoury College of Computer Sciences, Northeastern University, Boston, MA, USA
                [13] GRID grid.4868.2, ISNI 0000 0001 2171 1133, Preventive Neurology Unit, Wolfson Institute of Preventive Medicine, Queen Mary University of London, London, UK
                [14] GRID grid.213917.f, ISNI 0000 0001 2097 4943, Georgia Institute of Technology, Atlanta, GA, USA
                [15] GRID grid.497059.6, Verily Life Sciences, South San Francisco, CA, USA
                [16] GRID grid.83440.3b, ISNI 0000 0001 2190 1201, Department of Molecular Neuroscience, UCL Queen Square Institute of Neurology, London, UK
                [17] GRID grid.10586.3a, ISNI 0000 0001 2287 8496, Departamento de Ingeniería de la Información y las Comunicaciones, Universidad de Murcia, Murcia, Spain
                [18] ModelOp, Chicago, IL, USA
                [19] GRID grid.511435.7, UK Dementia Research Institute and Department of Neurodegenerative Disease and Reta Lila Weston Institute, London, UK
                [20] GRID grid.24515.37, ISNI 0000 0004 1937 1450, Institute for Advanced Study, The Hong Kong University of Science and Technology, Hong Kong, Hong Kong SAR, China
                Author information
                http://orcid.org/0000-0002-7978-1051
                http://orcid.org/0000-0002-3754-7777
                http://orcid.org/0000-0003-2040-1955
                http://orcid.org/0000-0002-5473-3774
                http://orcid.org/0000-0001-9358-8111
                http://orcid.org/0000-0001-5606-700X
                http://orcid.org/0000-0001-5744-8728
                Article
                288
                10.1038/s41531-022-00288-w
                8975993
                35365675
                10ed43f1-82ad-43ef-b361-81b8dfa89f0d
                © This is a U.S. government work and not under copyright protection in the U.S.; foreign copyright protection may apply 2022

                Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

                History
                Received: 1 September 2021
                Accepted: 1 February 2022
                Funding
                Funded by: FundRef https://doi.org/10.13039/100000065, U.S. Department of Health & Human Services | NIH | National Institute of Neurological Disorders and Stroke (NINDS);
                Award ID: Z01-AG000949-02
                Funded by: GP2
                Funded by: AMP PD
                Categories
                Article
                Custom metadata
                © The Author(s) 2022

