      AMAnD: an automated metagenome anomaly detection methodology utilizing DeepSVDD neural networks

      research-article

          Abstract

          The composition of metagenomic communities within the human body often reflects localized medical conditions such as upper respiratory diseases and gastrointestinal diseases. Fast and accurate computational tools to flag anomalous metagenomic samples among typical samples are desirable to understand different phenotypes, especially in contexts where repeated, long-duration temporal sampling is done. Here, we present Automated Metagenome Anomaly Detection (AMAnD), which utilizes two types of Deep Support Vector Data Description (DeepSVDD) models: one trained on the taxonomic feature space output by the Pan-Genomics for Infectious Agents (PanGIA) taxonomy classifier and one trained on k-mer frequency counts. AMAnD's semi-supervised one-class approach makes no assumptions about what an anomaly may look like, allowing the flagging of potentially novel anomaly types. Three diverse datasets are profiled. The first dataset is hosted on the National Center for Biotechnology Information's (NCBI) Sequence Read Archive (SRA) and contains nasopharyngeal swabs from healthy and COVID-19-positive patients. The second dataset is also hosted on SRA and contains gut microbiome samples from normal controls and from patients with slow transit constipation (STC). AMAnD can learn a typical healthy nasopharyngeal or gut microbiome profile and reliably flag the anomalous COVID+ or STC samples in both feature spaces. The final dataset is a synthetic metagenome created by the Critical Assessment of Metagenome Annotation Simulator (CAMISIM). A control dataset of 50 well-characterized organisms was submitted to CAMISIM to generate 100 synthetic control class samples. The experimental conditions included 12 different spiked-in contaminants that are taxonomically similar to organisms present in the laboratory blank sample, ranging from one strain tree branch taxonomic distance away to one family tree branch taxonomic distance away. This experiment was repeated in triplicate at three different coverage levels to probe the dependence on sample coverage. AMAnD was again able to flag the contaminant inserts as anomalous. AMAnD's assumption-free flagging of metagenomic anomalies, the real-time model training update potential of the deep learning approach, and the strong performance even with lightweight models of low sample cardinality would make AMAnD well suited to a wide array of applied metagenomics biosurveillance use cases, from environmental to clinical utility.
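
          As a rough illustration of the k-mer feature space and the one-class flagging idea described in the abstract, the Python sketch below counts normalized k-mer frequencies and scores each sample by its squared distance from the center of the typical (training) samples. It is not the authors' implementation: AMAnD trains DeepSVDD neural networks, whereas this sketch replaces the learned network with an identity mapping, so the score reduces to distance from the mean profile. The function names, the choice of k = 2, the toy sequences, and the threshold rule are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def kmer_frequencies(seq, k=2):
    """Return normalized k-mer frequencies over the ACGT alphabet (fixed order)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # k-mers containing other characters (e.g. N) are ignored.
    total = sum(counts[m] for m in kmers)
    return [counts[m] / total if total else 0.0 for m in kmers]


def fit_center(profiles):
    """Center of the typical class: the per-feature mean of the training profiles."""
    n = len(profiles)
    return [sum(col) / n for col in zip(*profiles)]


def anomaly_score(profile, center):
    """Squared Euclidean distance to the center (larger means more anomalous)."""
    return sum((x - c) ** 2 for x, c in zip(profile, center))


# Hypothetical toy "typical" samples (AT-rich); not data from the paper.
typical = ["ATATATTATAATATTA", "TATATAATTATATATA", "ATTATATATAATATAT"]
train = [kmer_frequencies(s) for s in typical]
center = fit_center(train)

# Assumed threshold rule: the largest score observed among the training samples.
threshold = max(anomaly_score(p, center) for p in train)


def is_anomalous(seq):
    """Flag a sequence whose score exceeds the training-derived threshold."""
    return anomaly_score(kmer_frequencies(seq), center) > threshold
```

          Only typical samples are used for fitting, which mirrors the one-class setup: nothing about the anomaly is assumed in advance. A real DeepSVDD model would first map each profile through a trained network and measure distance to the center in that learned space.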

                Author and article information

                Contributors
                Journal
                Frontiers in Public Health (Front. Public Health)
                Publisher: Frontiers Media S.A.
                ISSN: 2296-2565
                Published: 11 July 2023
                Volume: 11
                Article: 1181911
                Affiliations
                Life Science Resource Center, MRIGlobal, Gaithersburg, MD, United States
                Author notes

                Edited by: Matthew Horsley, Lawrence Livermore National Security, United States

                Reviewed by: Sunita Kamboj, Argonne National Laboratory (DOE), United States; Jie Ren, Google, United States

                *Correspondence: Colin Price cprice@ 123456mriglobal.org
                Article
                10.3389/fpubh.2023.1181911
                10368493
                55946af6-1c1e-4095-b0da-1328436f9310
                Copyright © 2023 Price and Russell.

                This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

                History
                Received: 08 March 2023
                Accepted: 12 June 2023
                Page count
                Figures: 7, Tables: 0, Equations: 0, References: 29, Pages: 10, Words: 6281
                Categories
                Public Health
                Technology and Code
                Custom metadata
                Radiation and Health

                anomaly detection,metagenomics,deep learning,deepsvdd,machine learning
