      NanoDeep: a deep learning framework for nanopore adaptive sampling on microbial sequencing

      research-article

          Abstract

          Nanopore sequencers can enrich or deplete targeted DNA molecules in a library by reversing the voltage across individual nanopores. However, this requires substantial computational resources to make rapid decisions in parallel during real-time sequencing. We present a deep learning framework, NanoDeep, that overcomes these limitations by combining a convolutional neural network with squeeze-and-excitation. We first showed that the raw squiggle derived from native DNA sequences is sufficient to determine whether a read originates from a microbial or a human genome. Then, we demonstrated that NanoDeep successfully classified bacterial reads from a pooled library containing human sequence and showed enrichment for bacterial sequence compared with a routine nanopore sequencing setting. Further, we showed that NanoDeep improves sequencing efficiency and preserves the fidelity of bacterial genomes in a mock sample. In addition, NanoDeep performs well in the enrichment of metagenome sequences from gut samples, showing its potential for enriching unknown microbiota. Our toolkit is available at https://github.com/lysovosyl/NanoDeep.
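
For readers unfamiliar with the squeeze-and-excitation idea named in the abstract, the following is a minimal sketch of the general technique applied to a one-dimensional feature map, such as the output of a convolution over a raw signal. It is not NanoDeep's actual architecture or code: the function name `squeeze_excite`, the weight matrices `w1` and `w2`, the layer sizes, and the random inputs are all assumptions made for illustration, and bias terms are omitted for brevity.

```python
import math
import random


def squeeze_excite(feature_map, w1, w2):
    """Reweight the channels of a (channels x length) feature map.

    feature_map: list of channels, each a list of floats.
    w1: bottleneck weights, shape (hidden x channels).
    w2: expansion weights, shape (channels x hidden).
    All names and shapes are illustrative assumptions.
    """
    # Squeeze: global average pooling gives one summary value per channel.
    squeezed = [sum(channel) / len(channel) for channel in feature_map]
    # Excitation, step 1: bottleneck fully connected layer followed by ReLU.
    hidden = [max(0.0, sum(w * z for w, z in zip(row, squeezed))) for row in w1]
    # Excitation, step 2: expand to one gate per channel, squashed into (0, 1).
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every position of a channel by that channel's gate.
    scaled = [[g * v for v in channel] for g, channel in zip(gates, feature_map)]
    return scaled, gates


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same gates
    channels, length, reduction = 4, 8, 2
    # Stand-in for convolutional features computed from a raw squiggle.
    features = [[rng.gauss(0.0, 1.0) for _ in range(length)] for _ in range(channels)]
    hidden_size = channels // reduction
    w1 = [[rng.uniform(-1, 1) for _ in range(channels)] for _ in range(hidden_size)]
    w2 = [[rng.uniform(-1, 1) for _ in range(hidden_size)] for _ in range(channels)]
    scaled, gates = squeeze_excite(features, w1, w2)
    print("per-channel gates:", [round(g, 3) for g in gates])
```

The gates let a network emphasise informative channels and suppress others at the cost of two small matrix products, which is one reason the technique is a plausible fit for a classifier that must decide quickly whether to keep or eject a read.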


                Author and article information

                Journal
                Title: Briefings in Bioinformatics
                Abbreviated title: Brief Bioinform
                Publisher: Oxford University Press
                ISSN: 1467-5463, 1477-4054
                Published: 06 January 2024
                Volume: 25
                Issue: 1
                Article number: bbad499
                Affiliations
                Dermatology Hospital, Southern Medical University, Guangzhou, China
                Li Ka Shing Institute of Health Sciences, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, New Territories, Hong Kong SAR, China
                Department of Chemical Pathology, The Chinese University of Hong Kong, Prince of Wales Hospital, Shatin, New Territories, Hong Kong SAR, China
                Author notes
                Corresponding authors: Jiajian Zhou, Dermatology Hospital, Southern Medical University, Block No. 2, Lujing Road, Yuexiu district, Guangzhou, Guangdong 510091, China. Tel.: 020-83707732; E-mail: zhoujj2013@smu.edu.cn; Yuhui Liao, Molecular Diagnosis and Treatment Center for Infectious Diseases, Dermatology Hospital, Southern Medical University, Block No. 2, Lujing Road, Yuexiu district, Guangzhou, Guangdong 510091, China. Tel./Fax: 020-83707732; E-mail: liaoyh8@mail.sysu.edu.cn

                Yusen Lin and Yongjun Zhang contributed equally to this work and share first authorship.

                Author information
                https://orcid.org/0000-0001-7800-4085
                https://orcid.org/0000-0002-5547-9501
                https://orcid.org/0000-0001-5206-0791
                Article
                Article ID: bbad499
                DOI: 10.1093/bib/bbad499
                PMCID: PMC10772945
                PMID: 38189540
                979e7a61-4c0f-47fd-bf4c-322c8d71e6a2
                © The Author(s) 2024. Published by Oxford University Press.

                This is an Open Access article distributed under the terms of the Creative Commons Attribution License ( https://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.

                History
                Received: 30 July 2023
                Revised: 21 November 2023
                Accepted: 11 December 2023
                Page count
                Pages: 11
                Funding
                Funded by: National Natural Science Foundation of China, DOI 10.13039/501100001809;
                Award ID: 31900447
                Award ID: 32070792
                Funded by: Startup Foundation of Dermatology Hospital, Southern Medical University;
                Award ID: 2019RC06
                Funded by: State Key Development Program;
                Funded by: Ministry of Science and Technology of China;
                Award ID: 2021YFC2302200
                Funded by: Hua Run fund of Joint Laboratory of Dermatology Hospital, Southern Medical University and China Resources Sanjiu Medical & Pharmaceutical;
                Award ID: HR202108
                Categories
                Problem Solving Protocol
                AcademicSubjects/SCI01060

                Bioinformatics & Computational biology
                Keywords: adaptive sampling, machine learning, nanopore sequencing, convolutional neural network, metagenomic sequencing