      Feature versus Raw Sequence: Deep Learning Comparative Study on Predicting Pre-miRNA

      Preprint

          Abstract

          Should we input known genome sequence features or the sequence itself into a deep learning framework? As deep learning becomes more popular in various applications, researchers often question whether to generate features or use raw sequences for deep learning. To answer this question, we study the accuracy of precursor miRNA prediction with a feature-based deep belief network and a sequence-based convolutional neural network. Tested on a variant of a six-layer convolutional neural network and a three-layer deep belief network, we find that the raw-sequence-based convolutional neural network model performs similarly to or slightly better than the feature-based deep belief network, with best accuracy values of 0.995 and 0.990, respectively. Both models outperform existing benchmark models. The results show that, given large enough data, well-devised raw-sequence-based deep learning models can replace feature-based deep learning models. However, constructing a well-behaved deep learning model can be very challenging. In cases where features can be easily extracted, feature-based deep learning models may be a better alternative.
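To make the two input styles concrete, here is a minimal sketch of how one RNA sequence could be prepared for each kind of model. The abstract does not say which features or which sequence encoding the authors used, so everything below is an illustrative assumption: one-hot encoding stands in for the raw-sequence input, k-mer frequencies stand in for the hand-crafted features, and the function names `one_hot_encode` and `kmer_frequencies` are hypothetical.

```python
from itertools import product

ALPHABET = "ACGU"  # RNA bases; DNA-style "T" is mapped to "U" below


def one_hot_encode(seq, length):
    """Raw-sequence style input (assumed encoding, not taken from the paper).

    Returns `length` rows of 4 indicators, one row per position.
    Sequences are truncated or zero-padded to `length`; unknown
    bases such as "N" become all-zero rows.
    """
    rows = []
    for base in seq[:length].upper().replace("T", "U"):
        rows.append([1 if base == b else 0 for b in ALPHABET])
    pad = length - len(rows)
    rows.extend([0, 0, 0, 0] for _ in range(pad))
    return rows


def kmer_frequencies(seq, k=2):
    """Feature-based style input (assumed feature set, not taken from the paper).

    Returns the relative frequency of every k-mer over ACGU,
    in lexicographic order, as a fixed-length vector of 4**k values.
    """
    seq = seq.upper().replace("T", "U")
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing unknown bases
            counts[kmer] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]
```

The one-hot matrix keeps the position of every base, which is what a convolutional network would scan over, while the k-mer vector discards position and has a fixed length regardless of sequence length, which suits a fully connected model such as a deep belief network.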

          Author and article information

          Date: 17 October 2017
          arXiv: 1710.06798
          License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
          Comments: 12 pages, 2 figures. arXiv admin note: substantial text overlap with arXiv:1704.03834
          Subjects: cs.LG q-bio.GN
