      Proceedings of the First International Conference on Advanced Data and Information Engineering (DaEng-2013) 

      Data Mining of Protein Sequences with Amino Acid Position-Based Feature Encoding Technique

      Springer Singapore


          Most cited references (7)


          New techniques for extracting features from protein sequences


            Protein superfamily classification using fuzzy rule-based classifier.

            In this paper, we propose a fuzzy rule-based classifier for assigning amino acid sequences to protein superfamilies. While the most popular methods for protein classification rely on sequence alignment, our approach is alignment-free and therefore more human-readable. Like other alignment-independent methods, it uses the distribution of contiguous patterns of n amino acids (n-grams) in the sequences as features. Our approach first extracts a large number of features from a set of training sequences, then selects only the best of them using a proposed feature ranking method. A novel steady-state genetic algorithm for extracting fuzzy classification rules from data then uses these features to generate a compact set of interpretable fuzzy rules. The generated rules are simple and human-understandable, so biologists can use them for classification or apply their own expertise to interpret or even modify them. To evaluate the performance of our fuzzy rule-based classifier, we compared it with the conventional non-fuzzy C4.5 algorithm as well as with several other fuzzy classifiers. The comparison was carried out by classifying protein sequences from five superfamily classes, downloaded from a public-domain database. The results show that the generated fuzzy rules are more interpretable, with an acceptable improvement in classification accuracy.
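
            The n-gram features mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `ngram_counts` and `ngram_frequencies`, the choice of n = 2, and the toy sequence are all assumptions made for illustration, and the feature ranking and fuzzy rule extraction steps are not shown.

```python
from collections import Counter


def ngram_counts(sequence, n=2):
    """Count contiguous, overlapping n-grams of residues in one sequence.

    Hypothetical helper: the abstract only says n-gram distributions are
    used as features, not how they are computed.
    """
    seq = sequence.upper()
    return Counter(seq[i:i + n] for i in range(len(seq) - n + 1))


def ngram_frequencies(sequence, n=2):
    """Normalise n-gram counts to relative frequencies (they sum to 1)."""
    counts = ngram_counts(sequence, n)
    total = sum(counts.values())
    if total == 0:
        # Sequence shorter than n: no n-grams, so no features.
        return {}
    return {gram: count / total for gram, count in counts.items()}


# Toy amino acid sequence, made up for this example.
toy_sequence = "MKVLAAGIVKV"
bigram_counts = ngram_counts(toy_sequence, n=2)
bigram_freqs = ngram_frequencies(toy_sequence, n=2)
```

            In a classification setting, one such frequency vector would typically be computed per training sequence, with a later selection step keeping only the most discriminative n-grams.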

              An efficient technique for superfamily classification of amino acid sequences: feature extraction, fuzzy clustering and prototype selection


                Author and book information

                Book Chapter
                2014
                December 15 2013
                Pages: 119-126
                DOI: 10.1007/978-981-4585-18-7_14