
      Natural language processing: state of the art, current trends and challenges

      research-article


          Abstract

          Natural language processing (NLP) has recently gained much attention for representing and analyzing human language computationally. Its applications have spread to fields such as machine translation, email spam detection, information extraction, summarization, medicine, and question answering. In this paper, we first distinguish four phases by discussing different levels of NLP and components of Natural Language Generation, followed by the history and evolution of NLP. We then discuss the state of the art in detail, presenting the various applications of NLP, current trends, and challenges. Finally, we discuss some available datasets, models, and evaluation metrics in NLP.

          Most cited references (62)


          Long Short-Term Memory

          Learning to store information over extended time intervals by recurrent backpropagation takes a very long time, mostly because of insufficient, decaying error backflow. We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient-based method called long short-term memory (LSTM). Truncating the gradient where this does not do harm, LSTM can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units. Multiplicative gate units learn to open and close access to the constant error flow. LSTM is local in space and time; its computational complexity per time step and weight is O(1). Our experiments with artificial data involve local, distributed, real-valued, and noisy pattern representations. In comparisons with real-time recurrent learning, back propagation through time, recurrent cascade correlation, Elman nets, and neural sequence chunking, LSTM leads to many more successful runs, and learns much faster. LSTM also solves complex, artificial long-time-lag tasks that have never been solved by previous recurrent network algorithms.

            LSTM: A Search Space Odyssey

            Several variants of the long short-term memory (LSTM) architecture for recurrent neural networks have been proposed since its inception in 1995. In recent years, these networks have become the state-of-the-art models for a variety of machine learning problems. This has led to a renewed interest in understanding the role and utility of various computational components of typical LSTM variants. In this paper, we present the first large-scale analysis of eight LSTM variants on three representative tasks: speech recognition, handwriting recognition, and polyphonic music modeling. The hyperparameters of all LSTM variants for each task were optimized separately using random search, and their importance was assessed using the powerful functional ANalysis Of VAriance framework. In total, we summarize the results of 5400 experimental runs (≈15 years of CPU time), which makes our study the largest of its kind on LSTM networks. Our results show that none of the variants can improve upon the standard LSTM architecture significantly, and demonstrate the forget gate and the output activation function to be its most critical components. We further observe that the studied hyperparameters are virtually independent and derive guidelines for their efficient adjustment.

              Pfam: multiple sequence alignments and HMM-profiles of protein domains.

              Pfam contains multiple alignments and hidden Markov model based profiles (HMM-profiles) of complete protein domains. The definition of domain boundaries, family members and alignment is done semi-automatically based on expert knowledge, sequence similarity, other protein family databases and the ability of HMM-profiles to correctly identify and align the members. Release 2.0 of Pfam contains 527 manually verified families which are available for browsing and on-line searching via the World Wide Web in the UK at http://www.sanger.ac.uk/Pfam/ and in the US at http://genome.wustl.edu/Pfam/. Pfam 2.0 matches one or more domains in 50% of Swissprot-34 sequences, and 25% of a large sample of predicted proteins from the Caenorhabditis elegans genome.

                Author and article information

                Contributors
                dikshakhurana0509@gmail.com
                adityakoli0010@gmail.com
                kirankhatter@gmail.com
                sukhdev200@gmail.com
                Journal
                Multimed Tools Appl
                Multimedia Tools and Applications
                Springer US (New York)
                1380-7501
                1573-7721
                14 July 2022
                Pages: 1-32
                Affiliations
                [1] Department of Computer Science, Manav Rachna International Institute of Research and Studies, Faridabad, India (GRID grid.449068.7, ISNI 0000 0004 1774 4313)
                [2] Department of Computer Science, BML Munjal University, Gurgaon, India (GRID grid.499297.8, ISNI 0000 0004 4883 3810)
                [3] Department of Statistics, Amity University Punjab, Mohali, India
                Author information
                http://orcid.org/0000-0002-1000-6102
                Article
                13428
                DOI: 10.1007/s11042-022-13428-4
                PMCID: 9281254
                PMID: 35855771
                d3e5ad3a-13eb-46a4-a47b-f337b3c9d18d
                © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2022

                This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.

                History
                Received: 3 February 2021
                Revised: 23 March 2022
                Accepted: 2 July 2022
                Categories
                Article

                Graphics & Multimedia design
                natural language processing, natural language understanding, natural language generation, NLP applications, NLP evaluation metrics
