Natural language processing: state of the art, current trends and challenges

Khurana, Diksha; Koli, Aditya; Khatter, Kiran; Singh, Sukhdev

doi:10.1007/s11042-022-13428-4

research-article

Author(s): Diksha Khurana ¹, Aditya Koli ¹, Kiran Khatter ², Sukhdev Singh ³

Publication date (Electronic): 14 July 2022

Journal: Multimedia Tools and Applications

Publisher: Springer US

Keywords: Natural language processing, Natural language understanding, Natural language generation, NLP applications, NLP evaluation metrics

Abstract

Natural language processing (NLP) has recently gained much attention for representing and analyzing human language computationally. Its applications have spread to fields such as machine translation, email spam detection, information extraction, summarization, medicine, and question answering. In this paper, we first distinguish four phases by discussing the different levels of NLP and the components of Natural Language Generation, followed by the history and evolution of NLP. We then discuss the state of the art in detail, presenting the various applications of NLP, current trends, and challenges. Finally, we discuss some available datasets, models, and evaluation metrics in NLP.

Related collections: Radiology and Natural Language Processing

Most cited references (62)

Long Short-Term Memory

Sepp Hochreiter, Jürgen Schmidhuber (1997)

Learning to store information over extended time intervals by recurrent backpropagation takes a very long time, mostly because of insufficient, decaying error backflow. We briefly review Hochreiter's (1991) analysis of this problem, then address it by introducing a novel, efficient, gradient-based method called long short-term memory (LSTM). Truncating the gradient where this does not do harm, LSTM can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units. Multiplicative gate units learn to open and close access to the constant error flow. LSTM is local in space and time; its computational complexity per time step and weight is O(1). Our experiments with artificial data involve local, distributed, real-valued, and noisy pattern representations. In comparisons with real-time recurrent learning, back propagation through time, recurrent cascade correlation, Elman nets, and neural sequence chunking, LSTM leads to many more successful runs, and learns much faster. LSTM also solves complex, artificial long-time-lag tasks that have never been solved by previous recurrent network algorithms.
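
The gating idea in this abstract can be illustrated with a minimal sketch of a single scalar memory cell. It assumes the formulation described above, with an input gate, an output gate, and no forget gate; the names `lstm_step`, `run`, and the parameter keys are hypothetical and chosen only for illustration. The cell state is carried forward by a self-connection of weight 1.0, which plays the role of the constant error carousel.

```python
import math

def sigmoid(z):
    """Logistic function used for the multiplicative gates."""
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One time step of a single scalar memory cell (assumed: no forget gate)."""
    i = sigmoid(p["w_i"] * x + p["u_i"] * h_prev + p["b_i"])    # input gate
    o = sigmoid(p["w_o"] * x + p["u_o"] * h_prev + p["b_o"])    # output gate
    g = math.tanh(p["w_g"] * x + p["u_g"] * h_prev + p["b_g"])  # candidate value
    c = c_prev + i * g       # state carried over with weight 1.0 (error carousel)
    h = o * math.tanh(c)     # output gate controls what the cell exposes
    return h, c

def run(xs, p, c0=0.0):
    """Feed a sequence through the cell, starting from cell state c0."""
    h, c = 0.0, c0
    for x in xs:
        h, c = lstm_step(x, h, c, p)
    return h, c

# Illustrative parameters: a strongly negative input-gate bias keeps the gate shut.
closed = {"w_i": 1.0, "u_i": 1.0, "b_i": -50.0,
          "w_o": 1.0, "u_o": 1.0, "b_o": 0.0,
          "w_g": 1.0, "u_g": 1.0, "b_g": 0.0}

# A stored value of 0.5 is carried across 1000 steps of alternating input.
h_final, c_final = run([(-1.0) ** t for t in range(1000)], closed, c0=0.5)
```

With the input gate shut, the stored value is not overwritten by the input sequence, which is the behavior the abstract attributes to gated access to the constant error flow. Training by gradient descent is not shown here.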

Cited 6922 times

LSTM: A Search Space Odyssey

Bas Steunebrink, Jan Koutník, Rupesh K Srivastava … (2016)

Several variants of the long short-term memory (LSTM) architecture for recurrent neural networks have been proposed since its inception in 1995. In recent years, these networks have become the state-of-the-art models for a variety of machine learning problems. This has led to a renewed interest in understanding the role and utility of various computational components of typical LSTM variants. In this paper, we present the first large-scale analysis of eight LSTM variants on three representative tasks: speech recognition, handwriting recognition, and polyphonic music modeling. The hyperparameters of all LSTM variants for each task were optimized separately using random search, and their importance was assessed using the powerful functional ANalysis Of VAriance framework. In total, we summarize the results of 5400 experimental runs (≈ 15 years of CPU time), which makes our study the largest of its kind on LSTM networks. Our results show that none of the variants can improve upon the standard LSTM architecture significantly, and demonstrate the forget gate and the output activation function to be its most critical components. We further observe that the studied hyperparameters are virtually independent and derive guidelines for their efficient adjustment.

Cited 771 times

Pfam: multiple sequence alignments and HMM-profiles of protein domains.

E. Sonnhammer, S. Eddy, E. Birney … (1998)

Pfam contains multiple alignments and hidden Markov model based profiles (HMM-profiles) of complete protein domains. The definition of domain boundaries, family members and alignment is done semi-automatically based on expert knowledge, sequence similarity, other protein family databases and the ability of HMM-profiles to correctly identify and align the members. Release 2.0 of Pfam contains 527 manually verified families which are available for browsing and on-line searching via the World Wide Web in the UK at http://www.sanger.ac.uk/Pfam/ and in the US at http://genome.wustl.edu/Pfam/. Pfam 2.0 matches one or more domains in 50% of Swissprot-34 sequences, and 25% of a large sample of predicted proteins from the Caenorhabditis elegans genome.

Cited 185 times

Author and article information

Contributors

Diksha Khurana: dikshakhurana0509@gmail.com

Aditya Koli: adityakoli0010@gmail.com

Kiran Khatter: kirankhatter@gmail.com (ORCID: http://orcid.org/0000-0002-1000-6102)

Sukhdev Singh: sukhdev200@gmail.com

Journal

Journal ID (nlm-ta): Multimed Tools Appl

Journal ID (iso-abbrev): Multimed Tools Appl

Title: Multimedia Tools and Applications

Publisher: Springer US (New York)

ISSN (Print): 1380-7501

ISSN (Electronic): 1573-7721

Publication date (Electronic): 14 July 2022

Pages: 1-32

Affiliations

[1] GRID grid.449068.7, ISNI 0000 0004 1774 4313, Department of Computer Science, Manav Rachna International Institute of Research and Studies, Faridabad, India

[2] GRID grid.499297.8, ISNI 0000 0004 4883 3810, Department of Computer Science, BML Munjal University, Gurgaon, India

[3] Department of Statistics, Amity University Punjab, Mohali, India

Author information

Kiran Khatter http://orcid.org/0000-0002-1000-6102

Article

Publisher ID: 13428

DOI: 10.1007/s11042-022-13428-4

PMC ID: 9281254

PubMed ID: 35855771

SO-VID: d3e5ad3a-13eb-46a4-a47b-f337b3c9d18d

License:

This article is made available via the PMC Open Access Subset for unrestricted research re-use and secondary analysis in any form or by any means with acknowledgement of the original source. These permissions are granted for the duration of the World Health Organization (WHO) declaration of COVID-19 as a global pandemic.

History

Date received: 3 February 2021

Date revision received: 23 March 2022

Date accepted: 2 July 2022
