Two biomedical sublanguages: a description based on the theories of Zellig Harris

Abstract

Natural language processing (NLP) systems have been developed to provide access to the large body of data and knowledge available in the biomedical domain as natural language text. These systems are valuable because they can encode and accumulate the information in the text so that other automated processes can use it to improve patient care and our understanding of disease processes and treatments. Zellig Harris proposed a theory of sublanguage that laid the foundation for natural language processing in specialized domains. He hypothesized that the informational content and structure of a domain form a specialized language that can be delineated as a sublanguage grammar. A language processor can then use the grammar to capture and encode the salient information and relations in text. In this paper, we briefly summarize his language and sublanguage theories. We also summarize our prior research on the sublanguage grammars we developed for two different biomedical domains. These grammars illustrate how Harris' theories provide a basis for developing language processing systems in the biomedical domain. The two domains discussed, each with its associated sublanguage, are the clinical domain, where the text consists of patient reports, and the biomolecular domain, where the text consists of complete journal articles.

Author and article information

Journal

Title: Journal of Biomedical Informatics

Abbreviated Title: Journal of Biomedical Informatics

Publisher: Elsevier BV

ISSN (Print): 1532-0464

Publication date Created: August 2002

Publication date (Print): August 2002

Volume: 35

Issue: 4

Pages: 222-235

Article

DOI: 10.1016/S1532-0464(03)00012-1

PubMed ID: 12755517

SO-VID: 5f24ca27-c0a1-4894-aae1-e8d9513ca9c1

License:

https://www.elsevier.com/tdm/userlicense/1.0/

https://www.elsevier.com/open-access/userlicense/1.0/
