
      Emergent linguistic structure in artificial neural networks trained by self-supervision

          Abstract

          This paper explores the knowledge of linguistic structure learned by large artificial neural networks, trained via self-supervision, whereby the model simply tries to predict a masked word in a given context. Human language communication is via sequences of words, but language understanding requires constructing rich hierarchical structures that are never observed explicitly. The mechanisms for this have been a prime mystery of human language acquisition, while engineering work has mainly proceeded by supervised learning on treebanks of sentences hand labeled for this latent structure. However, we demonstrate that modern deep contextual language models learn major aspects of this structure, without any explicit supervision. We develop methods for identifying linguistic hierarchical structure emergent in artificial neural networks and demonstrate that components in these models focus on syntactic grammatical relationships and anaphoric coreference. Indeed, we show that a linear transformation of learned embeddings in these models captures parse tree distances to a surprising degree, allowing approximate reconstruction of the sentence tree structures normally assumed by linguists. These results help explain why these models have brought such large improvements across many language-understanding tasks.
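          The "linear transformation of learned embeddings" mentioned in the abstract refers to a structural-probe-style analysis: a learned linear map is fit so that squared L2 distances between projected contextual word vectors approximate parse-tree distances between the corresponding words. Below is a minimal sketch under that assumption, in PyTorch; the embedding dimension, probe rank, and gold tree distances are illustrative placeholders, not the paper's actual configuration.

import torch

class StructuralProbe(torch.nn.Module):
    def __init__(self, model_dim, probe_rank):
        super().__init__()
        # B projects contextual embeddings into a lower-dimensional probe space.
        self.B = torch.nn.Parameter(torch.randn(model_dim, probe_rank) * 0.01)

    def forward(self, embeddings):
        # embeddings: (seq_len, model_dim) contextual vectors for one sentence.
        projected = embeddings @ self.B                        # (seq_len, probe_rank)
        diffs = projected.unsqueeze(1) - projected.unsqueeze(0)
        # Squared L2 distance between every pair of projected word vectors,
        # used as the prediction of their parse-tree distance.
        return (diffs ** 2).sum(-1)                            # (seq_len, seq_len)

# Hypothetical training step on placeholder data: fit B so that predicted
# distances match gold parse-tree distances (edge counts between words).
probe = StructuralProbe(model_dim=768, probe_rank=128)
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)

embeddings = torch.randn(12, 768)                       # stand-in for BERT-style vectors
gold_distances = torch.randint(1, 6, (12, 12)).float()  # stand-in for tree distances

predicted = probe(embeddings)
loss = torch.abs(predicted - gold_distances).mean()
loss.backward()
optimizer.step()

          Once a probe of this form is trained on real contextual embeddings and gold treebank distances, a minimum spanning tree over the predicted pairwise distances gives the approximate sentence tree structure described in the abstract.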

          Most cited references (25)

          • A Fast and Accurate Dependency Parser using Neural Networks
          • Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies
          • What Does BERT Look at? An Analysis of BERT’s Attention

                Author and article information

                Journal: Proceedings of the National Academy of Sciences (Proc Natl Acad Sci USA)
                ISSN (print): 0027-8424
                ISSN (electronic): 1091-6490
                Published: June 03 2020
                Article: 201907367
                DOI: 10.1073/pnas.1907367117
                PMCID: 7720155
                PMID: 32493748
                © 2020

                Free to read
                License: https://www.pnas.org/site/aboutpnas/licenses.xhtml
