
      Emergent linguistic structure in artificial neural networks trained by self-supervision

Research Article


          Abstract

          This paper explores the knowledge of linguistic structure learned by large artificial neural networks, trained via self-supervision, whereby the model simply tries to predict a masked word in a given context. Human language communication is via sequences of words, but language understanding requires constructing rich hierarchical structures that are never observed explicitly. The mechanisms for this have been a prime mystery of human language acquisition, while engineering work has mainly proceeded by supervised learning on treebanks of sentences hand labeled for this latent structure. However, we demonstrate that modern deep contextual language models learn major aspects of this structure, without any explicit supervision. We develop methods for identifying linguistic hierarchical structure emergent in artificial neural networks and demonstrate that components in these models focus on syntactic grammatical relationships and anaphoric coreference. Indeed, we show that a linear transformation of learned embeddings in these models captures parse tree distances to a surprising degree, allowing approximate reconstruction of the sentence tree structures normally assumed by linguists. These results help explain why these models have brought such large improvements across many language-understanding tasks.
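          To make the probing idea in the abstract concrete, below is a minimal sketch of a structural probe of the kind described: a single linear map B is fit so that squared L2 distances between transformed word embeddings approximate pairwise parse-tree distances. This is an illustration written against that description, not the authors' released implementation; the class name, the random embeddings standing in for contextual model outputs, and the toy tree distances are all hypothetical.

```python
# Minimal structural-probe sketch (illustrative, not the paper's code).
# A linear map B is trained so that squared L2 distances between
# transformed embeddings approximate parse-tree distances d_T(w_i, w_j).
import torch

class StructuralProbe(torch.nn.Module):
    def __init__(self, model_dim: int, probe_rank: int = 64):
        super().__init__()
        # B projects model_dim-dimensional embeddings into probe_rank dims.
        self.B = torch.nn.Parameter(torch.randn(model_dim, probe_rank) * 0.01)

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
        # embeddings: (seq_len, model_dim) for one sentence.
        transformed = embeddings @ self.B              # (seq_len, rank)
        diffs = transformed[:, None, :] - transformed[None, :, :]
        return (diffs ** 2).sum(-1)                    # predicted squared distances

# Toy training loop: random tensors stand in for a masked language
# model's contextual embeddings and for gold parse-tree distances.
torch.manual_seed(0)
seq_len, model_dim = 6, 768
embeddings = torch.randn(seq_len, model_dim)
tree_dists = torch.randint(1, 5, (seq_len, seq_len)).float()
tree_dists = (tree_dists + tree_dists.T) / 2           # symmetrize
tree_dists.fill_diagonal_(0)                           # zero self-distance

probe = StructuralProbe(model_dim)
opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
for _ in range(100):
    opt.zero_grad()
    pred = probe(embeddings)
    loss = (pred - tree_dists).abs().mean()            # L1 loss on distances
    loss.backward()
    opt.step()
```

          In the setting the abstract describes, the embeddings would come from a pretrained self-supervised model and the target distances from gold parse trees; the random tensors here only demonstrate the optimization loop.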


                Author and article information

                Journal
                Proceedings of the National Academy of Sciences of the United States of America (PNAS)
                Publisher: National Academy of Sciences
                ISSN: 0027-8424 (print); 1091-6490 (electronic)
                Published: 3 June 2020 (online); 1 December 2020 (issue)
                Volume: 117, Issue: 48, Pages: 30046-30054
                Affiliations
                [1] Computer Science Department, Stanford University, Stanford, CA 94305
                [2] Facebook Artificial Intelligence Research, Facebook Inc., Seattle, WA 98109
                Author notes
                [1] To whom correspondence may be addressed. Email: manning@cs.stanford.edu.

                Edited by Matan Gavish, Hebrew University of Jerusalem, Jerusalem, Israel, and accepted by Editorial Board Member David L. Donoho April 13, 2020 (received for review June 3, 2019)

                Author contributions: C.D.M., K.C., J.H., U.K., and O.L. designed research; K.C., J.H., and U.K. performed research; and C.D.M., K.C., J.H., U.K., and O.L. wrote the paper.

                Author information
                ORCID: http://orcid.org/0000-0001-6155-649X
                ORCID: http://orcid.org/0000-0003-1320-6633
                Article
                PMCID: PMC7720155
                DOI: 10.1073/pnas.1907367117
                PMID: 32493748
                Copyright © 2020

                Published under the PNAS license.

                Page count: 10
                Funding
                Funded by: Tencent Corp. (gift). Award recipients: Christopher D. Manning, John Hewitt.
                Funded by: Google (fellowship). Award recipient: Kevin Clark.
                Categories
                Colloquium on the Science of Deep Learning
                Physical Sciences; Computer Sciences

                Keywords: syntax, artificial neural network, learning, self-supervision
