
      Emergent linguistic structure in artificial neural networks trained by self-supervision

Research Article


          Abstract

          This paper explores the knowledge of linguistic structure learned by large artificial neural networks, trained via self-supervision, whereby the model simply tries to predict a masked word in a given context. Human language communication is via sequences of words, but language understanding requires constructing rich hierarchical structures that are never observed explicitly. The mechanisms for this have been a prime mystery of human language acquisition, while engineering work has mainly proceeded by supervised learning on treebanks of sentences hand labeled for this latent structure. However, we demonstrate that modern deep contextual language models learn major aspects of this structure, without any explicit supervision. We develop methods for identifying linguistic hierarchical structure emergent in artificial neural networks and demonstrate that components in these models focus on syntactic grammatical relationships and anaphoric coreference. Indeed, we show that a linear transformation of learned embeddings in these models captures parse tree distances to a surprising degree, allowing approximate reconstruction of the sentence tree structures normally assumed by linguists. These results help explain why these models have brought such large improvements across many language-understanding tasks.
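          To make the probing idea in the abstract concrete, below is a minimal sketch of a structural probe of the kind described: a single linear map B is fit so that squared L2 distances between transformed word embeddings approximate pairwise parse-tree distances. This is an illustration written against that description, not the authors' released implementation; the class name, the random embeddings standing in for contextual model outputs, and the toy tree distances are all hypothetical.

```python
# Minimal structural-probe sketch (illustrative, not the paper's code).
# A linear map B is trained so that squared L2 distances between
# transformed embeddings approximate parse-tree distances d_T(w_i, w_j).
import torch

class StructuralProbe(torch.nn.Module):
    def __init__(self, model_dim: int, probe_rank: int = 64):
        super().__init__()
        # B projects model_dim-dimensional embeddings into probe_rank dims.
        self.B = torch.nn.Parameter(torch.randn(model_dim, probe_rank) * 0.01)

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
        # embeddings: (seq_len, model_dim) for one sentence.
        transformed = embeddings @ self.B              # (seq_len, rank)
        diffs = transformed[:, None, :] - transformed[None, :, :]
        return (diffs ** 2).sum(-1)                    # predicted squared distances

# Toy training loop: random tensors stand in for a masked language
# model's contextual embeddings and for gold parse-tree distances.
torch.manual_seed(0)
seq_len, model_dim = 6, 768
embeddings = torch.randn(seq_len, model_dim)
tree_dists = torch.randint(1, 5, (seq_len, seq_len)).float()
tree_dists = (tree_dists + tree_dists.T) / 2           # symmetrize
tree_dists.fill_diagonal_(0)                           # zero self-distance

probe = StructuralProbe(model_dim)
opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
for _ in range(100):
    opt.zero_grad()
    pred = probe(embeddings)
    loss = (pred - tree_dists).abs().mean()            # L1 loss on distances
    loss.backward()
    opt.step()
```

          In the setting the abstract describes, the embeddings would come from a pretrained self-supervised model and the target distances from gold parse trees; the random tensors here only demonstrate the optimization loop.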


                Author and article information

                Journal
                Proceedings of the National Academy of Sciences of the United States of America (PNAS)
                Publisher: National Academy of Sciences
                ISSN: 0027-8424 (print); 1091-6490 (electronic)
                Published: 3 June 2020 (online); 1 December 2020 (issue)
                Volume: 117, Issue: 48, Pages: 30046-30054
                Affiliations
                [1] Computer Science Department, Stanford University, Stanford, CA 94305
                [2] Facebook Artificial Intelligence Research, Facebook Inc., Seattle, WA 98109
                Author notes
                [1] To whom correspondence may be addressed. Email: manning@cs.stanford.edu.

                Edited by Matan Gavish, Hebrew University of Jerusalem, Jerusalem, Israel, and accepted by Editorial Board Member David L. Donoho April 13, 2020 (received for review June 3, 2019)

                Author contributions: C.D.M., K.C., J.H., U.K., and O.L. designed research; K.C., J.H., and U.K. performed research; and C.D.M., K.C., J.H., U.K., and O.L. wrote the paper.

                Author information
                ORCID: http://orcid.org/0000-0001-6155-649X
                ORCID: http://orcid.org/0000-0003-1320-6633
                Article
                PMCID: PMC7720155
                DOI: 10.1073/pnas.1907367117
                PMID: 32493748
                Copyright © 2020

                Published under the PNAS license.

                Page count: 10
                Funding
                Funded by: Tencent Corp. (gift). Award recipients: Christopher D. Manning, John Hewitt.
                Funded by: Google (fellowship). Award recipient: Kevin Clark.
                Categories
                Colloquium on the Science of Deep Learning
                Physical Sciences; Computer Sciences

                Keywords: syntax, artificial neural network, learning, self-supervision
