
      The Entropy of Words—Learnability and Expressivity across More than 1000 Languages

Christian Bentz, Dimitrios Alikaniotis, Michael Cysouw, Ramon Ferrer-i-Cancho
Entropy (MDPI AG), open access


Most cited references (27)


Prediction and Entropy of Printed English

C. E. Shannon (1951)

Compression and communication in the cultural evolution of linguistic structure

S. Kirby et al. (2015), open access

            Language exhibits striking systematic structure. Words are composed of combinations of reusable sounds, and those words in turn are combined to form complex sentences. These properties make language unique among natural communication systems and enable our species to convey an open-ended set of messages. We provide a cultural evolutionary account of the origins of this structure. We show, using simulations of rational learners and laboratory experiments, that structure arises from a trade-off between pressures for compressibility (imposed during learning) and expressivity (imposed during communication). We further demonstrate that the relative strength of these two pressures can be varied in different social contexts, leading to novel predictions about the emergence of structured behaviour in the wild.
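The trade-off in this abstract can be made concrete with toy lexicons. Below is a minimal sketch (my own illustration, not the study's model or measures): it scores three hypothetical languages over a small meaning space, using signal uniqueness as a stand-in for expressivity and the zlib-compressed size of the lexicon as a stand-in for compressibility.

```python
import itertools
import random
import zlib
from collections import Counter

# Toy meaning space: 3 shapes x 3 colors = 9 meanings.
meanings = list(itertools.product(["circle", "square", "star"],
                                  ["red", "blue", "green"]))

random.seed(0)
letters = "bdgklmnprstvz"

# Three hypothetical languages (meaning -> signal), invented for illustration.
degenerate = {m: "ba" for m in meanings}            # one word for everything
holistic = {m: "".join(random.choice(letters) for _ in range(6))
            for m in meanings}                      # an arbitrary word per meaning
compositional = {m: m[0][:3] + m[1][:3] for m in meanings}  # shape stem + color stem

def expressivity(lang):
    """Fraction of meanings whose signal is unambiguous."""
    counts = Counter(lang.values())
    return sum(counts[s] == 1 for s in lang.values()) / len(lang)

def lexicon_bytes(lang):
    """Crude compressibility proxy: compressed size of the full lexicon.
    More internal regularity means fewer bytes, i.e. easier to learn."""
    return len(zlib.compress(" ".join(lang[m] for m in meanings).encode()))

for name, lang in [("degenerate", degenerate),
                   ("holistic", holistic),
                   ("compositional", compositional)]:
    print(f"{name:13s}  expressivity={expressivity(lang):.2f}  "
          f"compressed_bytes={lexicon_bytes(lang)}")
```

The degenerate lexicon compresses best but cannot discriminate meanings, and the holistic one discriminates everything but compresses worst; only the compositional lexicon is fully expressive while remaining markedly more compressible than the holistic one, which is the resolution of the trade-off the abstract describes.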

Languages cool as they expand: Allometric scaling and the decreasing need for new words

A. M. Petersen et al. (2012), open access

We analyze the occurrence frequencies of over 15 million words recorded in millions of books published during the past two centuries in seven different languages. For all languages and chronological subsets of the data we confirm that two scaling regimes characterize the word frequency distributions, with only the more common words obeying the classic Zipf law. Using corpora of unprecedented size, we test the allometric scaling relation between the corpus size and the vocabulary size of growing languages to demonstrate a decreasing marginal need for new words, a feature that is likely related to the underlying correlations between words. We calculate the annual growth fluctuations of word use, which show a decreasing trend as the corpus size increases, indicating a slowdown in linguistic evolution following language expansion. This “cooling pattern” forms the basis of a third statistical regularity, which, unlike the Zipf and Heaps laws, is dynamical in nature.
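Both regularities invoked here, Zipf's law for word frequencies and the sublinear (Heaps-style, allometric) growth of vocabulary with corpus size, can be eyeballed on any plain-text corpus in a few lines. The sketch below is a rough check under assumptions of my own (whitespace tokenization, a placeholder path corpus.txt), not the paper's methodology.

```python
from collections import Counter

# Placeholder input: any large plain-text file.
with open("corpus.txt", encoding="utf-8") as f:
    words = f.read().lower().split()

# Zipf's law: the r-th most common word has frequency roughly proportional
# to 1/r, so rank * frequency should stay of the same order for common words.
freqs = [count for _, count in Counter(words).most_common()]
for rank in (1, 10, 100, 1000):
    if rank <= len(freqs):
        print(f"rank {rank:5d}: freq {freqs[rank - 1]:8d}  "
              f"rank*freq {rank * freqs[rank - 1]}")

# Heaps' law: vocabulary size V grows sublinearly with corpus size N
# (V ~ N^beta with beta < 1), i.e. a decreasing marginal need for new words.
seen = set()
step = max(1, len(words) // 10)
for n, word in enumerate(words, 1):
    seen.add(word)
    if n % step == 0:
        print(f"N={n:9d}  V={len(seen):7d}  V/N={len(seen) / n:.4f}")
```

On a corpus that follows both laws, the rank*freq column stays roughly constant while V/N falls steadily as N grows; the abstract's "cooling" result concerns a third, dynamical regularity in the year-to-year growth fluctuations, which this quick check does not cover.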

                Author and article information

Journal: Entropy (ENTRFG)
Publisher: MDPI AG
ISSN: 1099-4300
Published: June 14, 2017 (June 2017 issue)
Volume 19, Issue 6, Article 275
DOI: 10.3390/e19060275
Record ID: 0cecaa2f-931c-47fb-8a2d-09d645cd13ba
Copyright: © 2017
License: https://creativecommons.org/licenses/by/4.0/
