      Universal Entropy of Word Ordering Across Linguistic Families

      research-article
      1 , * , 2
      PLoS ONE
      Public Library of Science

          Abstract

          Background

          The language faculty is probably the most distinctive feature of our species, and endows us with a unique ability to exchange highly structured information. In written language, information is encoded by the concatenation of basic symbols under grammatical and semantic constraints. As is also the case in other natural information carriers, the resulting symbolic sequences show a delicate balance between order and disorder. That balance is determined by the interplay between the diversity of symbols and their specific ordering in the sequences. Here we used entropy to quantify the contribution of different organizational levels to the overall statistical structure of language.

          Methodology/Principal Findings

          We computed a relative entropy measure to quantify the degree of ordering in word sequences from languages belonging to several linguistic families. While a direct estimation of the overall entropy of language yielded values that varied for the different families considered, the relative entropy quantifying word ordering presented an almost constant value for all those families.
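
          As a rough illustration of the idea of measuring how much word ordering contributes to the statistical structure of a text, the sketch below compares an order-free word entropy with an entropy that takes the preceding word into account. This is not the estimator used in the article, which the abstract does not specify; it is a crude plug-in proxy based on adjacent word pairs, and the function names (entropy_bits, ordering_information) are hypothetical.

```python
import math
from collections import Counter


def entropy_bits(counts):
    """Plug-in Shannon entropy (in bits) of a table of counts."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def ordering_information(words):
    """Crude proxy for the entropy reduction due to word ordering.

    Assumption: only adjacent-word (bigram) structure is considered, so
    long-range ordering is ignored. The value returned is the plug-in
    mutual information between consecutive words, i.e. the word entropy
    ignoring order minus the entropy of a word given its predecessor.
    """
    pairs = list(zip(words, words[1:]))
    if not pairs:
        raise ValueError("need at least two words")
    h_joint = entropy_bits(Counter(pairs))                # H(W1, W2)
    h_first = entropy_bits(Counter(w for w, _ in pairs))  # H(W1)
    h_second = entropy_bits(Counter(w for _, w in pairs)) # H(W2), order-free
    h_ordered = h_joint - h_first                         # H(W2 | W1)
    return h_second - h_ordered                           # bits per word
```

          Calling ordering_information("a b a b a b a b".split()) gives a value close to the full word entropy, because each word is determined by the one before it, whereas a sequence whose adjacent words are statistically independent gives a value near zero. Plug-in estimates of this kind are biased on short texts, so the sketch is only meant to convey the quantity being compared, not to reproduce the reported results.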

          Conclusions/Significance

          Our results indicate that despite the differences in the structure and vocabulary of the languages analyzed, the impact of word ordering in the structure of language is a statistical linguistic universal.

          Most cited references (45)

          Long-range correlations in nucleotide sequences.

          DNA sequences have been analysed using models, such as an n-step Markov chain, that incorporate the possibility of short-range nucleotide correlations. We propose here a method for studying the stochastic properties of nucleotide sequences by constructing a 1:1 map of the nucleotide sequence onto a walk, which we term a 'DNA walk'. We then use the mapping to provide a quantitative measure of the correlation between nucleotides over long distances along the DNA chain. Thus we uncover in the nucleotide sequence a remarkably long-range power law correlation that implies a new scale-invariant property of DNA. We find such long-range correlations in intron-containing genes and in nontranscribed regulatory DNA sequences, but not in complementary DNA sequences or intron-less genes.

            Prediction and Entropy of Printed English

            C. Shannon (1951)

              Computational and evolutionary aspects of language.

              Language is our legacy. It is the main evolutionary contribution of humans, and perhaps the most interesting trait that has emerged in the past 500 million years. Understanding how Darwinian evolution gives rise to human language requires the integration of formal language theory, learning theory and evolutionary dynamics. Formal language theory provides a mathematical description of language and grammar. Learning theory formalizes the task of language acquisition; it can be shown that no procedure can learn an unrestricted set of languages. Universal grammar specifies the restricted set of languages learnable by the human brain. Evolutionary dynamics can be formulated to describe the cultural evolution of language and the biological evolution of universal grammar.

                Author and article information

                Contributors
                Role: Editor
                Journal
                PLoS ONE
                Public Library of Science (San Francisco, USA)
                ISSN: 1932-6203
                Published: 13 May 2011
                Volume: 6
                Issue: 5
                Article: e19875
                Affiliations
                [1] The University of Manchester, Manchester, United Kingdom
                [2] Consejo Nacional de Investigaciones Científicas y Técnicas, Centro Atómico Bariloche and Instituto Balseiro, San Carlos de Bariloche, Argentina
                Queensland Institute of Medical Research, Australia
                Author notes

                Conceived and designed the experiments: MAM DHZ. Performed the experiments: MAM DHZ. Analyzed the data: MAM DHZ. Contributed reagents/materials/analysis tools: MAM DHZ. Wrote the paper: MAM DHZ.

                Article
                Manuscript: PONE-D-10-01197
                DOI: 10.1371/journal.pone.0019875
                PMCID: PMC3094390
                PMID: 21603637
                75acd587-5465-4af7-90ea-b2b318ca02d7
                Montemurro, Zanette. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
                History
                Received: 24 August 2010
                Accepted: 19 April 2011
                Page count
                Pages: 9
                Categories
                Research Article
                Biology
                Computational Biology
                Theoretical Biology
                Engineering
                Signal Processing
                Data Mining
                Statistical Signal Processing
                Mathematics
                Applied Mathematics
                Complex Systems
                Probability Theory
                Statistical Distributions
                Statistics
                Statistical Methods
                Physics
                Interdisciplinary Physics
                Physical Laws and Principles
                Thermodynamics
                Entropy
                Social and Behavioral Sciences
                Linguistics
                Computational Linguistics