      A Topic Modeling Comparison Between LDA, NMF, Top2Vec, and BERTopic to Demystify Twitter Posts

      methods-article
      Frontiers in Sociology
      Frontiers Media S.A.
      topic model, machine learning, LDA, Top2Vec, BERTopic, NMF, Twitter, covid travel


          Abstract

          The richness of social media data has opened a new avenue for social science research to gain insights into human behaviors and experiences. In particular, emerging data-driven approaches relying on topic models provide entirely new perspectives on interpreting social phenomena. However, the short, text-heavy, and unstructured nature of social media content often leads to methodological challenges in both data collection and analysis. To bridge the developing field of computational science and empirical social research, this study evaluates the performance of four topic modeling techniques, namely latent Dirichlet allocation (LDA), non-negative matrix factorization (NMF), Top2Vec, and BERTopic. In view of the interplay between human relations and digital media, this research takes Twitter posts as the reference point and assesses the strengths and weaknesses of the different algorithms in a social science context. Based on certain details during the analytical procedures and on quality issues, this research sheds light on the efficacy of using BERTopic and NMF to analyze Twitter data.


                Author and article information

                Journal
                Title: Frontiers in Sociology (Front. Sociol.)
                Publisher: Frontiers Media S.A.
                ISSN: 2297-7775
                Published: 06 May 2022
                Volume: 7
                Article number: 886498
                Affiliations
                [1] Innovation and Management in Tourism, Salzburg University of Applied Sciences, Salzburg, Austria
                [2] Department of Tourism and Service Management, Modul University Vienna, Vienna, Austria
                Author notes

                Edited by: Dimitri Prandner, Johannes Kepler University of Linz, Austria

                Reviewed by: Tobias Wolbring, University of Erlangen Nuremberg, Germany; Ruben Bach, University of Mannheim, Germany

                *Correspondence: Joanne Yu, joanne.yu@modul.ac.at

                This article was submitted to Sociological Theory, a section of the journal Frontiers in Sociology

                Article
                DOI: 10.3389/fsoc.2022.886498
                PMC ID: 9120935
                PubMed ID: 35602001
                Record ID: 5f2df5e4-e351-42b2-9112-240de5c07f98
                Copyright © 2022 Egger and Yu.

                This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

                History
                Received: 28 February 2022
                Accepted: 19 April 2022
                Page count
                Figures: 5, Tables: 5, Equations: 0, References: 74, Pages: 16, Words: 11609
                Categories
                Sociology
                Methods