
      Evaluating automatic annotation of lexicon-based models for stance detection of M-pox tweets from May 1st to Sep 5th, 2022

      research-article


          Abstract

          Manually labeling data for supervised learning is time- and energy-consuming; therefore, lexicon-based models such as VADER and TextBlob are used to label data automatically. However, it has been argued that automated labels lack the accuracy required for training an efficient model. Although automated labeling is frequently used for stance detection, automated stance labels have not been properly evaluated in previous work. In this work, to assess the accuracy of VADER and TextBlob automated labels for stance analysis, we first manually label a Twitter (now X) dataset related to M-pox stance detection. We then fine-tune different transformer-based models on the hand-labeled M-pox dataset and compare their accuracy, before and after fine-tuning, with the accuracy of the automatically labeled data. Our results indicate that the fine-tuned models surpassed the accuracy of VADER and TextBlob automated labels by up to 38% and 72.5%, respectively. Topic modeling further shows that fine-tuning narrowed the misclassified tweets to specific sub-topics. We conclude that fine-tuning transformer models on hand-labeled data for stance detection yields accuracy significantly higher than that of automated stance detection labels. This study verifies that automated stance detection labels are not reliable for sensitive use cases such as health-related purposes. Manually labeled data is more suitable for developing Natural Language Processing (NLP) models that study and analyze mass opinions and conversations on social media platforms during crises such as pandemics and epidemics.
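The evaluation idea in the abstract, scoring automated labels against hand labels, can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy lexicon, the threshold, the sentiment-to-stance mapping, the stance label names, and the four example tweets are all invented for illustration, and a real study would use VADER or TextBlob scores and the hand-labeled dataset in their place.

```python
# Hypothetical toy lexicon (word -> polarity). Real lexicons such as the ones
# behind VADER and TextBlob are far larger and scored differently.
TOY_LEXICON = {
    "safe": 1.0, "effective": 1.0,
    "hoax": -1.0, "dangerous": -1.0,
}

# Assumed mapping from sentiment polarity to stance; the label names are illustrative.
SENTIMENT_TO_STANCE = {"positive": "favor", "negative": "against", "neutral": "neutral"}


def lexicon_sentiment(text, threshold=0.05):
    """Sum the polarities of known words and map the score to a sentiment class."""
    tokens = (tok.strip(".,!?").lower() for tok in text.split())
    score = sum(TOY_LEXICON.get(tok, 0.0) for tok in tokens)
    if score >= threshold:
        return "positive"
    if score <= -threshold:
        return "negative"
    return "neutral"


def accuracy(predicted, gold):
    """Fraction of items whose predicted label equals the hand label."""
    if len(predicted) != len(gold):
        raise ValueError("label lists must have the same length")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Invented example tweets with invented hand labels.
tweets = [
    "The vaccine is safe and effective",
    "Mpox is a hoax",
    "It is dangerous to skip the vaccine",
    "Cases reported in three cities today",
]
hand_labels = ["favor", "against", "favor", "neutral"]

auto_labels = [SENTIMENT_TO_STANCE[lexicon_sentiment(t)] for t in tweets]
print(f"automated-label accuracy: {accuracy(auto_labels, hand_labels):.2f}")
```

The third tweet shows why sentiment polarity can be a poor proxy for stance: the word "dangerous" gives it a negative score, yet the hand label marks it as supportive of vaccination.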

          Author summary

          Social media platforms are pivotal in shaping public opinion during health crises, influencing policy-making and crisis management. Challenges such as labor-intensive manual labeling and dataset biases highlight the need for optimized stance detection methods. Our study assessed VADER and TextBlob for stance detection during the M-pox outbreak on social media, comparing their automated labels with our manually labeled data. Transformer-based models consistently outperformed lexicon-based approaches, showing significant improvements both before and after fine-tuning. Specifically, models pre-trained on the COVID-19 tweets demonstrated over a 20% enhancement in accurately classifying M-pox tweets. Through topic modeling of misclassified tweets, nuanced sub-topics in M-pox discussions were identified, highlighting the value of integrating multi-modal data and using hand-labeled datasets for comprehensive sentiment analysis across platforms and contexts. Policymakers and healthcare authorities can utilize these insights to craft precise communication strategies, combat misinformation, and address public concerns effectively. Advancements in machine learning for health-related stance detection hold promise for optimizing crisis management and informing evidence-based policy-making during emerging epidemics and pandemics, with implications for future research and policy development.


                Author and article information

                Contributors
                Roles: Conceptualization, Visualization, Writing – original draft, Writing – review & editing
                Roles: Conceptualization, Formal analysis, Validation, Writing – original draft
                Roles: Project administration, Resources, Software
                Roles: Data curation, Formal analysis, Methodology, Resources
                Roles: Conceptualization, Funding acquisition, Investigation, Project administration, Resources, Software, Visualization
                Roles: Conceptualization, Formal analysis, Funding acquisition, Investigation, Methodology, Validation, Visualization
                Roles: Conceptualization, Data curation, Investigation, Project administration, Software, Validation, Writing – review & editing
                Roles: Conceptualization, Investigation, Methodology, Resources, Software, Supervision, Validation, Visualization
                Roles: Formal analysis, Investigation, Resources, Software, Validation, Visualization, Writing – review & editing
                Roles: Conceptualization, Investigation, Methodology, Resources, Software, Validation, Visualization
                Roles: Conceptualization, Data curation, Formal analysis, Methodology, Resources, Validation, Visualization, Writing – review & editing
                Role: Editor
                Journal
                PLOS Digital Health (PLOS Digit Health)
                Publisher: Public Library of Science (San Francisco, CA, USA)
                ISSN: 2767-3170
                Collection date: July 2024
                Published: 30 July 2024
                Volume: 3
                Issue: 7
                Article number: e0000545
                Affiliations
                [1] School of Physics and Institute for Collider Particle Physics, University of the Witwatersrand, Johannesburg, South Africa
                [2] iThemba LABS, National Research Foundation, Cape Town, South Africa
                [3] Africa-Canada Artificial Intelligence and Data Innovation Consortium (ACADIC), York University, Toronto, Canada
                [4] Laboratory for Industrial and Applied Mathematics, York University, Toronto, Canada
                [5] Department of Computer Science, Brock University, St. Catharines, Niagara Region, Ontario, Canada
                [6] Department of Mathematics, Bahen Center for Information Technology, University of Toronto, Canada
                [7] Global South Artificial Intelligence for Pandemic Preparedness and Response Network (AI4PEP), York University, Toronto, Canada
                [8] Artificial Intelligence & Mathematical Modeling Lab (AIMMLab), Dalla Lana School of Public Health, University of Toronto, Canada
                Dalhousie University Faculty of Computer Science, CANADA
                Author notes

                The authors have declared that no competing interests exist.

                Author information
                https://orcid.org/0000-0002-8963-4290
                https://orcid.org/0000-0002-7557-5672
                Article
                Manuscript ID: PDIG-D-23-00389
                DOI: 10.1371/journal.pdig.0000545
                PMCID: PMC11288444
                PMID: 39078813
                42978ded-dd1e-43a1-b713-af426f774a33
                © 2024 Perikli et al

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

                History
                Received: 23 October 2023
                Accepted: 3 June 2024
                Page count
                Figures: 2, Tables: 5, Pages: 14
                Funding
                Funded by: International Development Research Centre (funder ID: http://dx.doi.org/10.13039/501100000193)
                Funded by: Swedish International Development Cooperation Agency (SIDA)
                Award ID: 109559-001
                JDK acknowledges both Canada’s International Development Research Centre (IDRC), and the Swedish International Development Cooperation Agency (SIDA) (Grant No. 109559-001) for funding this research. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
                Categories
                Research Article
                Medicine and Health Sciences > Medical Conditions > Infectious Diseases > Viral Diseases > Covid 19
                Social Sciences > Sociology > Communications > Social Communication > Social Media > Twitter
                Computer and Information Sciences > Network Analysis > Social Networks > Social Media > Twitter
                Social Sciences > Sociology > Social Networks > Social Media > Twitter
                Medicine and Health Sciences > Epidemiology > Pandemics
                Social Sciences > Sociology > Communications > Social Communication > Social Media
                Computer and Information Sciences > Network Analysis > Social Networks > Social Media
                Social Sciences > Sociology > Social Networks > Social Media
                Medicine and Health Sciences > Medical Conditions > Infectious Diseases > Infectious Disease Control > Vaccines
                Medicine and Health Sciences > Diagnostic Medicine > Virus Testing
                Medicine and Health Sciences > Health Care > Health Care Policy
                Computer and Information Sciences > Artificial Intelligence > Machine Learning
                Custom metadata
                Due to Twitter's developer policy, only Tweet IDs can be shared with the public. All our data are available as a supplementary file to this manuscript (S1 File).
                COVID-19
