      Is Open Access

      Domain Classification-based Source-specific Term Penalization for Domain Adaptation in Hate-speech Detection

      Preprint

          Abstract

          State-of-the-art approaches to hate-speech detection usually perform poorly in out-of-domain settings. This typically occurs because classifiers overemphasize source-specific information, which harms their domain invariance. Prior work has attempted to penalize hate-speech-related terms drawn from manually curated lists, using feature-attribution methods that quantify the importance the classifier assigns to each input term when making a prediction. We instead propose a domain adaptation approach that automatically extracts and penalizes source-specific terms by combining a domain classifier, which learns to differentiate between domains, with feature-attribution scores for the hate-speech classes. The approach yields consistent improvements in cross-domain evaluation.
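
The idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: it swaps in bag-of-words logistic regression for both classifiers and uses the learned weight of a term as its attribution score (for a linear model, weight times input). The penalty is an L2 term on the weights of the extracted terms, a simple stand-in for penalizing their attributions. All names (`train_logreg`, `source_specific_terms`, `vocab`) and all data are hypothetical.

```python
import numpy as np

def train_logreg(X, y, penalty_mask=None, lam=0.0, lr=0.5, steps=500):
    """Logistic regression trained by plain gradient descent.

    Weights of features flagged in penalty_mask are pulled toward zero
    with strength lam (assumed stand-in for an attribution penalty).
    """
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    mask = np.zeros(d) if penalty_mask is None else penalty_mask.astype(float)
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w -= lr * (X.T @ (p - y) / n + lam * mask * w)
        b -= lr * np.mean(p - y)
    return w, b

def source_specific_terms(w_domain, w_hate, k):
    """Return up to k terms that indicate both the source domain and the hate class."""
    score = np.maximum(w_domain, 0.0) * np.maximum(w_hate, 0.0)
    top = np.argsort(-score)[:k]
    return [int(i) for i in top if score[i] > 0]

# Toy vocabulary: "forumtag" occurs only in the source domain and
# co-occurs with the hate label there; "slur" is a genuine signal.
vocab = ["slur", "forumtag", "weather", "match"]
X_src = np.array([[1, 1, 0, 0], [1, 1, 1, 0], [0, 1, 0, 0], [1, 1, 0, 1],
                  [0, 0, 1, 0], [0, 0, 0, 1], [0, 0, 1, 1], [0, 0, 1, 0]], float)
y_src = np.array([1, 1, 1, 1, 0, 0, 0, 0], float)
X_tgt = np.array([[1, 0, 0, 0], [0, 0, 1, 0], [1, 0, 0, 1], [0, 0, 1, 1]], float)

# 1) Domain classifier: source (1) versus target (0).
X_all = np.vstack([X_src, X_tgt])
y_dom = np.concatenate([np.ones(len(X_src)), np.zeros(len(X_tgt))])
w_dom, _ = train_logreg(X_all, y_dom)

# 2) Baseline hate-speech classifier trained on the source domain only.
w_base, _ = train_logreg(X_src, y_src)

# 3) Extract source-specific terms automatically, then retrain with a penalty.
terms = source_specific_terms(w_dom, w_base, k=1)
mask = np.zeros(len(vocab))
mask[terms] = 1.0
w_pen, b_pen = train_logreg(X_src, y_src, penalty_mask=mask, lam=1.0)

print("penalized terms:", [vocab[i] for i in terms])
```

In this toy setup the penalized classifier relies less on the source-only term than the baseline does, which is the behavior the abstract attributes to penalizing source-specific terms.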

          Author and article information

          Date: 18 September 2022
          arXiv: 2209.08681
          ID: 631cbde6-fe36-4c17-a04c-358a290e36ad
          License: http://creativecommons.org/licenses/by/4.0/

          Custom metadata
          COLING 2022 pre-print
          cs.CL

          Theoretical computer science