24
views
0
recommends
+1 Recommend
0 collections
    0
    shares
      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      D-Nikud: Enhancing Hebrew Diacritization with LSTM and Pretrained Models

      Preprint
      ,

      Read this article at

      Bookmark
          There is no author summary for this article yet. Authors can add summaries to their articles on ScienceOpen to make them more accessible to a non-specialist audience.

          Abstract

          D-Nikud, a novel approach to Hebrew diacritization that integrates the strengths of LSTM networks and BERT-based (transformer) pre-trained model. Inspired by the methodologies employed in Nakdimon, we integrate it with the TavBERT pre-trained model, our system incorporates advanced architectural choices and diverse training data. Our experiments showcase state-of-the-art results on several benchmark datasets, with a particular emphasis on modern texts and more specified diacritization like gender.

          Related collections

          Author and article information

          Journal
          30 January 2024
          Article
          2402.00075
          f3849939-5c5b-419a-b408-3436eb6aafa5

          http://creativecommons.org/licenses/by/4.0/

          History
          Custom metadata
          cs.CL

          Theoretical computer science
          Theoretical computer science

          Comments

          Comment on this article