TSSuBERT: How to Sum Up Multiple Years of Reading in a Few Tweets

Abstract

The development of deep neural networks and the emergence of pre-trained language models such as BERT have increased performance on many NLP tasks. However, these models have not gained the same popularity for tweet stream summarization, most likely because their computational limitations require drastically truncating the textual input.
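
As a concrete illustration of that input-size limitation (our own example, not from the article): a BERT-style encoder such as `bert-base-uncased` accepts at most 512 tokens, so concatenating even a modest tweet stream already overflows the window many times over.

```python
# Quick check of the fixed input window that motivates the paper's concern:
# a concatenated tweet stream vastly exceeds what BERT can ingest at once.
# The model choice and the toy stream are illustrative assumptions.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# A toy stream of 1,000 short tweets (real streams span millions).
stream = [f"tweet number {i} about the ongoing event" for i in range(1000)]
ids = tokenizer(" ".join(stream))["input_ids"]

print(len(ids))                    # roughly 8,000-9,000 tokens for this toy stream
print(tokenizer.model_max_length)  # 512 for bert-base-uncased
# Everything past model_max_length would have to be truncated away.
```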

Our contribution in this article is threefold. First, we propose a neural model that automatically and incrementally summarizes huge tweet streams. This extractive model combines pre-trained language models with vocabulary-frequency-based representations in an original way to predict tweet salience. An additional advantage of the model is that it automatically adapts the size of the output summary to the input tweet stream. Second, we detail an original methodology for constructing tweet stream summarization datasets that requires little human effort. Third, we release the TES 2012-2016 dataset, constructed using this methodology. Baselines, oracle summaries, the gold standard, and qualitative assessments are made publicly available.
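
To make the salience idea more tangible, here is a minimal sketch in the spirit of the description above; it is our illustration, not the authors' architecture. It scores each incoming tweet by combining a mean-pooled BERT embedding with a running vocabulary-frequency novelty signal, and extracts tweets whose score clears a fixed threshold (a stand-in for the paper's adaptive summary-size mechanism). The model choice, pooling, features, weights, and threshold are all assumptions.

```python
# Illustrative sketch of extractive, incremental salience scoring over a
# tweet stream. Not the TSSuBERT architecture: the pooling, the novelty
# feature, the score combination, and the threshold are assumptions.
from collections import Counter

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
encoder.eval()

def embed(tweet: str) -> torch.Tensor:
    """Mean-pooled last-hidden-state embedding of one tweet."""
    inputs = tokenizer(tweet, return_tensors="pt", truncation=True, max_length=64)
    with torch.no_grad():
        hidden = encoder(**inputs).last_hidden_state  # (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0)              # (768,)

def summarize_stream(tweets, threshold=0.6):
    """Incrementally select salient tweets from a stream.

    `threshold` is a fixed stand-in for the paper's mechanism that adapts
    the summary size to the input stream.
    """
    vocab = Counter()   # running token frequencies over the stream
    centroid = None     # running mean of tweet embeddings seen so far
    summary = []
    for i, tweet in enumerate(tweets):
        tokens = tweet.lower().split()
        # Frequency-based novelty: share of tokens unseen so far.
        novelty = sum(1 for t in tokens if vocab[t] == 0) / max(len(tokens), 1)
        vocab.update(tokens)
        emb = embed(tweet)
        # Representativeness: similarity to the running stream centroid.
        rep = 0.0 if centroid is None else torch.cosine_similarity(emb, centroid, dim=0).item()
        centroid = emb if centroid is None else (centroid * i + emb) / (i + 1)
        # Illustrative combination of the two signals.
        salience = 0.5 * rep + 0.5 * novelty
        if salience > threshold:
            summary.append(tweet)
    return summary
```

In the paper, the salience predictor is learned and the summary length adapts to the stream; the fixed weights and threshold here only mark where those components would sit.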

To evaluate our approach, we conducted extensive quantitative experiments on three different tweet collections, complemented by a qualitative evaluation. The results show that our method outperforms state-of-the-art approaches. We believe this work opens avenues of research for incremental summarization, which has received little attention so far.

Author and article information

Journal
ACM Transactions on Information Systems (ACM Trans. Inf. Syst.)
Association for Computing Machinery (ACM)
ISSN: 1046-8188 (print); 1558-2868 (electronic)
Published online: April 10 2023; issue date: October 31 2023
Volume 41, Issue 4, pp. 1-33

Affiliations
[1] IRIT, Université de Toulouse, CNRS, Toulouse INP, UT3, France

Article
DOI: 10.1145/3581786
© 2023
