
      SpeechPrune: Context-aware Token Pruning for Speech Information Retrieval

      Preprint


          Abstract

          We introduce Speech Information Retrieval (SIR), a new long-context task for Speech Large Language Models (Speech LLMs), and present SPIRAL, a 1,012-sample benchmark testing models' ability to extract critical details from approximately 90-second spoken inputs. While current Speech LLMs excel at short-form tasks, they struggle with the computational and representational demands of longer audio sequences. To address this limitation, we propose SpeechPrune, a training-free token pruning strategy that uses speech-text similarity and approximated attention scores to efficiently discard irrelevant tokens. On SPIRAL, at a pruning rate of 20%, SpeechPrune improves accuracy by 29% over the original model and by up to 47% over random pruning; it maintains network performance even at a pruning rate of 80%. This approach highlights the potential of token-level pruning for efficient and scalable long-form speech understanding.
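
          The record page does not reproduce the paper's method details, but the core idea in the abstract (rank speech tokens by their relevance to the text prompt, then drop the lowest-ranked fraction before inference) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name prune_speech_tokens, the cosine-similarity scoring, the softmax stand-in for "approximated attention", and the mean aggregation across text tokens are all choices made here for clarity.

          ```python
          import numpy as np

          def prune_speech_tokens(speech_emb, text_emb, keep_ratio=0.8):
              """Rank speech tokens by relevance to the text prompt; keep the top fraction.

              speech_emb: (S, d) array of speech token embeddings
              text_emb:   (T, d) array of text (prompt) token embeddings
              keep_ratio: fraction of speech tokens retained (20% pruning -> 0.8)
              Returns indices of retained tokens in their original temporal order.
              """
              # Cosine similarity between every speech token and every text token.
              s = speech_emb / (np.linalg.norm(speech_emb, axis=1, keepdims=True) + 1e-8)
              t = text_emb / (np.linalg.norm(text_emb, axis=1, keepdims=True) + 1e-8)
              sim = s @ t.T                                   # (S, T)

              # Hypothetical stand-in for an approximated attention score:
              # a scaled softmax over speech tokens, per text token.
              logits = sim / np.sqrt(speech_emb.shape[1])
              attn = np.exp(logits - logits.max(axis=0, keepdims=True))
              attn /= attn.sum(axis=0, keepdims=True)
              score = attn.mean(axis=1)                       # (S,) relevance per speech token

              # Keep the highest-scoring tokens, preserving temporal order.
              k = max(1, int(keep_ratio * speech_emb.shape[0]))
              return np.sort(np.argsort(score)[-k:])

          # Toy usage: 100 speech tokens, 10 prompt tokens, 256-dim embeddings.
          rng = np.random.default_rng(0)
          speech = rng.standard_normal((100, 256))
          text = rng.standard_normal((10, 256))
          kept = prune_speech_tokens(speech, text, keep_ratio=0.8)  # 20% pruning rate
          print(f"kept {len(kept)} of 100 speech tokens")
          ```

          Because the ranking is computed once from embeddings alone, a scheme like this needs no retraining, which is what makes the abstract's "training-free" claim plausible at high pruning rates.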


          Author and article information

          Date: 16 December 2024
          arXiv ID: 2412.12009

          License: Creative Commons Attribution 4.0 (http://creativecommons.org/licenses/by/4.0/)

          Project page and dataset are available at https://speechprune.github.io/
          arXiv subjects: eess.AS, cs.AI, cs.CL, cs.SD

          Subject areas: Theoretical computer science, Artificial intelligence, Graphics & multimedia design, Electrical engineering
