
      Towards Boosting Many-to-Many Multilingual Machine Translation with Large Language Models

      journal-article


          Abstract

          The training paradigm for machine translation has gradually shifted, from learning neural machine translation (NMT) models with extensive parallel corpora to instruction finetuning on multilingual large language models (LLMs) with high-quality translation pairs. In this paper, we focus on boosting many-to-many multilingual translation of LLMs with an emphasis on zero-shot translation directions. We demonstrate that prompt strategies adopted during finetuning are crucial to zero-shot translation and introduce a cross-lingual consistency regularization, XConST, to bridge the representation gap among different languages and improve zero-shot translation performance. XConST is not a new method, but a version of CrossConST (Gao et al., 2023a) adapted for translation instruction finetuning with LLMs. Experimental results on ALMA (Xu et al., 2023), Tower (Team, 2024), and LLaMA-2 (Touvron et al., 2023) show that our approach consistently improves translation performance. Our implementations are available at https://github.com/gpengzhi/CrossConST-LLM.
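
          Since the abstract only names the technique, here is a minimal, hypothetical PyTorch sketch of what a CrossConST-style consistency regularizer can look like in translation instruction finetuning: the usual cross-entropy loss on the prompt built from the source-language sentence, plus a KL term tying together the output distributions the model produces for two semantically equivalent prompts (one built from the source sentence, one from the reference translation). Everything below is illustrative rather than the authors' implementation: the function name, the alpha default, and the assumption of a Hugging Face-style causal LM whose forward pass returns .loss and .logits, with both prompts padded to equal length. The paper's repository linked above contains the actual code.

          import torch.nn.functional as F

          def xconst_sketch(model, src_input_ids, ref_input_ids, labels, alpha=0.5):
              """Hypothetical CrossConST-style loss; names and defaults are illustrative."""
              # Standard instruction-finetuning pass: translation prompt built from
              # the source-language sentence, cross-entropy against the reference.
              src_out = model(input_ids=src_input_ids, labels=labels)

              # Second pass on the same prompt template with the reference translation
              # substituted for the source sentence (assumed padded to the same length
              # so the two logit tensors align position by position).
              ref_out = model(input_ids=ref_input_ids)

              # KL divergence between the token-level output distributions of the two
              # prompts; minimizing it encourages language-agnostic representations.
              kl = F.kl_div(
                  F.log_softmax(src_out.logits, dim=-1),
                  F.softmax(ref_out.logits, dim=-1),
                  reduction="batchmean",
              )

              # Translation loss plus the weighted consistency term.
              return src_out.loss + alpha * kl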


          Author and article information

          Journal: arXiv
          Published: January 2024
          DOI: 10.48550/ARXIV.2401.05861
          Record ID: f2486350-cdb3-4c6f-ae6a-d173843eadd6

          License: arXiv.org perpetual, non-exclusive license

          History: 11 January 2024; 12 January 2024; 07 February 2024; 08 February 2024

          Subjects: Computation and Language (cs.CL); FOS: Computer and information sciences
