      Gradient Ascent Post-training Enhances Language Model Generalization

      Preprint

          Abstract

          In this work, we empirically show that updating pretrained LMs (350M, 1.3B, 2.7B) with just a few steps of Gradient Ascent Post-training (GAP) on random, unlabeled text corpora enhances their zero-shot generalization capabilities across diverse NLP tasks. Specifically, we show that GAP allows LMs to become comparable to LMs 2-3x their size across 12 different NLP tasks. We also show that applying GAP to out-of-distribution corpora leads to the most reliable performance improvements. Our findings indicate that GAP can be a promising method for improving the generalization capability of LMs without any task-specific fine-tuning.
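
          The abstract does not spell out the update rule, but gradient ascent post-training amounts to running a handful of optimizer steps that maximize (rather than minimize) the standard language-modeling loss on unlabeled text. The sketch below illustrates this idea with PyTorch and the Hugging Face transformers library; the checkpoint, placeholder corpus, learning rate, and step count are illustrative assumptions, not the paper's reported configuration.

          # Minimal sketch of Gradient Ascent Post-training (GAP) as described in the
          # abstract: a few optimizer steps that *maximize* the LM loss on unlabeled text.
          # Checkpoint, corpus, lr, and step count are assumptions for illustration only.
          import torch
          from transformers import AutoModelForCausalLM, AutoTokenizer

          model_name = "facebook/opt-350m"  # assumption: any pretrained causal LM of similar size
          model = AutoModelForCausalLM.from_pretrained(model_name)
          tokenizer = AutoTokenizer.from_pretrained(model_name)
          model.train()

          optimizer = torch.optim.SGD(model.parameters(), lr=1e-5)  # assumed hyperparameters

          # Unlabeled text; the paper reports that out-of-distribution corpora work best.
          # These sentences are placeholders, not the corpora used in the paper.
          texts = [
              "The quick brown fox jumps over the lazy dog.",
              "Stock prices fluctuated sharply after the announcement.",
          ]

          num_steps = 3  # "just a few steps" per the abstract; the exact count is an assumption
          for step in range(num_steps):
              text = texts[step % len(texts)]
              batch = tokenizer(text, return_tensors="pt")
              outputs = model(**batch, labels=batch["input_ids"])
              loss = -outputs.loss       # negate the LM loss, so the step ascends the loss surface
              optimizer.zero_grad()
              loss.backward()
              optimizer.step()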


          Author and article information

          Journal
          Publication date: 12 June 2023
          Article
          arXiv: 2306.07052
          Record ID: 6ce486b5-6ef9-4041-8665-f5fb50c54485

          License: http://creativecommons.org/licenses/by/4.0/

          Custom metadata
          ACL 2023 Main Conference (Short Paper)
          Subject classes: cs.CL cs.AI

          Keywords: Theoretical computer science, Artificial intelligence
