
      MaPPing Your Model: Assessing the Impact of Adversarial Attacks on LLM-based Programming Assistants

      Preprint (Open Access)


          Abstract

          LLM-based programming assistants offer the promise of faster programming, but at the risk of introducing more security vulnerabilities. Prior work has studied how LLMs could be maliciously fine-tuned to suggest vulnerabilities more often. With the rise of agentic LLMs, which may use results from an untrusted third party, there is a growing risk of attacks on the model's prompt. We introduce the Malicious Programming Prompt (MaPP) attack, in which an attacker adds a small amount of text (under 500 bytes) to a prompt for a programming task. We show that our prompt strategy can cause an LLM to add vulnerabilities while continuing to write otherwise correct code. We evaluate three prompts on seven common LLMs, from basic to state-of-the-art commercial models. Using the HumanEval benchmark, we find that our prompts are broadly effective, with no customization required for different LLMs. Furthermore, the LLMs that are best at HumanEval are also best at following our malicious instructions, suggesting that simply scaling language models will not prevent MaPP attacks. Using a dataset of eight CWEs in 16 scenarios, we find that MaPP attacks are also effective at implementing specific and targeted vulnerabilities across a range of models. Our work highlights the need to secure LLM prompts against manipulation, as well as to rigorously audit code generated with the help of LLMs.
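
          To make the threat model concrete, the sketch below shows how attacker-controlled text could be spliced into an otherwise benign programming prompt before it reaches the model. This is a minimal illustration assuming a simple string-concatenation pipeline; the payload placeholder, the build_poisoned_prompt helper, and the example task are hypothetical and are not taken from the paper.

          # Minimal sketch of a MaPP-style prompt injection (illustrative only).
          # All names here are assumptions; per the abstract, the actual attack
          # text is a short (<500 byte) instruction that steers the model toward
          # vulnerable code while it still solves the task correctly.

          # Stand-in for the attacker-controlled text, e.g. delivered through an
          # untrusted third-party result consumed by an agentic assistant.
          MALICIOUS_SUFFIX = "<attacker-controlled text, under 500 bytes>"

          def build_poisoned_prompt(task_prompt: str) -> str:
              """Append attacker-controlled text to a benign programming task."""
              return task_prompt + "\n" + MALICIOUS_SUFFIX

          # Example: a HumanEval-style task prompt, now carrying the injected text.
          benign_prompt = 'def add(a: int, b: int) -> int:\n    """Return a + b."""\n'
          print(build_poisoned_prompt(benign_prompt))

          The defensive implication in the abstract follows directly from this shape of attack: any channel that can append text to the prompt must be treated as untrusted.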


          Author and article information

          Published: 12 July 2024
          arXiv: 2407.11072

          License: http://creativecommons.org/licenses/by/4.0/ (CC BY 4.0)

          Comments: 6 pages, 5 figures, Proceedings of the ICML 2024 Workshop on Trustworthy Multimodal Foundation Models and AI Agents
          Subjects: cs.CR, cs.AI

          Security & Cryptology, Artificial intelligence
