      Is Open Access

      Quark: An Integer RISC-V Vector Processor for Sub-Byte Quantized DNN Inference

      Preprint


          Abstract

          In this paper, we present Quark, an integer RISC-V vector processor specifically tailored for sub-byte DNN inference. Quark is implemented in GlobalFoundries' 22FDX FD-SOI technology. It is designed on top of Ara, an open-source 64-bit RISC-V vector processor. To accommodate sub-byte DNN inference, Quark extends Ara by adding specialized vector instructions to perform sub-byte quantized operations. We also remove the floating-point unit from Quark's lanes and use the CVA6 RISC-V scalar core for the re-scaling operations that are required in quantized neural network inference. This makes each lane of Quark 2 times smaller and 1.9 times more power efficient than those of Ara. In this paper, we show that Quark can run quantized models at sub-byte precision. Notably, we show that for 1-bit and 2-bit quantized models, Quark can accelerate the computation of Conv2d over various ranges of input and kernel sizes.
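
          The abstract does not spell out the semantics of the added vector instructions, so the following is only an illustrative sketch of one generic way to compute a sub-byte dot product with integer-only logic (a bit-serial formulation using AND and popcount over bit planes), followed by a separate floating-point re-scaling step of the kind the abstract assigns to the scalar core. The function names (`pack_bitplanes`, `bitserial_dot`, `rescale`), the unsigned encoding, and the per-tensor scale factors are all assumptions made for illustration; they are not taken from Quark or Ara.

```python
def pack_bitplanes(values, bits):
    """Split unsigned `bits`-wide integers into `bits` bit planes.

    Plane b is a Python int whose i-th bit is bit b of values[i].
    (Hypothetical helper; the packing layout is an assumption.)
    """
    planes = []
    for b in range(bits):
        plane = 0
        for i, v in enumerate(values):
            plane |= ((v >> b) & 1) << i
        planes.append(plane)
    return planes


def bitserial_dot(activations, weights, bits):
    """Integer dot product of two unsigned sub-byte vectors.

    Uses sum_i a_i * w_i = sum_{p,q} 2^(p+q) * popcount(A_p & W_q),
    where A_p and W_q are the bit planes of the two operands.
    """
    a_planes = pack_bitplanes(activations, bits)
    w_planes = pack_bitplanes(weights, bits)
    acc = 0
    for p, a_plane in enumerate(a_planes):
        for q, w_plane in enumerate(w_planes):
            # popcount of the AND counts positions where both bits are set
            acc += bin(a_plane & w_plane).count("1") << (p + q)
    return acc


def rescale(acc, scale_a, scale_w):
    """Map the integer accumulator back to a real value.

    Assumes simple per-tensor scale factors; this is the only
    floating-point step in the sketch.
    """
    return acc * scale_a * scale_w


# 2-bit example: values are in the range 0..3
a = [3, 1, 0, 2]
w = [1, 2, 3, 1]
acc = bitserial_dot(a, w, bits=2)
result = rescale(acc, 0.5, 0.25)
```

          In this sketch the accumulation uses only integer AND, popcount, shift, and add, and the single floating-point multiplication happens once per output. That is consistent with the split the abstract describes between integer-only lanes and a scalar core that handles re-scaling, though the actual instruction set may work differently.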


          Author and article information

          12 February 2023
          Article
          2302.05996
          3d27f0d6-6537-49c7-9390-8deddeb039bf

          http://creativecommons.org/licenses/by/4.0/

          History
          Custom metadata
          5 pages. Accepted for publication in the 56th International Symposium on Circuits and Systems (ISCAS 2023)
          cs.AR

          Hardware architecture
