      Automatic Compiler Based FPGA Accelerator for CNN Training

      Preprint


          Abstract

          Training of convolutional neural networks (CNNs) on embedded platforms to support on-device learning is gaining vital importance in recent days. Designing flexible training hardware is much more challenging than inference hardware, due to design complexity and large computation/memory requirements. In this work, we present an automatic compiler-based FPGA accelerator with 16-bit fixed-point precision for complete CNN training, including Forward Pass (FP), Backward Pass (BP) and Weight Update (WU). We implemented an optimized RTL library to perform training-specific tasks and developed an RTL compiler to automatically generate FPGA-synthesizable RTL based on user-defined constraints. We present a new cyclic weight storage/access scheme for on-chip BRAM and off-chip DRAM to efficiently implement non-transpose and transpose operations during the FP and BP phases, respectively. Representative CNNs for the CIFAR-10 dataset are implemented and trained on an Intel Stratix 10-GX FPGA using the proposed hardware architecture, demonstrating up to 479 GOPS performance.
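          To make the abstract's access-pattern claim concrete, the minimal NumPy sketch below (not the paper's RTL, and using a fully-connected layer plus an assumed Q8.8 fixed-point split purely for illustration; the paper only states 16-bit precision) shows why training, unlike inference, reads the same weight matrix both non-transposed (FP) and transposed (BP), which is the dual access pattern the cyclic weight storage scheme is designed to serve.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))   # weights: 4 outputs, 3 inputs
x = rng.standard_normal(3)        # input activations
dy = rng.standard_normal(4)       # gradient arriving from the next layer

y = W @ x             # Forward Pass (FP): reads W non-transposed
dx = W.T @ dy         # Backward Pass (BP): reads W transposed
dW = np.outer(dy, x)  # Weight Update (WU): outer product of dy and x

# 16-bit fixed-point storage (assumed Q8.8 split for illustration):
# quantize weights to int16, then dequantize back to floats.
scale = 1 << 8
Wq = np.clip(np.round(W * scale), -32768, 32767).astype(np.int16)
W_deq = Wq.astype(np.float64) / scale
```

An inference-only accelerator could bake in one fixed weight layout; because FP and BP interleave during training, the storage scheme must make both orientations cheap.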


                Author and article information

                Date: 15 August 2019
                Article: 1908.06724
                License: http://creativecommons.org/licenses/by/4.0/
                Custom metadata: 6 pages, 9 figures, paper accepted at FPL2019 conference
                Categories: cs.LG cs.NE eess.SP
                Subjects: Neural & Evolutionary computing, Artificial intelligence, Electrical engineering
