      Automatic Compiler Based FPGA Accelerator for CNN Training

      Preprint


          Abstract

          Training of convolutional neural networks (CNNs) on embedded platforms to support on-device learning is gaining vital importance in recent days. Designing flexible training hardware is much more challenging than inference hardware, due to design complexity and large computation/memory requirements. In this work, we present an automatic compiler-based FPGA accelerator with 16-bit fixed-point precision for complete CNN training, including Forward Pass (FP), Backward Pass (BP) and Weight Update (WU). We implemented an optimized RTL library to perform training-specific tasks and developed an RTL compiler to automatically generate FPGA-synthesizable RTL based on user-defined constraints. We present a new cyclic weight storage/access scheme for on-chip BRAM and off-chip DRAM to efficiently implement non-transpose and transpose operations during the FP and BP phases, respectively. Representative CNNs for the CIFAR-10 dataset are implemented and trained on an Intel Stratix 10-GX FPGA using the proposed hardware architecture, demonstrating up to 479 GOPS performance.
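          To make the abstract's access-pattern claim concrete, the minimal NumPy sketch below (not the paper's RTL, and using a fully-connected layer plus an assumed Q8.8 fixed-point split purely for illustration; the paper only states 16-bit precision) shows why training, unlike inference, reads the same weight matrix both non-transposed (FP) and transposed (BP), which is the dual access pattern the cyclic weight storage scheme is designed to serve.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))   # weights: 4 outputs, 3 inputs
x = rng.standard_normal(3)        # input activations
dy = rng.standard_normal(4)       # gradient arriving from the next layer

y = W @ x             # Forward Pass (FP): reads W non-transposed
dx = W.T @ dy         # Backward Pass (BP): reads W transposed
dW = np.outer(dy, x)  # Weight Update (WU): outer product of dy and x

# 16-bit fixed-point storage (assumed Q8.8 split for illustration):
# quantize weights to int16, then dequantize back to floats.
scale = 1 << 8
Wq = np.clip(np.round(W * scale), -32768, 32767).astype(np.int16)
W_deq = Wq.astype(np.float64) / scale
```

An inference-only accelerator could bake in one fixed weight layout; because FP and BP interleave during training, the storage scheme must make both orientations cheap.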


                Author and article information

                Date: 15 August 2019
                Article: 1908.06724
                License: http://creativecommons.org/licenses/by/4.0/
                Custom metadata: 6 pages, 9 figures, paper accepted at FPL2019 conference
                Categories: cs.LG cs.NE eess.SP
                Subjects: Neural & Evolutionary computing, Artificial intelligence, Electrical engineering
