      Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition

          Abstract

          Existing deep convolutional neural networks (CNNs) require a fixed-size (e.g., 224 × 224) input image. This requirement is "artificial" and may reduce the recognition accuracy for images or sub-images of arbitrary size or scale. In this work, we equip the networks with another pooling strategy, "spatial pyramid pooling", to eliminate this requirement. The new network structure, called SPP-net, can generate a fixed-length representation regardless of image size or scale. Pyramid pooling is also robust to object deformations. With these advantages, SPP-net should in general improve all CNN-based image classification methods. On the ImageNet 2012 dataset, we demonstrate that SPP-net boosts the accuracy of a variety of CNN architectures despite their different designs. On the Pascal VOC 2007 and Caltech101 datasets, SPP-net achieves state-of-the-art classification results using a single full-image representation and no fine-tuning. The power of SPP-net is also significant in object detection. Using SPP-net, we compute the feature maps from the entire image only once, and then pool features in arbitrary regions (sub-images) to generate fixed-length representations for training the detectors. This method avoids repeatedly computing the convolutional features. In processing test images, our method is 24-102× faster than the R-CNN method, while achieving better or comparable accuracy on Pascal VOC 2007. In the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) 2014, our methods rank #2 in object detection and #3 in image classification among all 38 teams. This manuscript also introduces the improvements made for this competition.
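
          The two ideas in the abstract, a fixed-length output for any input size and pooling arbitrary regions from a feature map computed once, can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. The pyramid levels (1, 2, 4), the floor/ceil bin boundaries, the use of max pooling, and the names `spatial_pyramid_pool` and `pool_region` are all assumptions made for illustration, and region boxes are assumed to be already expressed in feature-map coordinates.

```python
import math


def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a feature map given as nested lists [channel][row][col].

    Each level n splits the map into an n x n grid of bins and keeps the
    maximum of every bin, so the output length is
    channels * sum(n * n for n in levels), whatever the map's height/width.
    The levels and the floor/ceil bin boundaries are illustrative assumptions.
    """
    pooled = []
    for n in levels:
        for channel in feature_map:
            h, w = len(channel), len(channel[0])
            for i in range(n):
                # Rows covered by bin i; max() keeps every bin non-empty.
                r0 = (i * h) // n
                r1 = max(r0 + 1, math.ceil((i + 1) * h / n))
                for j in range(n):
                    c0 = (j * w) // n
                    c1 = max(c0 + 1, math.ceil((j + 1) * w / n))
                    pooled.append(max(channel[r][c]
                                      for r in range(r0, r1)
                                      for c in range(c0, c1)))
    return pooled


def pool_region(feature_map, box, levels=(1, 2, 4)):
    """Pool one region of an already-computed feature map.

    box = (row0, col0, row1, col1), end-exclusive, in feature-map
    coordinates (a hypothetical convention for this sketch).
    """
    row0, col0, row1, col1 = box
    window = [[row[col0:col1] for row in channel[row0:row1]]
              for channel in feature_map]
    return spatial_pyramid_pool(window, levels)


def make_map(channels, height, width):
    """Build a toy feature map with distinct values (stand-in for conv output)."""
    return [[[float(c * 1000 + r * width + col) for col in range(width)]
             for r in range(height)]
            for c in range(channels)]


# Two "images" of different sizes give vectors of the same length.
small = spatial_pyramid_pool(make_map(2, 5, 7))
large = spatial_pyramid_pool(make_map(2, 13, 9))

# One feature map, computed once, serves several candidate regions.
shared = make_map(2, 13, 9)
regions = [(0, 0, 6, 4), (3, 2, 13, 9), (5, 5, 7, 6)]
region_vectors = [pool_region(shared, box) for box in regions]

print(len(small), len(large), [len(v) for v in region_vectors])
```

          With two channels and levels (1, 2, 4), every vector has 2 × (1 + 4 + 16) = 42 entries, including the one for the 2 × 1 region, where several bins overlap on the same cells. In a real detector the fixed-length vectors would feed fully connected layers, and the boxes would first have to be mapped from image coordinates to feature-map coordinates; neither step is shown here.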

          Author and article information

          Journal
          IEEE Transactions on Pattern Analysis and Machine Intelligence
          IEEE Trans. Pattern Anal. Mach. Intell.
          Institute of Electrical and Electronics Engineers (IEEE)
          0162-8828
          2160-9292
          September 1 2015
          Volume: 37
          Issue: 9
          Pages: 1904-1916
          Article
          10.1109/TPAMI.2015.2389824
          26353135
          f2e3630d-1b80-4ea8-9ebb-cc42bc06b0e3
          © 2015

          https://ieeexplore.ieee.org/Xplorehelp/downloads/license-information/IEEE.html
