
      Enhanced mechanisms of pooling and channel attention for deep learning feature maps

      research-article
      PeerJ Computer Science
      PeerJ Inc.
      DNNs, Max pooling, Average pooling, FMAPooling, Self-attention, FMAttn


          Abstract

The pooling function is vital for deep neural networks (DNNs). It generalizes the representation of feature maps and progressively reduces their spatial size to lower the network's computational cost. The function is also the basis of attention mechanisms in computer vision. However, pooling is a down-sampling operation: it makes the feature-map representation approximately invariant to small translations by summarizing the statistics of adjacent pixels, and therefore inevitably causes some information loss. In this article, we propose a fused max-average pooling (FMAPooling) operation as well as an improved channel attention mechanism (FMAttn) that exploits the two pooling functions to enhance feature representation in DNNs. Essentially, both methods enhance the multi-level features extracted by max pooling and average pooling, respectively. The effectiveness of the proposals is verified with VGG, ResNet, and MobileNetV2 architectures on CIFAR10/100 and ImageNet100. According to the experimental results, FMAPooling yields up to a 1.63% accuracy improvement over the baseline model, and FMAttn achieves up to a 2.21% accuracy improvement over the previous channel attention mechanism. Furthermore, the proposals are extensible: they can easily be embedded into various DNN models or replace certain structures within them, and the computational overhead they introduce is negligible.
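
The abstract describes FMAPooling and FMAttn only at a high level, so the following PyTorch sketch is one plausible reading rather than the authors' implementation: FMAPooling is rendered as a learnable, per-channel fusion of max and average pooling, and FMAttn as a squeeze-and-excitation-style channel attention block driven by both max- and average-pooled descriptors. The class names mirror the paper's terms, but the mixing weight alpha, the reduction ratio, and the fusion rule itself are assumptions.

    import torch
    import torch.nn as nn

    class FMAPooling(nn.Module):
        # Sketch of fused max-average pooling: a learnable, per-channel
        # convex combination of max pooling and average pooling (assumed
        # fusion rule; the paper's exact formulation may differ).
        def __init__(self, channels, kernel_size=2, stride=2):
            super().__init__()
            self.max_pool = nn.MaxPool2d(kernel_size, stride)
            self.avg_pool = nn.AvgPool2d(kernel_size, stride)
            # One learnable mixing weight per channel, squashed by a sigmoid.
            self.alpha = nn.Parameter(torch.zeros(1, channels, 1, 1))

        def forward(self, x):
            w = torch.sigmoid(self.alpha)
            return w * self.max_pool(x) + (1.0 - w) * self.avg_pool(x)

    class FMAttn(nn.Module):
        # Sketch of channel attention fed by both global max and global
        # average pooling (squeeze-and-excitation style; assumed design).
        def __init__(self, channels, reduction=16):
            super().__init__()
            self.avg_pool = nn.AdaptiveAvgPool2d(1)
            self.max_pool = nn.AdaptiveMaxPool2d(1)
            self.mlp = nn.Sequential(
                nn.Linear(channels, channels // reduction, bias=False),
                nn.ReLU(inplace=True),
                nn.Linear(channels // reduction, channels, bias=False),
            )

        def forward(self, x):
            b, c, _, _ = x.shape
            avg = self.mlp(self.avg_pool(x).view(b, c))
            mx = self.mlp(self.max_pool(x).view(b, c))
            scale = torch.sigmoid(avg + mx).view(b, c, 1, 1)
            return x * scale  # reweight channels; spatial size is unchanged

    x = torch.randn(4, 64, 32, 32)
    print(FMAPooling(64)(x).shape)  # torch.Size([4, 64, 16, 16])
    print(FMAttn(64)(x).shape)      # torch.Size([4, 64, 32, 32])

Both modules are drop-in: FMAPooling can replace a plain pooling layer, and FMAttn can wrap the output of a convolutional block, consistent with the abstract's claim that the proposals embed into existing DNNs with negligible overhead.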


Most cited references (33)


          Attention Is All You Need

The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles, by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.
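
As a quick illustration of the mechanism this reference introduces, here is a minimal scaled dot-product attention in PyTorch, the core operation the Transformer is built from: softmax(QK^T / sqrt(d))V. The tensor shapes below are generic examples, not taken from the paper's full architecture.

    import math
    import torch

    def scaled_dot_product_attention(q, k, v):
        # softmax(Q K^T / sqrt(d)) V, normalized over the key dimension.
        d = q.size(-1)
        scores = q @ k.transpose(-2, -1) / math.sqrt(d)
        return torch.softmax(scores, dim=-1) @ v

    q = k = v = torch.randn(2, 10, 64)  # (batch, sequence length, model dim)
    out = scaled_dot_product_attention(q, k, v)  # shape (2, 10, 64)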

            Going deeper with convolutions


              Squeeze-and-Excitation Networks


                Author and article information

Journal
PeerJ Computer Science (PeerJ Comput Sci; peerj-cs)
PeerJ Inc. (San Diego, USA)
ISSN: 2376-5992
Published: 21 November 2022
Volume 8: e1161
Affiliations
[1] Graduate School of Science and Engineering, Ritsumeikan University, Kusatsu, Shiga, Japan
[2] College of Science and Engineering, Ritsumeikan University, Kusatsu, Shiga, Japan
Article
cs-1161
DOI: 10.7717/peerj-cs.1161
PMCID: 9748832
383c9a8d-6214-4ac2-9d22-9d95650c7199
                © 2022 Li et al.

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.

History
Received: 11 August 2022
Accepted: 26 October 2022
Funding
The authors received no funding for this work.
                Categories
                Artificial Intelligence
                Computer Vision
                Data Science
                Visual Analytics
                Neural Networks

