
      Flatten-T Swish: a thresholded ReLU-Swish-like activation function for deep learning

      Preprint


          Abstract

Activation functions are essential for deep learning methods to learn and perform complex tasks such as image classification. The Rectified Linear Unit (ReLU) has been widely used and has become the default activation function across the deep learning community since 2012. Despite its popularity, however, the hard-zero property of ReLU blocks negative values from propagating through the network, so deep neural networks cannot benefit from negative representations. In this work, an activation function called Flatten-T Swish (FTS), which leverages negative values, is proposed. To verify its performance, this study evaluates FTS against ReLU and several recent activation functions. Each activation function is trained on the MNIST dataset using five different deep fully connected neural networks (DFNNs) with depths varying from five to eight layers. For a fair evaluation, all DFNNs use the same configuration settings. Based on the experimental results, FTS with a threshold value of T=-0.20 has the best overall performance. Compared with ReLU, FTS (T=-0.20) improves MNIST classification accuracy by 0.13%, 0.70%, 0.67%, 1.07% and 1.15% on the wider 5-layer, slimmer 5-layer, 6-layer, 7-layer and 8-layer DFNNs respectively. The study also observed that FTS converges twice as fast as ReLU. Although other existing activation functions are also evaluated, this study elects ReLU as the baseline activation function.
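Since the abstract describes FTS only informally, a minimal NumPy sketch may help. It assumes the piecewise definition reported in the preprint, FTS(x) = x·sigmoid(x) + T for x ≥ 0 and FTS(x) = T otherwise, with the best-performing threshold T = -0.20 from the experiments above:

```python
import numpy as np

def flatten_t_swish(x, T=-0.20):
    """Flatten-T Swish (FTS) activation (sketch).

    For x >= 0 it behaves like Swish (x * sigmoid(x)) shifted by the
    threshold T; for x < 0 it flattens to the constant T, so negative
    inputs still produce a (small, constant) negative output instead
    of the hard zero of ReLU. T = -0.20 is the best-performing value
    reported in the abstract.
    """
    sigmoid = 1.0 / (1.0 + np.exp(-x))
    return np.where(x >= 0, x * sigmoid + T, T)

# Quick check: constant T on the negative side, shifted Swish on the
# positive side.
x = np.linspace(-3.0, 3.0, 7)
print(flatten_t_swish(x))
```

With T = 0 this reduces to a rectified Swish, which is one way to read the "thresholded ReLU-Swish-like" description in the title.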


Most cited references (10)


          Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification

          Rectified activation units (rectifiers) are essential for state-of-the-art neural networks. In this work, we study rectifier neural networks for image classification from two aspects. First, we propose a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit. PReLU improves model fitting with nearly zero extra computational cost and little overfitting risk. Second, we derive a robust initialization method that particularly considers the rectifier nonlinearities. This method enables us to train extremely deep rectified models directly from scratch and to investigate deeper or wider network architectures. Based on our PReLU networks (PReLU-nets), we achieve 4.94% top-5 test error on the ImageNet 2012 classification dataset. This is a 26% relative improvement over the ILSVRC 2014 winner (GoogLeNet, 6.66%). To our knowledge, our result is the first to surpass human-level performance (5.1%, Russakovsky et al.) on this visual recognition challenge.
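For context, the PReLU described in this reference replaces ReLU's hard zero with a learned slope on the negative side. A minimal sketch follows; the fixed slope a = 0.25 and the standalone function are illustrative simplifications, since in the paper the slope is a learnable (per-channel) parameter trained jointly with the network:

```python
import numpy as np

def prelu(x, a=0.25):
    """Parametric ReLU (sketch): identity for x >= 0, slope `a` for x < 0.

    `a` is fixed here for illustration; PReLU proper learns it during
    training, at nearly zero extra computational cost.
    """
    return np.where(x >= 0, x, a * x)

print(prelu(np.array([-2.0, -0.5, 0.0, 1.5])))
```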

            What is the best multi-stage architecture for object recognition?


              Deep learning massively accelerates super-resolution localization microscopy


                Author and article information

Journal reference: International Journal of Advances in Intelligent Informatics, 4(2), 76-86
Published: 15 December 2018
DOI: 10.26555/ijain.v4i2.249
arXiv: 1812.06247
License: http://creativecommons.org/licenses/by-nc-sa/4.0/

Categories: cs.NE, cs.LG, stat.ML
Keywords: Machine learning, Neural & Evolutionary computing, Artificial intelligence
