      • Record: found
      • Abstract: found
      • Article: found
      Is Open Access

      The coefficient of determination R-squared is more informative than SMAPE, MAE, MAPE, MSE and RMSE in regression analysis evaluation

      research-article


          Abstract

Regression analysis makes up a large part of supervised machine learning, and consists of the prediction of a continuous dependent target from a set of other predictor variables. The difference between binary classification and regression is in the target range: in binary classification, the target can have only two values (usually encoded as 0 and 1), while in regression the target can have multiple values. Even though regression analysis has been employed in a huge number of machine learning studies, no consensus has been reached on a single, unified, standard metric to assess the results of the regression itself. Many studies employ the mean square error (MSE) and its rooted variant (RMSE), or the mean absolute error (MAE) and its percentage variant (MAPE). Although useful, these rates share a common drawback: since their values can range between zero and +infinity, a single value of them does not say much about the performance of the regression with respect to the distribution of the ground truth elements. In this study, we focus on two rates that actually generate a high score only if the majority of the elements of a ground truth group have been correctly predicted: the coefficient of determination (also known as R-squared or R²) and the symmetric mean absolute percentage error (SMAPE). After showing their mathematical properties, we report a comparison between R² and SMAPE in several use cases and in two real medical scenarios. Our results demonstrate that the coefficient of determination (R-squared) is more informative and truthful than SMAPE, and does not have the interpretability limitations of MSE, RMSE, MAE and MAPE. We therefore suggest the usage of R-squared as the standard metric to evaluate regression analyses in any scientific domain.
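As a rough illustration of the rates compared in the abstract, the following minimal NumPy sketch implements common textbook formulations; the array names y_true and y_pred are illustrative, the bounded SMAPE variant shown is only one of several definitions in circulation, and this is not the paper's own code.

```python
# Minimal sketch of the regression rates discussed above (textbook formulations;
# y_true / y_pred are illustrative names, not the paper's code).
import numpy as np

def regression_metrics(y_true, y_pred):
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_true - y_pred

    mse = np.mean(err ** 2)               # mean square error, range [0, +inf)
    rmse = np.sqrt(mse)                   # root mean square error, range [0, +inf)
    mae = np.mean(np.abs(err))            # mean absolute error, range [0, +inf)
    mape = np.mean(np.abs(err / y_true))  # mean absolute percentage error (requires y_true != 0)
    # one common SMAPE formulation, bounded in [0, 2]
    smape = np.mean(2.0 * np.abs(err) / (np.abs(y_true) + np.abs(y_pred)))
    # coefficient of determination: 1 - (residual sum of squares / total sum of squares)
    r2 = 1.0 - np.sum(err ** 2) / np.sum((y_true - np.mean(y_true)) ** 2)

    return {"MSE": mse, "RMSE": rmse, "MAE": mae,
            "MAPE": mape, "SMAPE": smape, "R2": r2}

# Example: predictions close to the ground truth yield R2 near 1.
print(regression_metrics([3.0, 5.0, 7.5, 10.0], [2.8, 5.3, 7.1, 9.8]))
```

On such an example R² lands near 1, whereas the error-based rates alone give little sense of quality without knowing the scale and distribution of the target, which is the abstract's central point.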

          Related collections

Most cited references: 97

          • Record: found
          • Abstract: found
          • Article: found
          Is Open Access

          The advantages of the Matthews correlation coefficient (MCC) over F1 score and accuracy in binary classification evaluation

Background: To evaluate binary classifications and their confusion matrices, scientific researchers can employ several statistical rates, according to the goal of the experiment they are investigating. Despite being a crucial issue in machine learning, no widespread consensus has been reached on a unified, elective measure yet. Accuracy and F1 score computed on confusion matrices have been (and still are) among the most popular adopted metrics in binary classification tasks. However, these statistical measures can dangerously show overoptimistic inflated results, especially on imbalanced datasets. Results: The Matthews correlation coefficient (MCC), instead, is a more reliable statistical rate which produces a high score only if the prediction obtained good results in all of the four confusion matrix categories (true positives, false negatives, true negatives, and false positives), proportionally both to the size of positive elements and the size of negative elements in the dataset. Conclusions: In this article, we show how MCC produces a more informative and truthful score in evaluating binary classifications than accuracy and F1 score, by first explaining its mathematical properties, and then the advantages of MCC in six synthetic use cases and in a real genomics scenario. We believe that the Matthews correlation coefficient should be preferred to accuracy and F1 score in evaluating binary classification tasks by all scientific communities.
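For comparison with the regression case, here is a minimal sketch of the MCC computed directly from the four confusion-matrix counts; the counts in the example are invented to show how an imbalanced dataset can inflate accuracy while the MCC stays low.

```python
# Minimal sketch of the Matthews correlation coefficient from the four
# confusion-matrix counts; the example counts are invented for illustration.
import math

def matthews_corrcoef(tp, tn, fp, fn):
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denominator if denominator != 0 else 0.0  # MCC lies in [-1, +1]

# Imbalanced example: accuracy looks strong (0.91) while MCC is only about 0.30.
tp, tn, fp, fn = 90, 1, 9, 0
accuracy = (tp + tn) / (tp + tn + fp + fn)
print(f"accuracy = {accuracy:.2f}, MCC = {matthews_corrcoef(tp, tn, fp, fn):.2f}")
```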
            • Record: found
            • Abstract: found
            • Article: not found
            Is Open Access

            Root mean square error (RMSE) or mean absolute error (MAE)? – Arguments against avoiding RMSE in the literature

Both the root mean square error (RMSE) and the mean absolute error (MAE) are regularly employed in model evaluation studies. Willmott and Matsuura (2005) have suggested that the RMSE is not a good indicator of average model performance and might be a misleading indicator of average error, and thus the MAE would be a better metric for that purpose. While some concerns over using RMSE raised by Willmott and Matsuura (2005) and Willmott et al. (2009) are valid, the proposed avoidance of RMSE in favor of MAE is not the solution. Citing the aforementioned papers, many researchers chose MAE over RMSE to present their model evaluation statistics when presenting or adding the RMSE measures could be more beneficial. In this technical note, we demonstrate that the RMSE is not ambiguous in its meaning, contrary to what was claimed by Willmott et al. (2009). The RMSE is more appropriate to represent model performance than the MAE when the error distribution is expected to be Gaussian. In addition, we show that the RMSE satisfies the triangle inequality requirement for a distance metric, whereas Willmott et al. (2009) indicated that the sums-of-squares-based statistics do not satisfy this rule. Finally, we discuss some circumstances where using the RMSE will be more beneficial. However, we do not contend that the RMSE is superior to the MAE. Instead, a combination of metrics, including but certainly not limited to RMSEs and MAEs, is often required to assess model performance.
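A small illustration of the RMSE/MAE contrast described above, using purely synthetic error vectors rather than the note's own data: with well-behaved Gaussian errors the two rates stay close, while a single large error moves the RMSE far more than the MAE.

```python
# Toy comparison of RMSE and MAE on synthetic error vectors (not the note's data):
# Gaussian errors keep the two rates close, while one large error inflates RMSE
# far more than MAE; RMSE >= MAE always holds.
import numpy as np

def rmse(errors):
    return float(np.sqrt(np.mean(np.square(errors))))

def mae(errors):
    return float(np.mean(np.abs(errors)))

rng = np.random.default_rng(0)
gaussian_errors = rng.normal(0.0, 1.0, size=10_000)
outlier_errors = np.append(rng.normal(0.0, 1.0, size=999), 50.0)  # one extreme error

for label, errors in [("Gaussian errors", gaussian_errors),
                      ("with one outlier", outlier_errors)]:
    print(f"{label}: MAE = {mae(errors):.2f}, RMSE = {rmse(errors):.2f}")
```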
              • Record: found
              • Abstract: found
              • Article: found
              Is Open Access

              The coefficient of determination R2 and intra-class correlation coefficient from generalized linear mixed-effects models revisited and expanded

The coefficient of determination R² quantifies the proportion of variance explained by a statistical model and is an important summary statistic of biological interest. However, estimating R² for generalized linear mixed models (GLMMs) remains challenging. We have previously introduced a version of R², which we called R²_GLMM, for Poisson and binomial GLMMs, but not for other distributional families. Similarly, we earlier discussed how to estimate intra-class correlation coefficients (ICCs) using Poisson and binomial GLMMs. In this paper, we generalize our methods to all other non-Gaussian distributions, in particular to negative binomial and gamma distributions that are commonly used for modelling biological data. While expanding our approach, we highlight two useful concepts for biologists, Jensen's inequality and the delta method, both of which help us in understanding the properties of GLMMs. Jensen's inequality has important implications for biologically meaningful interpretation of GLMMs, whereas the delta method allows a general derivation of variance associated with non-Gaussian distributions. We also discuss some special considerations for binomial GLMMs with binary or proportion data. We illustrate the implementation of our extension by worked examples from the field of ecology and evolution in the R environment. However, our method can be used across disciplines and regardless of statistical environments.
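The marginal and conditional R² of Nakagawa and colleagues reduce, at their core, to a ratio of variance components. The Python sketch below assumes those components (var_fixed, var_random, var_residual) have already been estimated from a fitted mixed model and uses invented values, so it is only a schematic of the decomposition, not of the non-Gaussian extensions introduced in the paper, whose worked examples are in R.

```python
# Schematic of the marginal/conditional R^2 decomposition behind R^2_GLMM,
# assuming the fixed-effect, random-effect and residual variances have already
# been estimated from a fitted mixed model; the values below are invented.
def r2_glmm(var_fixed, var_random, var_residual):
    total = var_fixed + var_random + var_residual
    r2_marginal = var_fixed / total                    # variance explained by fixed effects
    r2_conditional = (var_fixed + var_random) / total  # fixed plus random effects
    return r2_marginal, r2_conditional

print(r2_glmm(var_fixed=2.0, var_random=1.0, var_residual=3.0))  # (0.333..., 0.5)
```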

                Author and article information

                Contributors
Journal
PeerJ Computer Science (PeerJ Comput Sci, peerj-cs)
Publisher: PeerJ Inc. (San Diego, USA)
ISSN: 2376-5992
Published: 5 July 2021
Volume: 7
Article: e623
Affiliations
[1] Institute of Health Policy, Management and Evaluation, University of Toronto, Toronto, Canada
[2] Groningen Institute for Educational Research, University of Groningen, Groningen, Netherlands
[3] Data Science for Health Unit, Fondazione Bruno Kessler, Trento, Italy
                Author information
                http://orcid.org/0000-0001-9655-7142
                http://orcid.org/0000-0002-7302-640X
                http://orcid.org/0000-0002-2705-5728
Article
cs-623
DOI: 10.7717/peerj-cs.623
PMCID: 8279135
PMID: 34307865
973ddeed-98b0-4779-81ae-bf031b0c1490
                © 2021 Chicco et al.

                This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, reproduction and adaptation in any medium and for any purpose provided that it is properly attributed. For attribution, the original author(s), title, publication source (PeerJ Computer Science) and either DOI or URL of the article must be cited.

History
Received: 26 March 2021
Accepted: 15 June 2021
                Funding
                The authors received no funding for this work.
                Categories
                Data Mining and Machine Learning
                Data Science
                Artificial Intelligence

Keywords: regression, regression evaluation, regression evaluation rates, coefficient of determination, mean square error, mean absolute error, regression analysis
