      Is Open Access

      A solution to minimum sample size for regressions

      PLOS ONE
      Public Library of Science (PLoS)


          Abstract

          Regressions and meta-regressions are widely used to estimate patterns and effect sizes in various disciplines. However, many biological and medical analyses use relatively small sample sizes (N), contributing to concerns about reproducibility. What is the minimum N needed to identify the most plausible data pattern using regressions? Statistical power analysis is often used to answer that question, but it has its own problems and logically should follow model selection, which first identifies the most plausible model. Here we generate null, simple linear, and quadratic data with different variances and effect sizes. We then sample those data and use information-theoretic model selection to evaluate the minimum N for regression models. We also evaluate the use of the coefficient of determination (R²) for this purpose; it is widely used but not recommended. With very low variance, both false positives and false negatives occurred at N < 8, but data shape was always clearly identified at N ≥ 8. With high variance, accurate inference was stable at N ≥ 25. Those outcomes were consistent at different effect sizes. Akaike Information Criterion weights (AICc w_i) were essential to clearly identify patterns (e.g., simple linear vs. null); R² or adjusted R² values were not useful. We conclude that a minimum N = 8 is informative given very little variance, but a minimum N ≥ 25 is required for more variance. Alternative models are better compared using information theory indices such as AIC than using R² or adjusted R². Insufficient N and R²-based model selection apparently contribute to confusion and low reproducibility in various disciplines. To avoid those problems, we recommend that research based on regressions or meta-regressions use N ≥ 25.
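
          The comparison described in the abstract (null vs. simple linear vs. quadratic, ranked by AICc weights) can be illustrated with a minimal sketch. This is not the authors' code. It assumes ordinary least-squares fits with Gaussian errors, the common least-squares form AIC = n·ln(RSS/n) + 2k with k counting the regression coefficients plus the error variance, and made-up simulation settings (slope 0.5, error SD 1, N = 25). The function names `polyfit_rss`, `aicc`, `akaike_weights`, and `model_weights` are hypothetical helpers introduced only for illustration.

```python
import math
import random


def polyfit_rss(x, y, degree):
    """Residual sum of squares of a least-squares polynomial fit."""
    m = degree + 1
    # Augmented normal equations [X^T X | X^T y] for columns 1, x, x^2, ...
    a = [[sum(xi ** (r + c) for xi in x) for c in range(m)]
         + [sum(yi * xi ** r for xi, yi in zip(x, y))]
         for r in range(m)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(m):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [v - f * p for v, p in zip(a[r], a[col])]
    beta = [a[i][m] / a[i][i] for i in range(m)]
    return sum((yi - sum(b * xi ** j for j, b in enumerate(beta))) ** 2
               for xi, yi in zip(x, y))


def aicc(rss, n, degree):
    """Small-sample corrected AIC; requires n > k + 1."""
    k = degree + 2  # polynomial coefficients plus the error variance
    return n * math.log(rss / n) + 2 * k + 2 * k * (k + 1) / (n - k - 1)


def akaike_weights(scores):
    """Convert AICc scores into weights that sum to 1."""
    best = min(scores)
    rel = [math.exp(-0.5 * (s - best)) for s in scores]
    total = sum(rel)
    return [r / total for r in rel]


def model_weights(x, y):
    """AICc weights for the null, simple linear, and quadratic models."""
    n = len(y)
    scores = [aicc(polyfit_rss(x, y, d), n, d) for d in range(3)]
    return dict(zip(("null", "linear", "quadratic"), akaike_weights(scores)))


# Assumed example data: a linear trend with Gaussian noise, N = 25.
random.seed(0)
n = 25
x = [10.0 * i / (n - 1) for i in range(n)]
y = [1.0 + 0.5 * xi + random.gauss(0.0, 1.0) for xi in x]
weights = model_weights(x, y)
print(weights)
```

          A weight close to 1 for one model indicates clear support for that data shape, while weights spread across models indicate that N or the signal is too small to discriminate among them. Repeating the last block over many random samples and several values of `n` would give a rough picture of how inference stabilizes as N grows, under the assumptions stated above.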


                Author and article information

                Journal: PLOS ONE
                Publisher: Public Library of Science (PLoS)
                ISSN: 1932-6203
                Published: February 21, 2020
                Volume 15, Issue 2, Article e0229345
                DOI: 10.1371/journal.pone.0229345
                © 2020
                License: http://creativecommons.org/licenses/by/4.0/
