      Is Open Access

      Dataset decay and the problem of sequential analyses on open datasets

      research-article


          Abstract

          Open data allows researchers to explore pre-existing datasets in new ways. However, if many researchers reuse the same dataset, multiple statistical testing may increase false positives. Here we demonstrate that sequential hypothesis testing on the same dataset by multiple researchers can inflate error rates. We go on to discuss a number of correction procedures that can reduce the number of false positives, and the challenges associated with these correction procedures.
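          The inflation described above can be illustrated with a back-of-the-envelope calculation. The sketch below is not the authors' simulation. It assumes that every null hypothesis is true and that the tests are statistically independent, which is a simplification, since analyses of one shared dataset are usually correlated. The function names are hypothetical, and Bonferroni is used only as a familiar example of a correction; the abstract does not name specific procedures.

```python
"""Illustrative sketch: error inflation when many tests reuse one dataset.

Assumptions (not from the article): every null hypothesis is true and the
tests are statistically independent. Tests on a shared dataset are usually
correlated, so treat these numbers as a rough guide only.
"""


def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false positive across num_tests tests."""
    if num_tests < 1:
        raise ValueError("num_tests must be at least 1")
    # Each test avoids a false positive with probability (1 - alpha);
    # under independence, all of them do so with (1 - alpha) ** num_tests.
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_threshold(num_tests: int, alpha: float = 0.05) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    if num_tests < 1:
        raise ValueError("num_tests must be at least 1")
    return alpha / num_tests


if __name__ == "__main__":
    for k in (1, 5, 10, 50):
        uncorrected = familywise_error_rate(k)
        corrected = familywise_error_rate(k, bonferroni_threshold(k))
        print(f"{k:>3} tests: uncorrected {uncorrected:.3f}, "
              f"Bonferroni {corrected:.3f}")
```

          One practical difficulty with sequential reuse is that the total number of tests is rarely known in advance, so a fixed divisor such as the one in `bonferroni_threshold` cannot simply be chosen once when the dataset is released.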


                Author and article information

                Contributors
                Role: Reviewing Editor
                Role: Senior Editor
                Journal
                eLife
                eLife Sciences Publications, Ltd
                ISSN: 2050-084X
                19 May 2020
                2020; 9: e53498
                Affiliations
                [1] Department of Psychology, Stanford University, Stanford, United States
                [2] Department of Clinical Neuroscience, Karolinska Institutet, Stockholm, Sweden
                [3] Department of Philosophy, Stanford University, Stanford, United States
                National Institute of Mental Health, National Institutes of Health, United States
                eLife, United Kingdom
                National Institute of Mental Health, National Institutes of Health, United States
                University of Glasgow, United Kingdom
                Author information
                https://orcid.org/0000-0002-0533-6035
                https://orcid.org/0000-0001-5003-0572
                https://orcid.org/0000-0001-6755-0259
                Article
                53498
                DOI: 10.7554/eLife.53498
                PMCID: PMC7237204
                PMID: 32425159
                ac6446e7-8d41-434b-a35f-5b0289a32340
                © 2020, Thompson et al

                This article is distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use and redistribution provided that the original author and source are credited.

                History
                Received: 12 November 2019
                Accepted: 04 May 2020
                Funding
                Funded by: Knut och Alice Wallenbergs Stiftelse (FundRef http://dx.doi.org/10.13039/501100004063)
                Award ID: 2016.0473
                The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.
                Categories
                Feature Article
                Computational and Systems Biology
                Neuroscience
                Meta-Research
                Custom metadata
                Open data provides an opportunity to perform new analyses on preexisting data, but trade-offs are required to limit an increase in false positives.

                Life sciences
                open data, sequential testing, multiple comparisons, multiple comparison correction, meta-research, human
