      Big data need big theory too

      review-article


          Abstract

          The current interest in big data, machine learning and data analytics has generated the widespread impression that such methods are capable of solving most problems without the need for conventional scientific methods of inquiry. Interest in these methods is intensifying, accelerated by the ease with which digitized data can be acquired in virtually all fields of endeavour, from science, healthcare and cybersecurity to economics, social sciences and the humanities. In multiscale modelling, machine learning appears to provide a shortcut for revealing correlations of arbitrary complexity between processes at the atomic, molecular, meso- and macroscales. Here, we point out the weaknesses of pure big data approaches, with particular focus on biology and medicine; such approaches fail to provide conceptual accounts of the processes to which they are applied. However ‘deep’ and sophisticated data-driven methods such as artificial neural nets may be, in the end they merely fit curves to existing data. Not only do these methods invariably require far larger quantities of data than anticipated by big data aficionados in order to produce statistically reliable results, but they can also fail in circumstances beyond the range of the data used to train them, because they are not designed to model the structural characteristics of the underlying system. We argue that it is vital to use theory as a guide to experimental design for maximal efficiency of data collection and to produce reliable predictive models and conceptual knowledge. Rather than continuing to fund, pursue and promote ‘blind’ big data projects with massive budgets, we call for more funding to be allocated to the elucidation of the multiscale and stochastic processes controlling the behaviour of complex systems, including those of life, medicine and healthcare.

          This article is part of the themed issue ‘Multiscale modelling at the physics–chemistry–biology interface’.
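
The abstract's claim that curve fitting can fail beyond the range of its training data can be illustrated with a toy sketch. The example below is not taken from the article: the sine target, the polynomial degree and the function name `extrapolation_errors` are all assumptions chosen for illustration. It fits a polynomial to one period of a sine wave and compares the worst-case error inside the training range with the error over the following period.

```python
import numpy as np


def extrapolation_errors(degree=5, n_train=50):
    """Fit a polynomial to sin(x) on [0, 2*pi]; return max errors inside and outside that range."""
    # Training data: noise-free samples from one period of a sine wave (illustrative choice).
    x_train = np.linspace(0.0, 2.0 * np.pi, n_train)
    coeffs = np.polyfit(x_train, np.sin(x_train), degree)

    # Interpolation: points inside the range the fit was trained on.
    x_in = np.linspace(0.0, 2.0 * np.pi, 200)
    in_err = np.max(np.abs(np.polyval(coeffs, x_in) - np.sin(x_in)))

    # Extrapolation: the next period, which the fit never saw.
    x_out = np.linspace(2.0 * np.pi, 4.0 * np.pi, 200)
    out_err = np.max(np.abs(np.polyval(coeffs, x_out) - np.sin(x_out)))
    return in_err, out_err


if __name__ == "__main__":
    in_err, out_err = extrapolation_errors()
    print(f"max error inside training range:  {in_err:.3g}")
    print(f"max error outside training range: {out_err:.3g}")
```

The polynomial has no notion that the signal is periodic, so its error grows rapidly once it leaves the sampled interval. A model that encodes that structure, for example y = A sin(ωx + φ), would not have this problem, which is one simple sense in which theory constrains prediction.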


                Author and article information

                Journal
                Philosophical transactions. Series A, Mathematical, physical, and engineering sciences (Philos Trans A Math Phys Eng Sci)
                Journal IDs: RSTA, roypta
                Publisher: The Royal Society
                ISSN: 1364-503X, 1471-2962
                Published: 13 November 2016
                Volume: 374
                Issue: 2080, Theme issue ‘Multiscale modelling at the physics–chemistry–biology interface’ compiled and edited by P. V. Coveney, J. P. Boon and S. Succi
                Article number: 20160153
                Affiliations
                [1] Centre for Computational Science, University College London, Gordon Street, London WC1H 0AJ, UK
                [2] Center for Bioinformatics and Genomic Systems Engineering, Texas A&M University, College Station, TX 77843-31283, USA
                [3] Science Museum, Exhibition Road, London SW7 2DD, UK
                Author information
                http://orcid.org/0000-0002-8787-7256
                Article
                Publisher ID: rsta20160153
                DOI: 10.1098/rsta.2016.0153
                PMC ID: 5052735
                PMID: 27698035
                a047098a-98b0-47cd-b4fc-5c11ff20b9f2
                © 2015 The Authors.

                Published by the Royal Society under the terms of the Creative Commons Attribution License http://creativecommons.org/licenses/by/4.0/, which permits unrestricted use, provided the original author and source are credited.

                History
                : 17 June 2016
                Funding
                Funded by: EPSRC for its support via the 2020 Science Programme; Award ID: EP/I017909/1
                Funded by: Qatar National Research Fund; Award ID: 7-1083-1-191
                Funded by: MRC for a Medical Bioinformatics; Award ID: MR/L016311/1
                Categories
                Articles
                Opinion Piece

                Keywords: machine learning, big data, personalized medicine, biomedicine, epistemology
