      Is Open Access

      Bringing Code to Data: Do Not Forget Governance

      research-article
      Christine Suver, PhD 1; Adrian Thorogood, BCL, LLB 2; Megan Doerr, MSc, LGC 1; John Wilbanks, BA 1; Bartha Knoppers, PhD, ADE 2
      Journal of Medical Internet Research
      JMIR Publications
      data management, privacy, ethics, research, data science, machine learning


          Abstract

          Developing or independently evaluating algorithms in biomedical research is difficult because of restrictions on access to clinical data. Access is restricted because of privacy concerns, the proprietary treatment of data by institutions (fueled in part by the cost of data hosting, curation, and distribution), concerns over misuse, and the complexities of applicable regulatory frameworks. The use of cloud technology and services can address many of the barriers to data sharing. For example, researchers can access data in high-performance, secure, and auditable cloud computing environments without the need for copying or downloading. An alternative path to accessing data sets requiring additional protection is the model-to-data approach. In model-to-data, researchers submit algorithms to run on secure data sets that remain hidden. Model-to-data is designed to enhance security and local control while enabling communities of researchers to generate new knowledge from sequestered data. Model-to-data has not yet been widely implemented, but pilots have demonstrated its utility when technical or legal constraints preclude other methods of sharing. We argue that model-to-data can make a valuable addition to our data-sharing arsenal, with 2 caveats. First, model-to-data should only be adopted where necessary to supplement rather than replace existing data-sharing approaches given that it requires significant resource commitments from data stewards and limits scientific freedom, reproducibility, and scalability. Second, although model-to-data reduces concerns over data privacy and loss of local control when sharing clinical data, it is not an ethical panacea. Data stewards will remain hesitant to adopt model-to-data approaches without guidance on how to do so responsibly. To address this gap, we explored how commitments to open science, reproducibility, security, respect for data subjects, and research ethics oversight must be re-evaluated in a model-to-data context.
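
The model-to-data flow described in the abstract (researchers submit an algorithm, the steward runs it on hidden data, and only results come back) can be illustrated with a minimal Python sketch. Everything here is hypothetical and not taken from the article: the names `DataSteward`, `EvaluationReport`, and `threshold_model`, the toy records, and the choice of accuracy as the returned metric are all assumptions made for illustration. An in-process call like this does not isolate untrusted code; a real deployment would presumably need sandboxing, output review, and the governance safeguards the article discusses.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

# Hypothetical types for illustration; none of these names come from the article.
Record = Dict[str, float]
Model = Callable[[Record], int]  # maps one record's features to a predicted label


@dataclass(frozen=True)
class EvaluationReport:
    """The only artifact that leaves the steward's environment."""
    n_records: int
    accuracy: float


class DataSteward:
    """Holds the sequestered data set and runs submitted models against it."""

    def __init__(self, records: Sequence[Record], labels: Sequence[int]) -> None:
        if len(records) != len(labels):
            raise ValueError("records and labels must have the same length")
        if not records:
            raise ValueError("the data set must not be empty")
        self._records = list(records)   # never returned to the researcher
        self._labels = list(labels)
        self.audit_log: List[str] = []  # simple record of who ran what

    def evaluate(self, researcher: str, model: Model) -> EvaluationReport:
        # Pass each record as a copy so the model cannot mutate stored data.
        predictions = [model(dict(r)) for r in self._records]
        correct = sum(int(p == y) for p, y in zip(predictions, self._labels))
        n = len(self._labels)
        self.audit_log.append(f"{researcher} evaluated a model on {n} records")
        # Only aggregate results are released, never row-level data.
        return EvaluationReport(n_records=n, accuracy=correct / n)


def threshold_model(record: Record) -> int:
    """Toy researcher-submitted model: flag records with a high score."""
    return 1 if record["score"] >= 0.5 else 0


steward = DataSteward(
    records=[{"score": 0.9}, {"score": 0.2}, {"score": 0.7}, {"score": 0.4}],
    labels=[1, 0, 0, 0],
)
report = steward.evaluate("researcher-a", threshold_model)
print(report)
```

In this sketch the researcher only ever sees the `EvaluationReport`, while the steward keeps both the data and an audit trail of each run, which mirrors the local control and auditability the abstract attributes to the approach.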


                Author and article information

                Contributors: Christine Suver, Adrian Thorogood, Megan Doerr, John Wilbanks, Bartha Knoppers
                Journal
                Title: Journal of Medical Internet Research (J Med Internet Res; JMIR)
                Publisher: JMIR Publications (Toronto, Canada)
                ISSN: 1439-4456, 1438-8871
                Published: 28 July 2020 (July 2020 issue)
                Volume: 22
                Issue: 7
                Article number: e18087
                Affiliations
                [1] Sage Bionetworks, Seattle, WA, United States
                [2] Centre of Genomics and Policy, McGill University, Montreal, QC, Canada
                Author notes
                Corresponding Author: Christine Suver, cfsuver@gmail.com
                Author information
                https://orcid.org/0000-0002-2986-385X
                https://orcid.org/0000-0001-5078-8164
                https://orcid.org/0000-0003-2383-5978
                https://orcid.org/0000-0002-4510-0385
                https://orcid.org/0000-0001-7004-2722
                Article
                Publisher ID: v22i7e18087
                DOI: 10.2196/18087
                PMC ID: 7420687
                PubMed ID: 32540846
                ScienceOpen ID: 4fd51028-21c6-40f8-9853-bfaf6d0f5ace
                ©Christine Suver, Adrian Thorogood, Megan Doerr, John Wilbanks, Bartha Knoppers. Originally published in the Journal of Medical Internet Research (http://www.jmir.org), 28.07.2020.

                This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in the Journal of Medical Internet Research, is properly cited. The complete bibliographic information, a link to the original publication on http://www.jmir.org/, as well as this copyright and license information must be included.

                History: 2 February 2020; 31 March 2020; 21 May 2020; 11 June 2020
                Categories
                Viewpoint

                Medicine
                data management, privacy, ethics, research, data science, machine learning
