      Is omission of free text records a possible source of data loss and bias in Clinical Practice Research Datalink studies? A case–control study

      research-article

          Abstract

          Objectives

          To estimate data loss and bias in studies of Clinical Practice Research Datalink (CPRD) data that restrict analyses to Read codes, omitting anything recorded as text.

          Design

          Matched case–control study.

          Setting

          Patients contributing data to the CPRD.

          Participants

          4915 bladder and 3635 pancreatic cancer cases diagnosed between 1 January 2000 and 31 December 2009, matched on age, sex and general practitioner practice to up to 5 controls (bladder: n=21 718; pancreas: n=16 459). The analysis period was the year before cancer diagnosis.

          Primary and secondary outcome measures

          Frequency of haematuria, jaundice and abdominal pain, grouped by recording style: Read code or text-only (ie, hidden text). The association between recording style and case–control status (χ² test). For each feature, the odds ratio (OR; conditional logistic regression) and positive predictive value (PPV; Bayes’ theorem) for cancer, before and after addition of hidden text records.
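          The PPV calculation named above can be sketched as follows. This is a minimal illustration of Bayes' theorem in odds form, not the authors' code: the function name, its parameters and the numbers in the usage line are hypothetical, and the abstract does not state which prior probability of cancer was used.

```python
def ppv_from_bayes(p_feature_cases, p_feature_controls, prior_prob):
    """Posterior probability of cancer given a feature (hypothetical helper).

    p_feature_cases    -- proportion of cases with the feature recorded
    p_feature_controls -- proportion of controls with the feature recorded
    prior_prob         -- assumed prior probability of cancer in the population
    """
    # Likelihood ratio: how much more often the feature appears in cases.
    likelihood_ratio = p_feature_cases / p_feature_controls
    # Bayes' theorem in odds form: posterior odds = prior odds x LR.
    prior_odds = prior_prob / (1.0 - prior_prob)
    posterior_odds = prior_odds * likelihood_ratio
    # Convert odds back to a probability.
    return posterior_odds / (1.0 + posterior_odds)


# Illustrative inputs only; they are not taken from the study.
example_ppv = ppv_from_bayes(0.5, 0.01, 0.001)
```

          Under this formulation, any additional records that raise the feature's frequency in controls proportionally more than in cases reduce the likelihood ratio and therefore the PPV.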

          Results

          Of the 20 958 total records of the features, 7951 (38%) were recorded in hidden text. Hidden text recording was more strongly associated with controls than with cases for haematuria (140/336=42% vs 556/3147=18%) in bladder cancer (χ² test, p<0.001), and for jaundice (21/31=67% vs 463/1565=30%, p<0.0001) and abdominal pain (323/1126=29% vs 397/1789=22%, p<0.001) in pancreatic cancer. Adding hidden text records corrected PPVs of haematuria for bladder cancer from 4.0% (95% CI 3.5% to 4.6%) to 2.9% (2.6% to 3.2%), and of jaundice for pancreatic cancer from 12.8% (7.3% to 21.6%) to 6.3% (4.5% to 8.7%). Adding hidden text records did not alter the PPV of abdominal pain for bladder (codes: 0.14%, 0.13% to 0.16% vs codes plus hidden text: 0.14%, 0.13% to 0.15%) or pancreatic (0.23%, 0.21% to 0.25% vs 0.21%, 0.20% to 0.22%) cancer.
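          As a rough check on the haematuria comparison reported above, the counts can be arranged in a 2×2 table and tested with a Pearson χ² statistic. The sketch below uses only the Python standard library and assumes no continuity correction; the abstract does not say which variant the authors used, and the function name is hypothetical.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Haematuria records in the bladder cancer analysis, as given in the abstract.
controls_hidden, controls_total = 140, 336
cases_hidden, cases_total = 556, 3147

# Percentage of records held in hidden text, per group.
pct_controls = round(100 * controls_hidden / controls_total)
pct_cases = round(100 * cases_hidden / cases_total)

chi2, p_value = chi_square_2x2(
    controls_hidden, controls_total - controls_hidden,
    cases_hidden, cases_total - cases_hidden,
)
```

          The rounded percentages match the 42% and 18% quoted in the text, and the resulting p value falls well below the reported threshold of 0.001.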

          Conclusions

          Omission of text records from CPRD studies introduces bias that inflates outcome measures for recognised alarm symptoms. This potentially reinforces clinicians’ views of the known importance of these symptoms, marginalising the significance of ‘low-risk but not no-risk’ symptoms.

                Author and article information

                Journal
                BMJ Open
                BMJ Publishing Group (BMA House, Tavistock Square, London, WC1H 9JR)
                ISSN: 2044-6055
                Published: 13 May 2016
                Volume: 6
                Issue: 5
                Article number: e011664
                Affiliations
                [1] Medical School, University of Exeter, College House, Exeter, UK
                [2] Hoyland House, Painswick, UK
                Author notes
                Correspondence to Sarah J Price; S.J.Price@exeter.ac.uk
                Article
                Publisher ID: bmjopen-2016-011664
                DOI: 10.1136/bmjopen-2016-011664
                PMCID: PMC4874123
                PMID: 27178981
                Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://www.bmj.com/company/products-services/rights-and-licensing/

                This is an Open Access article distributed in accordance with the terms of the Creative Commons Attribution (CC BY 4.0) license, which permits others to distribute, remix, adapt and build upon this work, for commercial use, provided the original work is properly cited. See: http://creativecommons.org/licenses/by/4.0/

                History
                24 February 2016
                5 April 2016
                13 April 2016
                Categories
                Epidemiology
                Research

                Medicine
                primary care, statistics & research methods, oncology
