Mapping ICD-10 and ICD-10-CM Codes to Phecodes: Workflow Development and Initial Evaluation

Abstract

Background

The phecode system was built upon the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM) for phenome-wide association studies (PheWAS) using the electronic health record (EHR).

Objective

The goal of this paper was to develop and perform an initial evaluation of maps from the International Classification of Diseases, 10th Revision (ICD-10) and the International Classification of Diseases, 10th Revision, Clinical Modification (ICD-10-CM) codes to phecodes.

Methods

We mapped ICD-10 and ICD-10-CM codes to phecodes using a number of methods and resources, such as concept relationships and explicit mappings from the Centers for Medicare & Medicaid Services, the Unified Medical Language System, Observational Health Data Sciences and Informatics, Systematized Nomenclature of Medicine-Clinical Terms, and the National Library of Medicine. We assessed the coverage of the maps in two databases: Vanderbilt University Medical Center (VUMC) using ICD-10-CM and the UK Biobank (UKBB) using ICD-10. We assessed the fidelity of the ICD-10-CM map in comparison to the gold-standard ICD-9-CM phecode map by investigating phenotype reproducibility and conducting a PheWAS.
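
The coverage assessment described above can be illustrated with a minimal sketch: given a code-to-phecode lookup and the codes observed in a cohort, count what share of the unique observed codes has a mapping. The function name `coverage`, the toy lookup entries, and the observed codes are assumptions for illustration only. They are not taken from the released maps, and this is not necessarily how the authors computed coverage.

```python
def coverage(observed_codes, code_to_phecode):
    """Return (fraction of unique observed codes that are mapped, set of unmapped codes)."""
    unique_codes = set(observed_codes)
    if not unique_codes:
        return 0.0, set()
    unmapped = {code for code in unique_codes if code not in code_to_phecode}
    mapped_fraction = (len(unique_codes) - len(unmapped)) / len(unique_codes)
    return mapped_fraction, unmapped


# Illustrative lookup only; consult the released map for real entries.
toy_map = {"I25.10": "411.4", "E11.9": "250.2", "J45.909": "495"}

# "Z99.999" is a made-up code standing in for an unmapped entry.
observed = ["I25.10", "E11.9", "E11.9", "Z99.999", "J45.909"]

fraction, unmapped = coverage(observed, toy_map)
print(f"mapped fraction of unique codes: {fraction:.2f}; unmapped: {sorted(unmapped)}")
```

Counting unique codes rather than code occurrences matches the "unique codes observed" wording in the Results; weighting by occurrence would answer a different question.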

Results

We mapped >75% of ICD-10 and ICD-10-CM codes to phecodes. Of the unique codes observed in the UKBB (ICD-10) and VUMC (ICD-10-CM) cohorts, >90% were mapped to phecodes. We observed 70-75% reproducibility for chronic diseases and <10% for an acute disease for phenotypes sourced from the ICD-10-CM phecode map. Using the ICD-9-CM and ICD-10-CM maps, we conducted a PheWAS with a Lipoprotein(a) genetic variant, rs10455872, which replicated two known genotype-phenotype associations with similar effect sizes: coronary atherosclerosis (ICD-9-CM: P<.001; odds ratio (OR) 1.60 [95% CI 1.43-1.80] vs ICD-10-CM: P<.001; OR 1.60 [95% CI 1.43-1.80]) and chronic ischemic heart disease (ICD-9-CM: P<.001; OR 1.56 [95% CI 1.35-1.79] vs ICD-10-CM: P<.001; OR 1.47 [95% CI 1.22-1.77]).
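
For readers unfamiliar with the odds ratios and 95% CIs reported above, the following sketch shows how an unadjusted odds ratio and a Wald confidence interval follow from a 2x2 table of carrier status by phenotype status. This is a simplification for illustration and not the authors' analysis: a PheWAS is typically run as a regression with covariates. The function name and the counts are hypothetical.

```python
import math


def odds_ratio_wald_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    a: carriers with the phenotype      b: carriers without it
    c: non-carriers with the phenotype  d: non-carriers without it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, chosen only to exercise the function.
or_est, ci_low, ci_high = odds_ratio_wald_ci(20, 80, 10, 90)
print(f"OR {or_est:.2f} [95% CI {ci_low:.2f}-{ci_high:.2f}]")
```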

Conclusions

This study introduces the beta versions of ICD-10 and ICD-10-CM to phecode maps that enable researchers to leverage accumulated ICD-10 and ICD-10-CM data for PheWAS in the EHR.

Author and article information

Contributors

Wei-Qi Wei:

ORCID: https://orcid.org/0000-0003-4985-056X

Department of Biomedical Informatics, Vanderbilt University Medical Center, 2525 West End Ave, Suite 1500, Nashville, TN, 37203, United States. Phone: 1 615 343 1956. Email: wei-qi.wei@vumc.org

Journal

Journal ID (nlm-ta): JMIR Med Inform

Journal ID (iso-abbrev): JMIR Med Inform

Journal ID (publisher-id): JMI

Title: JMIR Medical Informatics

Publisher: JMIR Publications (Toronto, Canada)

ISSN (Electronic): 2291-9694

Publication date (Collection): Oct-Dec 2019

Publication date (Electronic): 29 November 2019

Volume: 7

Issue: 4

Electronic Location Identifier: e14325

Affiliations

[1] Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States

[2] Medical Scientist Training Program, Vanderbilt University School of Medicine, Nashville, TN, United States

[3] Centre for Global Health Research, Usher Institute of Population Health Sciences and Informatics, The University of Edinburgh, Edinburgh, United Kingdom

[4] Public Health and Intelligence Strategic Business Unit, National Services Scotland, Edinburgh, United Kingdom

[5] Department of Medicine, Vanderbilt University Medical Center, Nashville, TN, United States

[6] Edinburgh Cancer Research Centre, Institute of Genetics and Molecular Medicine, University of Edinburgh, Edinburgh, United Kingdom

Author notes

Corresponding Author: Wei-Qi Wei, wei-qi.wei@vumc.org

Author information

Patrick Wu https://orcid.org/0000-0002-1437-6688

Aliya Gifford https://orcid.org/0000-0001-8931-8362

Xiangrui Meng https://orcid.org/0000-0003-4889-4640

Xue Li https://orcid.org/0000-0001-6880-2577

Harry Campbell https://orcid.org/0000-0002-6169-6262

Tim Varley https://orcid.org/0000-0002-6106-568X

Juan Zhao https://orcid.org/0000-0003-1429-0662

Robert Carroll https://orcid.org/0000-0003-3802-8183

Lisa Bastarache https://orcid.org/0000-0003-3020-447X

Joshua C Denny https://orcid.org/0000-0002-3049-7332

Evropi Theodoratou https://orcid.org/0000-0001-5887-9132

Wei-Qi Wei https://orcid.org/0000-0003-4985-056X

Article

Publisher ID: v7i4e14325

DOI: 10.2196/14325

PMC ID: 6911227

PubMed ID: 31553307

SO-VID: 6798be73-191a-4ea0-ac17-321d432a4f17

Copyright © Patrick Wu, Aliya Gifford, Xiangrui Meng, Xue Li, Harry Campbell, Tim Varley, Juan Zhao, Robert Carroll, Lisa Bastarache, Joshua C Denny, Evropi Theodoratou, Wei-Qi Wei. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 29.11.2019.

License:

This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR Medical Informatics, is properly cited. The complete bibliographic information, a link to the original publication on http://medinform.jmir.org/, as well as this copyright and license information must be included.

History

Date received: 9 April 2019

Date revision requested: 2 July 2019

Date revision received: 3 August 2019

Date accepted: 24 September 2019
